Frailty Models in
Survival Analysis
© 2011 by Taylor and Francis Group, LLC
Editor-in-Chief
Series Editors
Published Titles
1. Design and Analysis of Animal Studies in Pharmaceutical Development, Shein-Chung Chow and Jen-pei Liu
2. Basic Statistics and Pharmaceutical Statistical Applications, James E. De Muth
3. Design and Analysis of Bioavailability and Bioequivalence Studies, Second Edition, Revised and Expanded, Shein-Chung Chow and Jen-pei Liu
4. Meta-Analysis in Medicine and Health Policy, Dalene K. Stangl and Donald A. Berry
5. Generalized Linear Models: A Bayesian Perspective, Dipak K. Dey, Sujit K. Ghosh, and Bani K. Mallick
6. Difference Equations with Public Health Applications, Lemuel A. Moyé and Asha Seth Kapadia
7. Medical Biostatistics, Abhaya Indrayan and Sanjeev B. Sarmukaddam
8. Statistical Methods for Clinical Trials, Mark X. Norleans
9. Causal Analysis in Biomedicine and Epidemiology: Based on Minimal Sufficient Causation, Mikel Aickin
10. Statistics in Drug Research: Methodologies and Recent Developments, Shein-Chung Chow and Jun Shao
11. Sample Size Calculations in Clinical Research, Shein-Chung Chow, Jun Shao, and Hansheng Wang
12. Applied Statistical Design for the Researcher, Daryl S. Paulson
13. Advances in Clinical Trial Biostatistics, Nancy L. Geller
14. Statistics in the Pharmaceutical Industry, Third Edition, Ralph Buncher and Jia-Yeong Tsay
15. DNA Microarrays and Related Genomics Techniques: Design, Analysis, and Interpretation of Experiments, David B. Allison, Grier P. Page, T. Mark Beasley, and Jode W. Edwards
16. Basic Statistics and Pharmaceutical Statistical Applications, Second Edition, James E. De Muth
17. Adaptive Design Methods in Clinical Trials, Shein-Chung Chow and Mark Chang
18. Handbook of Regression and Modeling: Applications for the Clinical and Pharmaceutical Industries, Daryl S. Paulson
19. Statistical Design and Analysis of Stability Studies, Shein-Chung Chow
20. Sample Size Calculations in Clinical Research, Second Edition, Shein-Chung Chow, Jun Shao, and Hansheng Wang
21. Elementary Bayesian Biostatistics, Lemuel A. Moyé
22. Adaptive Design Theory and Implementation Using SAS and R, Mark Chang
23. Computational Pharmacokinetics, Anders Källén
24. Computational Methods in Biomedical Research, Ravindra Khattree and Dayanand N. Naik
25. Medical Biostatistics, Second Edition, A. Indrayan
26. DNA Methylation Microarrays: Experimental Design and Statistical Analysis, Sun-Chong Wang and Arturas Petronis
27. Design and Analysis of Bioavailability and Bioequivalence Studies, Third Edition, Shein-Chung Chow and Jen-pei Liu
28. Translational Medicine: Strategies and Statistical Methods, Dennis Cosmatos and Shein-Chung Chow
29. Bayesian Methods for Measures of Agreement, Lyle D. Broemeling
30. Data and Safety Monitoring Committees in Clinical Trials, Jay Herson
31. Design and Analysis of Clinical Trials with Time-to-Event Endpoints, Karl E. Peace
32. Bayesian Missing Data Problems: EM, Data Augmentation and Noniterative Computation, Ming T. Tan, Guo-Liang Tian, and Kai Wang Ng
33. Multiple Testing Problems in Pharmaceutical Statistics, Alex Dmitrienko, Ajit C. Tamhane, and Frank Bretz
34. Bayesian Modeling in Bioinformatics, Dipak K. Dey, Samiran Ghosh, and Bani K. Mallick
35. Clinical Trial Methodology, Karl E. Peace and Ding-Geng (Din) Chen
36. Monte Carlo Simulation for the Pharmaceutical Industry: Concepts, Algorithms, and Case Studies, Mark Chang
37. Frailty Models in Survival Analysis, Andreas Wienke
Frailty Models in
Survival Analysis
Andreas Wienke
Institute of Medical Epidemiology, Biostatistics, and Informatics
Martin-Luther-University Halle-Wittenberg, Germany
This book contains information obtained from authentic and highly regarded sources. Reasonable efforts
have been made to publish reliable data and information, but the author and publisher cannot assume
responsibility for the validity of all materials or the consequences of their use. The authors and publishers
have attempted to trace the copyright holders of all material reproduced in this publication and apologize to
copyright holders if permission to publish in this form has not been obtained. If any copyright material has
not been acknowledged please write and let us know so we may rectify in any future reprint.
Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced, transmitted,
or utilized in any form by any electronic, mechanical, or other means, now known or hereafter invented,
including photocopying, microfilming, and recording, or in any information storage or retrieval system,
without written permission from the publishers.
For permission to photocopy or use material electronically from this work, please access www.copyright.com
(http://www.copyright.com/) or contact the Copyright Clearance Center, Inc. (CCC), 222 Rosewood
Drive, Danvers, MA 01923, 978-750-8400. CCC is a not-for-profit organization that provides licenses and
registration for a variety of users. For organizations that have been granted a photocopy license by the CCC,
a separate system of payment has been arranged.
Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and are used
only for identification and explanation without intent to infringe.
Wienke, Andreas.
Frailty models in survival analysis / author, Andreas Wienke.
p. cm. -- (Chapman & Hall/CRC biostatistics series)
“A CRC title.”
Includes bibliographical references and index.
ISBN 978-1-4200-7388-1 (hardcover : alk. paper)
1. Failure time data analysis--Mathematics. 2. Survival analysis
(Biometry)--Mathematics. 3. Mortality--Mathematical models. 4.
Demography--Mathematics. I. Title. II. Series.
QA280.W54 2011
519.5’46--dc22 2010021869
Contents
List of Figures
Preface
1 Introduction
  1.1 Goals and Outline
  1.2 Examples
2 Survival Analysis
  2.1 Basic Concepts in Survival Analysis
  2.2 Censoring and Truncation
  2.3 Parametric Models
    2.3.1 Exponential distribution
    2.3.2 Weibull distribution
    2.3.3 Log-logistic distribution
    2.3.4 Gompertz distribution
    2.3.5 Log-normal distribution
    2.3.6 Gamma distribution
    2.3.7 Pareto distribution
  2.4 Estimation of Survival and Hazard Functions
    2.4.1 Kaplan–Meier estimator
    2.4.2 Nelson–Aalen estimator
  2.5 Regression Models
    2.5.1 Proportional hazards model
    2.5.2 Accelerated failure time model
  2.6 Identifiability Problems
A Appendix
  A.1 Bivariate Lifetime Models
  A.2 Correlated Gamma Frailty Model
  A.3 Correlated Compound Poisson Frailty Model
  A.4 Correlated Quadratic Hazard Frailty Model
  A.5 Dependent Competing Risks Model
  A.6 Quantitative Genetics
References
Index
List of Tables
4.1 Parametric shared gamma frailty models for Halluca data
4.2 Shared gamma frailty model for Halluca data
4.3 Simulation of shared gamma frailty in current status data I
4.4 Simulation of shared gamma frailty in current status data II
4.5 Analysis of hepatitis A and B current status data with the shared gamma frailty model
4.6 Parametric shared log-normal frailty models for Halluca data
4.7 Shared log-normal frailty model for Halluca data
6.1 Correlated gamma frailty and copula models for Danish twins
List of Figures
Symbol Description
ACE  genetic model
AFT  accelerated failure time
AIC  Akaike information criterion
AR  autoregressive process
BMI  body mass index
BLUP  best linear unbiased predictor
CAD  coronary artery disease
CSRF  corrected scale reduction factor
CHD  coronary heart disease
cdf  cumulative distribution function
DIC  deviance information criterion
DZ  dizygotic
E  expectation
EBCT  electron-beam computed tomography
ECOG  Eastern Cooperative Oncology Group
EM  expectation-maximization
f  generic symbol for pdf
F  generic symbol for cdf
$\Gamma$  gamma function
$H_0$  null hypothesis
$H_A$  alternative hypothesis
ICD  International Classification of Diseases
iid  independent and identically distributed
L  Laplace transform
L  likelihood function
$\mu$  generic symbol for a hazard
$\mu_0$  generic symbol for a baseline hazard
M  generic symbol for a cumulative hazard
$M_0$  generic symbol for a cumulative baseline hazard
$\log N(m, s^2)$  log-normal distribution with parameters $m, s^2$
MCEM  Markov Chain EM
MCMC  Markov Chain Monte Carlo
ML  maximum likelihood
MZ  monozygotic
NSCLC  non-small cell lung carcinoma
P(A)  probability of event A
pdf  probability density function
PH  proportional hazards
PPL  penalized partial likelihood
PVF  power variance function
S  generic symbol for a survival function
$t_+$  truncation time
TNM  classification of malignant tumours
V  variance
$U(a, b)$  uniform distribution in the interval $[a, b]$
$N(\mu, \sigma^2)$  normal distribution with parameters $\mu, \sigma^2$
$\mathrm{Exp}(\lambda)$  exponential distribution with parameter $\lambda$
$W(\lambda, \nu)$  Weibull distribution with parameters $\lambda, \nu$
$G(\lambda, \varphi)$  Gompertz distribution with parameters $\lambda, \varphi$
$\Gamma(k, \lambda)$  gamma distribution with parameters $k, \lambda$
$\log L(\nu, \kappa)$  log-logistic distribution with parameters $\nu, \kappa$
$Ps(\alpha)$  positive stable distribution with parameter $\alpha$
$cP(\gamma, k, \lambda)$  compound Poisson distribution with parameters $\gamma, k, \lambda$
Preface
To keep the book to a reasonable length, some topics are discussed only
briefly, and references are given for further reading. Because the literature
on frailty models is extensive (especially in the last few years), the choice of
subject matter is difficult. The material discussed in detail is to some extent a
reflection of the author’s interest in this research field. However, my attempt
has been to present a relatively comprehensive and complete overview of the
fundamental approaches in the field of frailty models.
The present monograph is primarily aimed at the biostatistical community
with applications from biomedicine, (genetic) epidemiology, and demography.
Some efforts were also undertaken to include literature from other fields like
econometrics if interesting methodological problems are raised. The practical
use of models is a key issue in biostatistics, where the data at hand often
motivate the development of new models. The language of this
book is nontechnical, so it can be understood by nonspecialists.
Nevertheless, some experience with survival analysis is an advantage.
Acknowledgments
I would like to express my sincere thanks to everyone who supported this book,
directly or indirectly: my former Ph.D. advisor Friedrich Liese (Rostock) for
awakening my interest in statistical research, Anatoli Yashin and Konstantin
Arbeev (Duke), Paul Janssen and Niel Hens (Hasselt), Catherine Legrand
(Louvain-la-Neuve), Alexander Begun (Hamburg), Nicole Giard (Gütersloh),
Kaare Christensen and Ivan Iachine (Odense), Isabella Locatelli (Lausanne),
Slobodan Zdravkovic (Copenhagen), Samuli Ripatti (Helsinki), Oliver Kuß
(Halle), and Juni Palmgren (Stockholm) for many helpful discussions and
fruitful collaboration. I take the opportunity to thank all my colleagues at
the Institute for Medical Epidemiology, Biostatistics, and Informatics at the
Medical Faculty of the Martin Luther University Halle-Wittenberg, especially
Johannes Haerting as head of the institute, for providing an excellent research
environment. I would like to thank the Danish and Swedish Twin Registries
for making the unique twin data available. I extend my sincere thanks to the
Max Planck Institute for Demographic Research (Rostock) and its founding
director, James Vaupel, where I began to work on frailty models and where
I returned several times as a guest researcher in recent years. Special
thanks go to Luc Duchateau (Ghent) for carefully reading an earlier version
of the manuscript and giving many very helpful suggestions. Furthermore, I
would like to thank my Ph.D. students Katharina Hirsch and Diana Pietzner
(Halle) for many fruitful discussions and help in preparing the manuscript.
Finally, acknowledgments go to my family: to my parents Margit and Kurt
Wienke for all their love and support; to my sons Moritz, Jakob, and
Finn, for showing that there is a life beyond research; and, finally, to my loving
wife Kati Moeller, who always strongly encouraged me to finish this book.
Chapter 1
Introduction
1.2 Examples
Different survival models are considered in this book. Most of them will
be applied to real data, mainly using examples from research fields such as
medicine, epidemiology, and demography. Survival analysis deals with the
analysis of times until the occurrence of a well-defined event. The occurrence
of this event describes the transition from one state to another; for example,
the occurrence of a disease is the transition from the state of being healthy to the
state of being sick. Sometimes the transition is of special interest (incidence
of the disease), and in other cases the state (prevalence of the disease) is the
target of the analysis. For this kind of analysis it is necessary to define the
time scale and a starting time point zero. In many cases the time scale is the
age of the individual. In clinical trials the starting point is often the beginning
of treatment. If the focus is on the development of a disease, the time of
diagnosis is usually the starting point. In occupational cohort studies the
starting point is often the beginning of employment or unemployment.
We first consider univariate event times, that is, data with no
clustering. Such a data set is given in Example 1.1, based on a prognostic study
analyzing the value of electron-beam computed tomography (EBCT) derived
calcium scores for risk stratification in symptomatic patients. Example 1.2
presents the malignant melanoma data. The models fitted to these data sets
are parametric and semiparametric proportional hazards models. The most
important goal of Chapter 3 is to analyze the effect of including unobserved
heterogeneity on regression parameter estimates.
However, the main focus of this book is on multivariate frailty models, where
event times are clustered. Example 1.3 will serve as an example of univariate
as well as multivariate data. In the latter situation the cancer-diagnosing units
were considered as clusters. The cluster size differs from cluster to cluster,
which is common in multicenter clinical trials. Here, frailty can describe
center-to-center variations not explained by observed covariates. In addition
to the problem of analyzing the effect of observed covariates, an important
research problem is evaluating the dependence between event times in clusters.
In genetic studies, correlations between family members are the basis of the
analysis of heritability of specific traits, for example, the times of onset of
breast cancer or cause-of-death-specific lifetimes. We use Danish and Swedish
twin data provided by the Danish Twin Registry at the University of Southern
Denmark in Odense and the Swedish Twin Registry at the Karolinska Institute
in Stockholm to emphasize the practical purpose of the frailty models with
fixed and small cluster sizes. In Example 1.4, cause-specific lifetimes of Danish
twins are considered. A subsample with additional covariate information is
presented in Example 1.5. Example 1.6 provides data on the age of onset
of breast cancer in Swedish twins, whereas Example 1.7 deals with current
status data. The next section provides a brief description of these data.
1 42 0 2 0 70
2 42 0 1 0 59
3 42 0 1 1 74
4 14 1 4 1 70
5 42 0 1 0 50
Four clinical risk groups with increasing evidence of CAD were constructed
based on risk factor assessment, exercise stress testing, coronary angiographic
anatomy, and revascularization at baseline. The main interest was in the
occurrence of a combined event consisting of major adverse cardiac events
such as myocardial infarction, cardiac death, and revascularization. The event
was observed in 40 (16%) patients during the follow-up; the observations of
the other patients are mainly censored after 42 months at the end of the study.
The data for five patients are presented in Table 1.1.
The first column gives the patient-specific identification number; the observed
event or censoring time in the second column is measured in months. The
covariate of main interest in this study is the CAD risk divided into four
prognostic groups (group 1 – no evidence of ischemia, ≤ 1 conventional risk
factor; group 2 – evidence of ischemia and/or ≥ 2 conventional risk factors, no
angiographic stenoses; group 3 – angiographic stenoses, no revascularization
at baseline; group 4 – early revascularization). The dichotomous covariate
calcium indicates an EBCT-derived calcium score larger than 100. There is
no clustering in this data set. One of the research questions was to examine
whether the EBCT-derived calcium score can add prognostic information
compared with the clinical information summarized in the risk groups. Age
(in years) was categorized into three groups with the youngest age group as
the reference. The covariate frequencies are given in Table 1.2.
id time status gender age thickness
1 10 0 1 76 676
2 30 0 1 56 65
3 35 0 1 41 134
4 99 0 0 71 290
5 185 1 1 52 1208
The first column provides the unique patient identification number. Variable
time measures time since surgery in months and variable status indicates
the occurrence of death caused by malignant melanoma. There are several
covariates available in the data set; for ease of presentation we will restrict
them in this application to the following three covariates: gender (0 = female,
1 = male), age at surgery (years), and tumor thickness (in 1/100 mm). This
is a univariate data set without clustering. The data was first analyzed by
Drzewiecki et al. (1980a,b) and later published and reanalyzed in more detail
by Andersen et al. (1993).
id time status unit gender age type ECOG UICC
1 54.74 1 1 1 75.02 1 0 4
2 6.68 1 1 1 63.60 1 0 5
3 0.33 1 1 1 52.68 2 . .
4 24.28 1 2 1 55.14 . 0 .
5 15.46 0 2 0 79.28 2 0 1
the patient-specific id number, and the second column the survival time (in
months). The third column contains the survival status (1 = death, 0 = alive)
and the fourth column the cluster variable diagnosing unit. Lung cancer was
diagnosed in 56 different diagnosing units with numbers of patients ranging
from 1 to 392 (mean 30.3). The Halluca data is analyzed using univariate
approaches in Chapters 2 and 3. In Chapter 4 the data is treated as coming from
a multicenter study with the diagnosing unit as cluster variable. A cluster
effect by diagnosing unit is indicated by Figure 1.1. In multicenter studies
(with treatment center as cluster), despite the tight study protocols, often
center-to-center variation occurs, which cannot be explained by covariates.
Frailty models can be used to investigate this variation. The other columns
represent variables gender (0 = female, 1 = male), age (years), histologic type
(1 = small-cell lung cancer, 2 = non-small-cell lung cancer), ECOG status
(range 0 to 4), and UICC stage (1 = I, 2 = II, 3 = IIIa, 4 = IIIb, 5 = IV).
[Figure 1.1: days (vertical axis, 0 to 500) plotted by diagnosing units (horizontal axis, 5 to 25); plot not reproduced.]
id time status pair gender zygosity cause birth
1 76.09 1 1 1 1 2 1889
2 76.02 1 1 1 1 2 1889
3 64.73 1 2 0 2 4 1908
4 94.75 1 2 0 2 1 1908
5 85.62 0 3 0 1 0 1881
6 68.61 1 3 0 1 2 1881
Observed covariates are gender, zygosity and year of birth. A total of 246 twin
pairs with incomplete information about the cause of death were excluded,
leaving a study population of 7955 twin pairs. Individuals were followed up
through 31 December 1993, and those identified as deceased after that date
have been classified here as living. Altogether, we have 1344 male MZ twin
pairs and 2411 DZ twin pairs, and 1470 female MZ twin pairs and 2730 DZ
twin pairs. In addition to the lifetimes, there is information about cause of
death for all noncensored lifetimes, that is, for all individuals in the study
population who died before 31 December 1993. For the present analysis, only
the underlying cause of death was considered.
The data for the first six twins are given in Table 1.6. The first column gives
the identification number of the individual, and the second one the survival or
censoring time (in years). The third column contains the censoring indicator
(1 = death, 0 = alive), the fourth column is the identification number of
the twin pair (cluster), and the four other columns represent the covariates
gender (0 = female, 1 = male), zygosity (1 = monozygotic, 2 = dizygotic),
cause of death (0 = alive, 1 = cancer, 2 = coronary heart disease, 3 = stroke,
4 = respiratory diseases, 5 = other), and year of birth. For more detailed
information about cause of death, gender, and zygosity of the study population
see Table 1.7.
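The clustered layout described above can be sketched in a few lines of Python. This is only an illustration, assuming the six rows of Table 1.6 are available as whitespace-separated text; the function name `parse_twins` and the dictionary keys are hypothetical labels, not names used in the book.

```python
from collections import defaultdict

# Rows of Table 1.6 as printed: id, time, status, pair, gender, zygosity, cause, birth.
ROWS = """\
1 76.09 1 1 1 1 2 1889
2 76.02 1 1 1 1 2 1889
3 64.73 1 2 0 2 4 1908
4 94.75 1 2 0 2 1 1908
5 85.62 0 3 0 1 0 1881
6 68.61 1 3 0 1 2 1881
"""

# Coding of the cause-of-death column, taken from the text.
CAUSES = {0: "alive", 1: "cancer", 2: "coronary heart disease",
          3: "stroke", 4: "respiratory diseases", 5: "other"}

def parse_twins(text):
    """Group the individual records by twin pair (the cluster)."""
    pairs = defaultdict(list)
    for line in text.splitlines():
        ident, time, status, pair, gender, zyg, cause, birth = line.split()
        pairs[int(pair)].append({
            "id": int(ident), "time": float(time), "status": int(status),
            "gender": int(gender), "zygosity": int(zyg),
            "cause": CAUSES[int(cause)], "birth": int(birth),
        })
    return dict(pairs)

pairs = parse_twins(ROWS)
for pair_id, twins in pairs.items():
    # One line per cluster: (time, status, cause) for each twin of the pair.
    print(pair_id, [(t["time"], t["status"], t["cause"]) for t in twins])
```

Keeping the pair identifier as the grouping key mirrors the role of the cluster variable in the frailty models: both twins of a pair share the same key, while each keeps its own time and censoring indicator.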
Information regarding death status, age at death, and cause of death was
obtained from the Central Person Register, the Danish Cancer Register, the
Danish Cause-of-Death Register, and other public registries in Denmark.
The main source for obtaining information on cause of death was the Death
Register at the National Institute of Public Health. Information about cause
of death is available from this register for individuals who died after 1942
(Juel and Helweg-Larsen 1999). Consequently, cause of death is included in
the twin register only for twins who died after this year.
Table 1.8: Cause of death groups by ICD number
cause of death ICD revision 6 & 7 ICD revision 8
The validity of the twin register was checked on the basis of a comparison of
information about year of death with the nationwide Danish Cancer Register.
There was around 99% agreement, although both registries were independent.
Further data corrections increased this level of agreement to almost 100%.
Cause of death was coded following the sixth, seventh, and eighth revisions of
the International Classification of Diseases (ICD). Four different groups of
main causes of death are considered in the present example: cancer, coronary
heart disease (CHD), stroke, and diseases of the respiratory system. ICD
codes in three revisions of the ICD for these broad cause-of-death groups are
given in Table 1.8. Causes of death are a common example of competing
risks.
Table 1.9: Data of three Danish twin pairs with covariates
id time status pair gender zygosity birth BMI smoking
1 74.23 1 1 0 2 1893 1 1
2 81.52 0 1 0 2 1893 2 2
3 72.58 1 2 0 2 1912 2 4
4 54.89 1 2 0 2 1912 1 2
5 57.31 0 3 1 1 1897 3 3
6 83.79 1 3 1 1 1897 1 4
males
BMI < 22 11 12 45 31 99 ( 9.4%)
BMI 22–28 78 141 333 208 760 (72.2%)
BMI > 28 24 35 91 43 193 (18.3%)
Total 113 188 469 282
(10.7%) (17.9%) (44.6%) (26.8%)
females
BMI < 22 105 46 47 99 297 (21.7%)
BMI 22–28 350 138 134 171 793 (58.1%)
BMI > 28 157 40 37 42 276 (20.2%)
Total 612 224 218 312
(44.8%) (16.4%) (16.0%) (22.8%)
id time status pair gender zygosity birth age
1 76.09 1 1 1 1 1889 23
2 76.02 0 1 1 1 1889 26
3 64.73 0 2 0 2 1908 0
4 94.75 0 2 0 2 1908 23
5 85.62 0 3 0 1 1887 25
6 68.61 1 3 0 1 1887 29
The first column gives the identification number of the individual, and the
second column the age at diagnosis of breast cancer or censoring (in years).
The third column contains the censoring indicator (1 = onset of breast cancer,
0 = no onset of breast cancer), the fourth column the identification number of
the twin pair (cluster), and the other columns represent gender (0 = female,
1 = male), zygosity (1 = monozygotic, 2 = dizygotic), year of birth, and age
at giving first birth (in years). Zero indicates nulliparous women.
The summary for the old cohort is given in Table 1.12, stratified according
to the censoring status. The event under study is the onset of breast cancer.
If a woman did not develop breast cancer or if she died from other causes
during the follow-up, the corresponding observation is censored. Age at onset
of breast cancer ranges from 36 years to 93 years.
The data set was created by merging the Swedish Twin Registry with the
Swedish Cancer Registry maintained by the National Board of Health and
Welfare. At the time of record linkage of the data used here, the Swedish
Cancer Registry contained all cases of cancer that were diagnosed during
the period 1959 through 2000, and 715 cases of breast cancer were identified
during follow-up.
In another analysis, without considering the covariate age at first birth, the
twins from the old and the middle cohort are combined. The middle cohort
comprises all twins born between 1926 and 1967 who were alive and living in
Sweden in 1970. Altogether, 1096 breast cancer cases were observed during
the follow-up until 2000 in both cohorts. The data structure is the same as in
Table 1.11 (but without covariate age). Summary statistics for both cohorts
combined are given in Table 1.13.
Table 1.13: Breast cancer in Swedish twins (old & middle cohort)
number of twin pairs
both censored one censored none censored total
MZ twin pairs 4304 335 33 4672
DZ twin pairs 7236 625 35 7896
total 11540 960 68 12568
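The counts in Table 1.13 can be cross-checked against the number of breast cancer cases quoted in the text. The short Python sketch below assumes that a pair with one censored observation contributes one case and a pair with none censored contributes two; the function name `summarize` is a hypothetical helper.

```python
# Counts of twin pairs from Table 1.13: (both censored, one censored, none censored).
TABLE_1_13 = {"MZ": (4304, 335, 33), "DZ": (7236, 625, 35)}

def summarize(table):
    """Return the number of pairs and the number of observed cases per zygosity group."""
    pairs = {z: sum(row) for z, row in table.items()}
    # One case per pair with exactly one censored twin, two per pair with none censored.
    cases = {z: row[1] + 2 * row[2] for z, row in table.items()}
    return pairs, cases

pairs, cases = summarize(TABLE_1_13)
print(pairs, sum(pairs.values()))   # pairs per group and 12568 pairs overall
print(cases, sum(cases.values()))   # cases per group and 1096 cases overall
```

The total of 1096 cases agrees with the number of breast cancer cases reported for the two cohorts combined.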
id time status A status B
1 6 0 0
2 17 1 1
3 35 0 1
4 56 0 0
5 12 1 1
id is the identification number of the probands, who are the clusters. Time
refers to age of the probands when the sample was taken and is given in years.
Consequently, time denotes monitoring time and not event time. Status A and
B refer to the presence of antibodies with respect to the hepatitis A and B
virus, respectively. The data of five probands are given in Table 1.14.
Chapter 2
Survival Analysis
The major concept in survival analysis is the hazard function. This function
is also called (depending on the field of application) mortality rate, incidence
rate, mortality curve, failure rate, or force of mortality. The hazard function
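is defined by
\[
\mu(t) = \lim_{\Delta \to 0} \frac{P(t \le T^* < t + \Delta \mid T^* \ge t)}{\Delta} = \frac{f(t)}{1 - F(t)}. \tag{2.1}
\]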
For frailty models, the topic of this monograph, the Laplace transform L of a
random variable is crucial to inference:
\[
L(u) = \mathrm{E}\, e^{-uT^*} = \int_0^\infty e^{-ut} f(t)\, dt.
\]
All the functions f, F, S, µ, M, and L provide equivalent specifications of the
distribution of the nonnegative random variable $T^*$. We use f, F, S, µ, M, and
L as generic symbols without index, and their arguments make evident which
random variable is considered.
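As a numerical illustration of the Laplace transform, the following Python sketch approximates the defining integral by the composite Simpson rule and compares it with the closed form λ/(λ + u) of an exponential distribution. The exponential density, the rate 0.5, the truncation point 80, and the function name `laplace_transform` are assumptions chosen for this example only.

```python
import math

def laplace_transform(pdf, u, upper, n=20000):
    """Approximate L(u) = int_0^upper exp(-u t) pdf(t) dt by the composite Simpson rule.

    n must be even; upper truncates the infinite integration range.
    """
    h = upper / n
    total = pdf(0.0) + math.exp(-u * upper) * pdf(upper)
    for i in range(1, n):
        t = i * h
        weight = 4 if i % 2 else 2   # Simpson weights 4, 2, 4, ... for interior points
        total += weight * math.exp(-u * t) * pdf(t)
    return total * h / 3

lam = 0.5  # hypothetical rate of an exponential distribution

def exp_pdf(t):
    return lam * math.exp(-lam * t)

results = {}
for u in (0.0, 0.5, 2.0):
    results[u] = laplace_transform(exp_pdf, u, upper=80.0)
    print(u, results[u], lam / (lam + u))
```

At u = 0 the transform equals 1, as it must for any probability density, which gives a quick sanity check on the numerical integration.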
It is easy to derive relations between the different notions; for example,
(2.1) implies that
\[
M(t) = \int_0^t \mu(s)\, ds = \int_0^t \frac{f(s)}{1 - F(s)}\, ds = -\ln(1 - F(t))
\]
and consequently
\[
S(t) = 1 - F(t) = e^{-\int_0^t \mu(s)\, ds} = e^{-M(t)}. \tag{2.2}
\]
Example 2.1
Suppose that the random variable $T^*$ follows a distribution with probability
density function
\[
f(t) = \lambda \nu t^{\nu - 1} e^{-\lambda t^{\nu}}, \quad t \ge 0, \tag{2.3}
\]
where λ, ν are one-dimensional positive parameters. This is a Weibull
distribution, discussed in more detail in the next section. We consider it here
to illustrate the foregoing formulas. The following relations hold, starting
from the density (2.3):
\begin{align*}
\text{probability density function} \quad & f(t) = \lambda \nu t^{\nu - 1} e^{-\lambda t^{\nu}} \\
\text{survival function} \quad & S(t) = e^{-\lambda t^{\nu}} \\
\text{hazard function} \quad & \mu(t) = \lambda \nu t^{\nu - 1} \\
\text{cumulative hazard function} \quad & M(t) = \lambda t^{\nu}.
\end{align*}
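The four Weibull expressions can be checked against the general relations µ(t) = f(t)/S(t) and S(t) = exp(−M(t)). The Python sketch below does this for a few time points and also integrates the hazard numerically; the parameter values λ = 0.1 and ν = 1.5 and the function names are hypothetical choices for illustration.

```python
import math

def weibull_functions(t, lam, nu):
    """Return (f, S, mu, M) at time t for the Weibull density (2.3)."""
    f = lam * nu * t ** (nu - 1) * math.exp(-lam * t ** nu)   # density
    S = math.exp(-lam * t ** nu)                              # survival function
    mu = lam * nu * t ** (nu - 1)                             # hazard function
    M = lam * t ** nu                                         # cumulative hazard
    return f, S, mu, M

def integrate_hazard(t, lam, nu, n=100_000):
    """Midpoint-rule approximation of the integral of mu(s) over [0, t]."""
    h = t / n
    return h * sum(lam * nu * ((i + 0.5) * h) ** (nu - 1) for i in range(n))

lam, nu = 0.1, 1.5   # hypothetical parameter values
checks = []
for t in (0.5, 1.0, 2.0, 5.0):
    f, S, mu, M = weibull_functions(t, lam, nu)
    M_numeric = integrate_hazard(t, lam, nu)
    checks.append((t, f, S, mu, M, M_numeric))
    print(t, round(f, 4), round(S, 4), round(mu, 4), round(M, 4), round(M_numeric, 4))
```

With ν > 1 the hazard increases in t, with ν < 1 it decreases, and ν = 1 gives the constant hazard of the exponential distribution.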
\[
\Delta_i =
\begin{cases}
1 & \text{if } T_i^* \le C_i, \text{ that is, } T_i \text{ is not censored} \\
0 & \text{if } T_i^* > C_i, \text{ that is, } T_i \text{ is censored.}
\end{cases}
\]
• Drop out. The treatment may have such strong side effects that it is
necessary to stop the therapy. Or the patient may refuse to continue
the treatment.
In Figure 2.1 the event times of patients 2, 4, and 5 are completely observed.
The event times of patients 1 and 3 are censored because of loss to follow-up,
drop out, or competing risks. Event times of patients 6 and 7 are censored
because of termination of study.
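How observed times and censoring indicators arise from event and censoring times can be sketched in Python as follows. The event-time distribution, the censoring mechanism, and the administrative end of study at 42 months are assumptions for illustration only; they do not reproduce the patients of Figure 2.1.

```python
import random

def observe(event_times, censor_times):
    """Return the observed pairs (T_i, Delta_i) with T_i = min(T_i*, C_i)
    and Delta_i = 1 if T_i* <= C_i (event observed), else 0 (censored)."""
    return [(min(t, c), 1 if t <= c else 0)
            for t, c in zip(event_times, censor_times)]

rng = random.Random(1)
study_end = 42.0  # hypothetical end of study (months)

# Hypothetical true event times T_i* for seven patients.
event_times = [rng.expovariate(0.05) for _ in range(7)]

# Hypothetical censoring times C_i: loss to follow-up or drop out at a random
# time, but no later than the termination of the study.
censor_times = [min(rng.uniform(0.0, 60.0), study_end) for _ in range(7)]

data = observe(event_times, censor_times)
for i, (t, delta) in enumerate(data, start=1):
    print(i, round(t, 2), delta)
```

Only the pairs in `data` would be available to the analyst; the underlying event times of censored patients remain unknown beyond the fact that they exceed the observed time.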
H(t) = P(min{T ∗, C} ≤ t)
= 1 − P(min{T ∗, C} > t)
= 1 − P(T ∗ > t, C > t).
Assuming independence between event time T ∗ and censoring time C implies
the simplification
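$$\begin{aligned}
P(T \le t, \Delta = 1) &= P(T^* \le t, T^* \le C) \\
&= \iint_{t^* \le t,\; t^* \le c} f(t^*)\, g(c)\,dt^*\,dc \\
&= \int_{t^* \le t} f(t^*) \int_{t^* \le c} g(c)\,dc\,dt^* \\
&= \int_{t^* \le t} f(t^*)\,(1 - G(t^*))\,dt^* \ne F(t) \qquad (2.4)
\end{aligned}$$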
THEOREM 2.1
The probability density function of the survival data (T, ∆) is
$$f(t, \delta) = \big(f(t)(1 - G(t))\big)^{\delta}\, \big(g(t)(1 - F(t))\big)^{1-\delta}. \qquad (2.5)$$
Example 2.2
Denote by (T, ∆), T = min{T ∗ , C}, ∆ = 1(T ∗ ≤ C) censored observations
under the assumption of dependent censoring. Let S(t∗ , c) and f (t∗ , c) be
the joint survival and probability density function of T ∗ and C, respectively.
Consequently, the subdistribution functions needed for the construction of the
likelihood function can be derived by
$$\begin{aligned}
H_1(t) &= P(T \le t, \Delta = 1) \\
&= P(T^* \le t, T^* \le C) \\
&= \iint_{\{t^* \le t,\; t^* \le c\}} f(t^*, c)\,dc\,dt^* \\
&= -\int_0^t S_1(t^*, t^*)\,dt^*
\end{aligned}$$
$$h_1(t) = \frac{dH_1(t)}{dt} = -S_1(t, t).$$
Similar calculations yield the subdistribution and subdensity functions in the
case of a censored observation (δ = 0):
$$\begin{aligned}
H_0(t) &= P(T \le t, \Delta = 0) \\
&= P(C \le t, C < T^*) \\
&= \iint_{\{c \le t,\; c < t^*\}} f(t^*, c)\,dt^*\,dc \\
&= -\int_0^t S_2(c, c)\,dc
\end{aligned}$$
with
S1 (t, t; θ) = S1 (t, t) and S2 (t, t; θ) = S2 (t, t). For a sample (t1 , δ1 ), . . . , (tn , δn )
the likelihood function can be written as
$$L(\theta) = \prod_{i=1}^{n} \big(-S_1(t_i, t_i; \theta)\big)^{\delta_i}\, \big(-S_2(t_i, t_i; \theta)\big)^{1-\delta_i}.$$
In the case of independent censoring with $S(t, c; \theta) = (1 - F(t; \theta))(1 - G(c))$
this expression simplifies to (2.7).
Up to now, only the situation of right-censored event time data was considered
in the present monograph. However, in some cases, event times are only known
to lie in a specific interval. This situation especially arises when study subjects
are not under continuous observation, such as, for example, patients visiting
their doctor at predetermined times (or times that are convenient to them),
where the occurrence of the event can be diagnosed knowing that the event had
not occurred at the time of the previous visit. Another situation is inspection
times of technical equipment, where events can happen between two inspection
times. Hence, it is only known that the event occurred between two visits
or inspections but not the exact time point. This kind of censoring is called
interval censoring and was considered in detail by Sun (2006). In general, right
censoring is a special case of interval censoring and some of the methods for
right censored data can be directly, or with minor changes, applied to interval
censored data. However, most of the approaches for right-censored data are
not appropriate for interval-censored data because the censoring mechanism
behind interval censoring is much more complicated than in the case of right
censoring.
An important special case of interval-censored data is so-called current
status data. The term current status data originates from applications in the
field of demography (Diamond et al. 1986). It means that the observed interval
for each individual survival time includes either zero or infinity. Such
data usually occur when each study subject is observed only once, and the
only available information for the event under study is whether the event has
occurred before the observation was taken. Consequently, current status data
are given in the form (T, ∆), where T denotes the monitoring time (which
is not the time when the event happens!) and ∆ is the indicator whether
the event already occurred before the monitoring or not. In the parametric
case, the likelihood function of a sample (t1 , δ1 ), . . . , (tn , δn ) with unknown
parameter vector θ to be estimated can be written in the form (Sun 2006)
$$L(\theta) = \prod_{i=1}^{n} \big(1 - S(t_i; \theta)\big)^{\delta_i}\, S(t_i; \theta)^{1-\delta_i}, \qquad (2.8)$$
with components depending on whether the event already occurred before the
monitoring times (δi = 1) or not (δi = 0).
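As a minimal sketch of how the current status likelihood (2.8) might be evaluated, the following assumes an exponential survival function S(t; λ) = e^{−λt}. The function name `current_status_loglik` and the choice of the exponential model are illustrative assumptions, not part of the text.

```python
import math

def current_status_loglik(lam, monitor_times, statuses):
    """Log of (2.8), assuming S(t; lam) = exp(-lam * t) (illustrative choice)."""
    total = 0.0
    for t, d in zip(monitor_times, statuses):
        surv = math.exp(-lam * t)
        # delta_i = 1: event occurred before monitoring time, contributes 1 - S(t_i)
        # delta_i = 0: event not yet occurred, contributes S(t_i)
        total += math.log(1.0 - surv) if d == 1 else math.log(surv)
    return total
```

Note that only the monitoring times and the status indicators enter the computation; the event times themselves are never observed.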
$$f(t, \delta, t^+) = \frac{\big(f(t)(1 - G(t))\big)^{\delta}\, \big(g(t)(1 - F(t))\big)^{1-\delta}}{(1 - F(t^+))(1 - G(t^+))}.$$
As a consequence of this result, the likelihood function of univariate survival
data $(t_1, \delta_1, t_1^+), \ldots, (t_n, \delta_n, t_n^+)$ with (nonrandom) left truncation times $t_i^+$
($i = 1, \ldots, n$) in the parametric case with µ(t) = µ(t; θ) and θ as the vector
of parameters to be estimated can be written in the form
$$L(\theta) = \prod_{i=1}^{n} \mu(t_i; \theta)^{\delta_i} \exp\Big(-\int_{t_i^+}^{t_i} \mu(s; \theta)\,ds\Big).$$
$$P(\min\{T_1, \ldots, T_n\} > t) = \prod_{i=1}^{n} P(T_i > t) = \prod_{i=1}^{n} e^{-\lambda t} = e^{-n\lambda t}.$$
Example 2.3
Let T1 , T2 , . . . , Tn be independent and identically distributed survival times
with cumulative distribution function F (t) = 1 − e−λt . The question arises:
How do we estimate the unknown parameter λ of interest when using the
maximum likelihood method?
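The likelihood function is
$$L(\lambda) = \prod_{i=1}^{n} \mu(t_i; \lambda)^{\delta_i}\, e^{-\int_0^{t_i} \mu(s; \lambda)\,ds} = \prod_{i=1}^{n} \lambda^{\delta_i} e^{-\lambda t_i}.$$
Differentiating the log-likelihood with respect to λ gives
$$\frac{\partial \log L(\lambda)}{\partial \lambda} = \frac{1}{\lambda} \sum_{i=1}^{n} \delta_i - \sum_{i=1}^{n} t_i.$$
Setting the derivative to zero,
$$\frac{1}{\hat\lambda} \sum_{i=1}^{n} \delta_i - \sum_{i=1}^{n} t_i = 0,$$
yields the estimator
$$\hat\lambda = \frac{\sum_{i=1}^{n} \delta_i}{\sum_{i=1}^{n} t_i}.$$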
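The closed-form estimator of Example 2.3 (number of observed events divided by total observation time) can be sketched in a few lines. The function name and the data below are hypothetical and serve only as an illustration.

```python
def exponential_mle(times, events):
    """MLE of the exponential rate: sum of delta_i divided by sum of t_i."""
    total_events = sum(events)
    total_time = sum(times)
    if total_time <= 0:
        raise ValueError("total observation time must be positive")
    return total_events / total_time

# Hypothetical data: observed times t_i and indicators delta_i (1 = event, 0 = censored)
times = [2.0, 3.5, 1.0, 4.0, 2.5]
events = [1, 0, 1, 1, 0]
rate_hat = exponential_mle(times, events)  # 3 events over 13.0 time units
```

Censored observations contribute to the denominator (time at risk) but not to the numerator, which is why ignoring censoring would overestimate λ.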
The model is very sensitive to even a modest variation because it has only
one adjustable parameter. The inverse of the parameter λ is both the mean and
the standard deviation. More recent work has overcome this limitation by using
more flexible distributions, which are introduced in the following sections.
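probability density function    $f(t) = \lambda\nu t^{\nu-1} e^{-\lambda t^{\nu}}$   $(\lambda > 0, \nu > 0)$
survival function               $S(t) = e^{-\lambda t^{\nu}}$
hazard function                 $\mu(t) = \lambda\nu t^{\nu-1}$
cumulative hazard function      $M(t) = \lambda t^{\nu}$
expectation                     $\mathrm{E}T = \lambda^{-1/\nu}\, \Gamma(1 + \tfrac{1}{\nu})$
variance                        $\mathrm{V}(T) = \lambda^{-2/\nu} \big(\Gamma(1 + \tfrac{2}{\nu}) - \Gamma(1 + \tfrac{1}{\nu})^2\big)$,
where Γ is the gamma function with $\Gamma(k) = \int_0^\infty s^{k-1} e^{-s}\,ds$ $(k > 0)$.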
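The Weibull formulas above can be checked numerically. The sketch below implements them with the standard library and compares the closed-form expectation with a midpoint-rule approximation of the integral of S(t), using the identity ET = ∫ S(t) dt for nonnegative T. Function names, the integration limit, and the step count are illustrative assumptions.

```python
import math

def weibull_survival(t, lam, nu):
    return math.exp(-lam * t ** nu)

def weibull_hazard(t, lam, nu):
    return lam * nu * t ** (nu - 1)

def weibull_mean(lam, nu):
    return lam ** (-1.0 / nu) * math.gamma(1.0 + 1.0 / nu)

def weibull_variance(lam, nu):
    g1 = math.gamma(1.0 + 1.0 / nu)
    g2 = math.gamma(1.0 + 2.0 / nu)
    return lam ** (-2.0 / nu) * (g2 - g1 ** 2)

def numeric_mean(lam, nu, upper=50.0, steps=200_000):
    """Midpoint-rule approximation of E[T] as the integral of S(t) over [0, upper]."""
    h = upper / steps
    return h * sum(weibull_survival((i + 0.5) * h, lam, nu) for i in range(steps))
```

For ν = 1 the functions reduce to the exponential case, with mean 1/λ and variance 1/λ².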
[Figure 2.3: Weibull hazard functions with different shape parameters. Curves: W(2;1.25), W(2;1.0), W(2;0.5); x-axis: time; y-axis: hazard.]
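probability density function    $f(t) = \dfrac{\upsilon\kappa(\upsilon t)^{\kappa-1}}{(1 + (\upsilon t)^{\kappa})^2}$   $(\upsilon > 0, \kappa > 0)$
survival function               $S(t) = \dfrac{1}{1 + (\upsilon t)^{\kappa}}$
hazard function                 $\mu(t) = \dfrac{\upsilon\kappa(\upsilon t)^{\kappa-1}}{1 + (\upsilon t)^{\kappa}}$
cumulative hazard function      $M(t) = \ln(1 + (\upsilon t)^{\kappa})$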
[Figure: hazard curves logL(3;0.5), logL(3;1), logL(3;1.5); x-axis: time; y-axis: hazard.]
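probability density function    $f(t) = \lambda e^{\varphi t}\, e^{-\frac{\lambda}{\varphi}(e^{\varphi t} - 1)}$   $(\lambda > 0)$
survival function               $S(t) = e^{-\frac{\lambda}{\varphi}(e^{\varphi t} - 1)}$
hazard function                 $\mu(t) = \lambda e^{\varphi t}$
cumulative hazard function      $M(t) = \dfrac{\lambda}{\varphi}(e^{\varphi t} - 1)$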
For ϕ > 0, the hazard function is increasing, starting from λ at time zero (Figure 2.5). For
parameter values ϕ < 0, the hazard function is decreasing, and the cumulative
hazard converges to the constant −λ/ϕ for t → ∞ so that not all individuals
in the population experience the event under study. This situation is discussed
in more detail in so-called cure models later on in this monograph. Obviously,
the exponential distribution is a special case of the Gompertz distribution in the case
of ϕ = 0. The Gompertz model (Gompertz 1825) was generalized to the
Gompertz–Makeham distribution (Makeham 1860) by adding a constant c to
the hazard function
$$\mu(t) = \lambda e^{\varphi t} + c.$$
Here the additional parameter c describes a nonaging aspect in the study
population that is independent of time t, whereas the Gompertz part in the
formula still represents the age-dependent aspect with an exponential form.
The parameters λ and c are not identifiable in the case of ϕ = 0; only their
sum can be estimated.
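The limiting behavior for ϕ < 0 can be made concrete with a short sketch. Since M(t) converges to −λ/ϕ, the survival function converges to e^{λ/ϕ} > 0, which is the fraction of the population that never experiences the event. The function names are hypothetical; the ϕ = 0 branch uses the exponential limit M(t) = λt.

```python
import math

def gompertz_cumulative_hazard(t, lam, phi):
    """M(t) = (lam / phi) * (exp(phi * t) - 1); the limit lam * t is used for phi = 0."""
    if phi == 0.0:
        return lam * t
    return lam / phi * math.expm1(phi * t)

def gompertz_survival(t, lam, phi):
    return math.exp(-gompertz_cumulative_hazard(t, lam, phi))

def cure_fraction(lam, phi):
    """Limit of S(t) as t grows; positive only for phi < 0 (assuming lam > 0)."""
    return math.exp(lam / phi) if phi < 0 else 0.0
```

For example, with λ = 0.1 and ϕ = −0.05 the cumulative hazard is bounded by 2, so the survival function levels off at e^{−2}.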
The Gompertz–Makeham distribution describes the age dynamics of human
mortality rather accurately in the age range of about 30 – 80 years. At more
advanced ages the death rates do not increase as fast as predicted by this
mortality law – a phenomenon known as the late-life mortality deceleration.
This phenomenon was one of the starting points for developing univariate
frailty models.
[Figure: hazard curves G(0.1;0.05), G(0.1;0.08), G(0.1;0.10); x-axis: time; y-axis: hazard.]
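probability density function    $f(t) = \dfrac{1}{\sqrt{2\pi}\, s t}\, e^{-\frac{(\log t - m)^2}{2 s^2}}$
survival function               $S(t) = 1 - \Phi\big(\frac{\log t - m}{s}\big)$
hazard function                 $\mu(t) = \dfrac{\frac{1}{st}\, \phi\big(\frac{\log t - m}{s}\big)}{1 - \Phi\big(\frac{\log t - m}{s}\big)}$
expectation                     $\mathrm{E}T = e^{m + \frac{s^2}{2}}$
variance                        $\mathrm{V}(T) = e^{2m + s^2}(e^{s^2} - 1)$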
[Figure 2.6: Log-normal hazard functions with different parameters. Curves: logN(0;0.5), logN(0;1), logN(0;1.5); x-axis: time; y-axis: hazard.]
In the formulas used above $\phi(x) = \frac{1}{\sqrt{2\pi}} e^{-\frac{x^2}{2}}$ denotes the probability density
function, and Φ(x) the cumulative distribution function, of the standard
normal distribution N(0, 1).
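probability density function    $f(t) = \dfrac{\lambda^k t^{k-1} e^{-\lambda t}}{\Gamma(k)}$   $(k > 0, \lambda > 0)$
survival function               $S(t) = 1 - I_k(\lambda t)$
hazard function                 $\mu(t) = \dfrac{\lambda^k t^{k-1} e^{-\lambda t}}{(1 - I_k(\lambda t))\,\Gamma(k)}$
expectation                     $\mathrm{E}T = \dfrac{k}{\lambda}$
variance                        $\mathrm{V}(T) = \dfrac{k}{\lambda^2}$
Laplace transform               $L(u) = \mathrm{E}\,e^{-Tu} = \big(1 + \frac{u}{\lambda}\big)^{-k}$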
If k = 1, the gamma distribution reduces to the exponential distribution. With
integer k, the gamma distribution is often called a special Erlangian distribution.
It can be derived as the distribution of the waiting time until the kth emission
from a Poisson source with intensity parameter λ. Consequently, the sum of k
independent exponential variables with parameter λ has a gamma distribution
with parameters k and λ (see Example 2.4) and can be used to model lifetimes
of technical systems with repeated repair after failure.
Example 2.4
Let T1 , T2 , . . . , Tk denote k i.i.d. random variables with Ti ∼ Exp(λ) and
introduce T by T = T1 + . . . + Tk . Then it holds that
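$$L(u) = \mathrm{E}\,e^{-Tu} = \mathrm{E}\,e^{-(T_1 + \ldots + T_k)u} = \prod_{i=1}^{k} \mathrm{E}\,e^{-T_i u} = \prod_{i=1}^{k} \Big(1 + \frac{u}{\lambda}\Big)^{-1} = \Big(1 + \frac{u}{\lambda}\Big)^{-k},$$
which is the Laplace transform of a gamma distribution with parameters k
and λ. Here $T_1, T_2, \ldots, T_k$ denote times between repairs.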
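The result of Example 2.4 can be illustrated by simulation. The sketch below draws sums of k independent exponential variables and compares the empirical Laplace transform with (1 + u/λ)^{−k}. The function names, the seed, and the sample size are arbitrary illustrative choices.

```python
import math
import random

def gamma_laplace(u, k, lam):
    """Laplace transform of the gamma distribution with parameters k and lam."""
    return (1.0 + u / lam) ** (-k)

def empirical_laplace(samples, u):
    """Monte Carlo estimate of E[exp(-u * T)]."""
    return sum(math.exp(-u * t) for t in samples) / len(samples)

def simulate_repair_times(k, lam, n, seed=0):
    """Draw n realizations of T = T_1 + ... + T_k with T_i ~ Exp(lam)."""
    rng = random.Random(seed)
    return [sum(rng.expovariate(lam) for _ in range(k)) for _ in range(n)]

samples = simulate_repair_times(k=3, lam=2.0, n=20000, seed=1)
```

With these settings the empirical transform at u = 1 should lie close to (1.5)^{−3}, and the sample mean close to k/λ = 1.5, up to Monte Carlo error.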
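probability density function    $f(t) = \dfrac{\zeta}{\omega} \Big(\dfrac{\omega}{\omega + t}\Big)^{\zeta+1}$   $(\omega > 0, \zeta > 0)$
survival function               $S(t) = \Big(\dfrac{\omega}{\omega + t}\Big)^{\zeta}$
hazard function                 $\mu(t) = \dfrac{\zeta}{\omega + t}$
cumulative hazard function      $M(t) = -\zeta \log\Big(\dfrac{\omega}{\omega + t}\Big)$
expectation                     $\mathrm{E}T = \dfrac{\omega}{\zeta - 1}$   $(\zeta > 1)$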
The Pareto distribution can be seen as a gamma mixture of exponentially distributed
lifetimes. It is often applied in areas including city population distributions,
stock price fluctuations, oil-field locations, and socioeconomic studies. The Pareto
distribution is also a standard distribution for the purposes of reinsurance,
taking care of the largest claims of a portfolio. It can be extended to the
generalized Pareto distribution with hazard function $\mu(t) = \frac{\zeta}{\omega + t} + c$. More
details about the generalized Pareto distribution can be found in Davis and
Feldstein (1979).
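The mixture representation can be verified with the gamma Laplace transform given earlier. As a sketch, assume a random rate Z that follows a gamma distribution with parameters k = ζ and λ = ω, and a lifetime T that is exponential with rate z given Z = z (the symbol Z is introduced here only for illustration):

```latex
\begin{aligned}
S(t) &= \mathrm{E}\,P(T > t \mid Z) = \mathrm{E}\,e^{-Zt} \\
     &= L_Z(t) = \Big(1 + \frac{t}{\omega}\Big)^{-\zeta}
      = \Big(\frac{\omega}{\omega + t}\Big)^{\zeta},
\end{aligned}
```

which is the Pareto survival function listed above.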
Example 2.5
We would like to derive the expectation of Pareto distribution under the
assumption ζ > 1.
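$$\begin{aligned}
\mathrm{E}T &= \int_0^\infty t f(t)\,dt \\
&= \int_0^\infty S(t)\,dt \\
&= \int_0^\infty \Big(\frac{\omega}{\omega + t}\Big)^{\zeta}\,dt \\
&= \Big[-\frac{\omega}{\zeta - 1} \Big(\frac{\omega}{\omega + t}\Big)^{\zeta - 1}\Big]_0^\infty \\
&= \frac{\omega}{\zeta - 1}
\end{aligned}$$
Here the assumption ζ > 1 is important; otherwise the expectation is infinite.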
Assuming a continuous event time distribution, only one death will be observed
at most at each time ti with probability one. In practice, ties can occur, for
example, because of coarse measurement. To deal with ties, let ti denote the
r distinct observation times t1 < t2 < . . . < tr with r ≤ n. Taking this into
account, the Kaplan–Meier estimator is
$$\hat S(t) = \prod_{i:\, t_i \le t} \Big(1 - \frac{d_i}{\#R(t_i)}\Big)$$
with di the number of events at time ti . R(t) denotes the set of indices of all
individuals at risk at time t, meaning all individuals alive just before t. #R(t)
denotes the number of individuals in the risk set at time t. The Kaplan–Meier
estimator is a decreasing step function, changing only at event times. A
problematic point is that Ŝ is not defined after the largest observation time
if the last observation is a censored one. In this case, Ŝ(t) is usually left
unspecified after the largest observation time. One consequence of this is
that the mean lifetime cannot be estimated. A solution to this problem is to
assume that the survival function is zero after the largest time, which
obviously results in a biased estimate. A better solution is to consider the median
survival time.
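A minimal sketch of the Kaplan–Meier computation follows. It takes the product over the distinct event times t_i ≤ t, with d_i events and #R(t_i) individuals at risk (observed time at least t_i). The function name and the toy data are hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(t_i, S_hat(t_i))] at the distinct event times, in increasing order."""
    distinct = sorted({t for t, d in zip(times, events) if d == 1})
    curve, surv = [], 1.0
    for t in distinct:
        at_risk = sum(1 for s in times if s >= t)                            # #R(t_i)
        deaths = sum(1 for s, d in zip(times, events) if s == t and d == 1)  # d_i
        surv *= 1.0 - deaths / at_risk
        curve.append((t, surv))
    return curve

# Hypothetical sample: observed times and indicators (1 = event, 0 = censored)
times = [1, 2, 2, 3, 4, 5]
events = [1, 1, 0, 1, 0, 1]
km_curve = kaplan_meier(times, events)
```

In this toy sample the last observation is an event, so the estimate drops to zero at t = 5; had it been censored, the curve would remain positive and undefined beyond that time, as discussed above.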
Figure 2.7 illustrates the Kaplan–Meier method by presenting the estimates
of the survival curves of different patient groups in the EBCT study presented
in detail in Example 1.1. Here time from baseline investigation of patients
with coronary artery disease until the occurrence of a major adverse cardiac
event is the duration of interest. Consequently, in this application the notion
of survival is used in the more general sense of being free of well defined
major cardiac events. The patients were grouped according to four levels of
evidence of coronary artery disease. This allows a comparison of the event
times in these groups. It is easy to see that group is a very strong predictor
of event time; for example, individuals presenting with the lowest evidence
level in group 1 have around 98% chance to be event free compared to only
50% chance in group 4, which is the group of highest level of disease evidence.
Furthermore, jumps can be found at each event time point in the survival
curve estimates.
[Figure 2.7: Kaplan–Meier curves for different patient groups in the EBCT study. Curves: groups 1 to 4; x-axis: months; y-axis: survival.]
Throughout the book we assume continuous survival times preventing ties.
However, in many real-data applications, the researcher will be faced with
observations at the same time point. Different methods exist to handle
ties, which can produce different results. The exact method computes the
exact conditional probability under the proportional hazards assumption for
all tied events occurring before censored observations of the same or larger
time. This is equivalent to summing all terms of the marginal likelihood
that are consistent with the observed data. The method assumes that ties
are due to lack of precision in measuring survival times and computes all
possible orderings of tied event times (Kalbfleisch and Prentice 1980). The
exact method requires substantial computing resources for large data sets with
many ties. The discrete method assumes that events occurred at exactly
the same time and computes probabilities for events occurring to a set of
observations with tied event times. It consumes less computer resources than
the exact method. The most common method to handle ties is that by Breslow
(1974) using an approximate likelihood. Another method suggested later by
Efron (1977) also uses an approximate likelihood. If ties are not extensive,
the methods by Breslow and Efron provide satisfactory approximations to
the exact method for the continuous time-scale model. In general, Efron’s
approximation gives results that are much closer to the exact method results
than Breslow’s approximation does. If there are no ties, all three methods
result in the same likelihood and yield identical estimates. The Breslow
method is the most efficient one when there are no ties. With the exception
of R, nearly all statistical software packages use the Breslow method as the
default. Throughout the book we have used this method.
For the variance of the Kaplan–Meier estimate, the Greenwood formula (Greenwood 1926)
can be used, given by the expression
$$\mathrm{V}(\hat S(t)) = \hat S^2(t) \sum_{i:\, t_i \le t} \frac{d_i}{\#R(t_i)\,(\#R(t_i) - d_i)}$$
Figure 2.8 shows Nelson–Aalen plots of the four different levels of evidence
for coronary artery disease in the EBCT study of Example 1.1. In agreement
with the Kaplan–Meier curves, patients of group 1 show a good prognosis,
whereas patients in groups with higher levels of evidence of the disease are
faced with an increasing risk of major cardiac events.
[Figure 2.8: Nelson–Aalen curves for different patient groups in the EBCT study. Curves: groups 1 to 4; x-axis: months; y-axis: cumulative hazard.]
The Nelson–Aalen estimator is often used in choosing between different
parametric models. For this, one plots the estimator using a transformed
scale so that, if a given parametric model fits the data, the resulting graph
should be approximately linear. For example, a plot of $\hat M(t)$ versus t will be
approximately linear in the case of an exponential distribution. To check the
adequacy of a Weibull model, one plots $\ln(\hat M(t))$ versus ln(t), which should be approximately
linear. The event times of the different groups in Figure 2.8 follow the
exponential model reasonably well, with some deviations at later times in group 4.
Similar to the Kaplan–Meier plots, there are only jumps at event times. In
the case of continuous event times the functions S(·) and M(·) are related
by $S(t) = e^{-M(t)}$ (see (2.2)). Consequently, it seems reasonable to consider
$\bar S(t) = e^{-\hat M(t)}$ as an estimator of the survival function, which is different from
the Kaplan–Meier estimator and was suggested by Breslow in the discussion
of the paper by Cox (1972). Fleming and Harrington (1991) recommend
it as an interesting alternative and have pointed out that it has a slightly
smaller mean-squared error in some situations. Estimators for the hazard
function µ can be derived from the Nelson–Aalen estimator based on kernel
estimators. The formal derivations of the Kaplan–Meier and Nelson–Aalen
estimators and their asymptotic properties are performed in the framework of
counting processes (Aalen 1978, Fleming and Harrington 1991, Andersen et
al. 1993, Martinussen and Scheike 2006, Aalen et al. 2008). Both estimators
are asymptotically equivalent and quite close to each other, particularly when
the number of deaths is small relative to the number of individuals at risk.
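The relation between the two estimators can be sketched as follows, assuming the standard form of the Nelson–Aalen estimator, in which the cumulative hazard increases by d_i / #R(t_i) at each distinct event time t_i (this formula is stated here as an assumption, since it is not reproduced above). The function names are hypothetical.

```python
import math

def nelson_aalen(times, events):
    """Return [(t_i, M_hat(t_i))] at the distinct event times."""
    distinct = sorted({t for t, d in zip(times, events) if d == 1})
    curve, cum = [], 0.0
    for t in distinct:
        at_risk = sum(1 for s in times if s >= t)
        deaths = sum(1 for s, d in zip(times, events) if s == t and d == 1)
        cum += deaths / at_risk          # increment d_i / #R(t_i)
        curve.append((t, cum))
    return curve

def exp_minus_cumulative_hazard(times, events):
    """Survival estimate exp(-M_hat(t)) at the distinct event times."""
    return [(t, math.exp(-m)) for t, m in nelson_aalen(times, events)]
```

On the hypothetical sample with times 1, 2, 2, 3, 4, 5 and indicators 1, 1, 0, 1, 0, 1, the cumulative hazard at t = 3 is 1/6 + 1/5 + 1/3 = 0.7, giving a survival estimate of about 0.50 compared with 4/9 from the product-limit formula. The two are close when the number of deaths is small relative to the risk set.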
truly nonparametric, and $e^{\beta}$ denotes the hazard ratio between the two groups.
However, if X is continuous, a parametric form of h(·) is required. Inference
is now dependent on that parametric form but still independent of µ0 (t), and
the model is called a semiparametric model because of the parametric nature
of the covariate term and the nonparametric baseline hazard function. The
survival function given the covariates X is
$$S(t|X) = S_0(t)^{e^{\beta' X}},$$
where $S_0(t) = e^{-\int_0^t \mu_0(s)\,ds}$ denotes the baseline survival function and the
components of the vector β are unknown regression parameters. That means
the survival function of an individual with covariate vector X is a power of
the baseline survival function. The class of distributions generated by this
procedure is sometimes called Lehmann alternatives.
Two different approaches are possible with the proportional hazards model.
Sometimes it occurs that covariates have skewed distributions, for example,
when only a small fraction of the individuals are exposed to the risk factor of
interest. It is also very common that a large fraction of lifetimes is censored.
Especially in large cohort studies analyzing the effect of a rare exposure
on an event, the number of exposed cases may be very small. One may
then question the validity of inference based on asymptotic results. In the
parametric case, the baseline hazard is chosen from the class of parametric
lifetime distributions. For example, starting from (2.7), the likelihood function
in the Weibull model with θ = (λ, ν) is of the form
$$L(\beta, \theta) = \prod_{i=1}^{n} \big(\lambda\nu t_i^{\nu-1} e^{\beta' X_i}\big)^{\delta_i}\, e^{-\lambda t_i^{\nu} e^{\beta' X_i}} \qquad (2.10)$$
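As a sketch of how (2.10) might be evaluated in practice, the following computes the logarithm of the Weibull proportional hazards likelihood. The function name and the data layout (covariate vectors as lists of floats) are assumptions made for illustration; the text itself uses PROC NLMIXED in SAS for the maximization.

```python
import math

def weibull_ph_loglik(beta, lam, nu, times, events, covariates):
    """Log of (2.10); requires strictly positive observation times."""
    total = 0.0
    for t, d, x in zip(times, events, covariates):
        eta = sum(b * xi for b, xi in zip(beta, x))          # beta' X_i
        log_hazard = math.log(lam * nu) + (nu - 1.0) * math.log(t) + eta
        # delta_i * log hazard minus the cumulative hazard lam * t^nu * exp(beta' X_i)
        total += d * log_hazard - lam * t ** nu * math.exp(eta)
    return total
```

Such a function could be passed (with its sign reversed) to a general-purpose numerical optimizer; with ν = 1 and β = 0 it reduces to the exponential log-likelihood of Example 2.3.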
Example 2.6
Table 2.1 gives the results for the parametric proportional hazards model
with the Weibull, exponential, and Gompertz baseline hazard, respectively,
analyzing the Halluca data from Example 1.3.
Analysis was performed using PROC NLMIXED in SAS by maximizing the
likelihood function given in (2.10) and (2.11). The likelihood in the exponential
model is obtained from (2.10) with ν = 1. Both the Weibull and the Gompertz
model provide a much better fit compared to the exponential model based
on the values of the log-likelihood function. The cost of the better fit is
an additional parameter. The exponential model is nested in the Weibull
Table 2.1: Parametric proportional hazards models for
Halluca data.
parameter Weibull exponential Gompertz
Plugging this discrete cumulative baseline hazard function into the likelihood
(2.12) results in
$$\prod_{i=1}^{n} \Big[\mu_0(t_i)\, e^{\beta' X_i} \exp\Big(-\mu_0(t_i) \sum_{j \in R(t_i)} e^{\beta' X_j}\Big)\Big]^{\delta_i}, \qquad (2.13)$$
where R(t) denotes the risk set at time t containing all individuals (more
exactly their indices) that are still at risk of experiencing the event of interest
Plugging this solution into expression (2.13), a likelihood function for the
regression parameters is obtained:
$$L(\beta) = \prod_{i=1}^{n} \Bigg(\frac{e^{\beta' X_i}}{\sum_{j \in R(t_i)} e^{\beta' X_j}}\Bigg)^{\delta_i}. \qquad (2.15)$$
This expression is called partial likelihood and does not depend on the baseline
hazard µ0 (t), which simplifies parameter estimation. It is used to estimate
the regression coefficients in the semiparametric proportional hazards model.
The second derivative of the partial likelihood function is well behaved in the
sense that the negative of the second derivative is always positive definite (or
semidefinite). This is in contrast to the behavior of general likelihoods, which
are only known to fulfill this property locally around the parameter estimate.
The second derivative of the partial likelihood can be used to evaluate the
(asymptotic) variance.
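The partial likelihood (2.15) can be sketched directly from its definition. The code below assumes no tied event times and takes the risk set R(t_i) to be all individuals with observed time at least t_i; the function name and data layout are hypothetical.

```python
import math

def cox_partial_loglik(beta, times, events, covariates):
    """Log of the partial likelihood (2.15), assuming no tied event times."""
    scores = [sum(b * xi for b, xi in zip(beta, x)) for x in covariates]
    total = 0.0
    for i, (t_i, d_i) in enumerate(zip(times, events)):
        if d_i == 1:
            # Sum of exp(beta' X_j) over the risk set R(t_i)
            risk_sum = sum(math.exp(scores[j])
                           for j, t_j in enumerate(times) if t_j >= t_i)
            total += scores[i] - math.log(risk_sum)
    return total
```

The baseline hazard does not appear anywhere in the computation, and only the ordering of the observation times matters: rescaling all times by a common positive factor leaves the value unchanged.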
Inference for the Cox estimator is almost exclusively based on asymptotic
results (Andersen and Gill 1982). The validity of these large sample properties
has been found acceptable with moderately large sample sizes, a moderate
amount of censoring, and balanced covariates. This semiparametric model
is the most often applied one in survival analysis. It is implemented in all
statistical packages, is very easy to handle, and results allow an easy and
intuitive interpretation.
Expression (2.14) forms the basis for the well-known Breslow estimator of
the cumulative hazard function
$$\hat M_0(t) = \sum_{i=1}^{n} \frac{1(t_i \le t)\,\delta_i}{\sum_{j \in R(t_i)} e^{\beta' X_j}}.$$
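Hypothesis testing in the Cox model for a single regression parameter, usually
for testing the null hypothesis $H_0: \beta_j = \beta_0$ (with $\beta_0$ as specified value, often
zero) versus the alternative hypothesis $H_A: \beta_j \ne \beta_0$, can be performed in
three different ways. The first one is a likelihood ratio test based on the
partial likelihood L(·), treating it as an ordinary likelihood function. Under
the null hypothesis, the test statistic
$$T = -2 \log \frac{L(\beta_0)}{L(\hat\beta_j)}$$
follows asymptotically a $\chi^2$ distribution with one degree of freedom. The
second one is the Wald test with test statistic
$$T = \Big(\frac{\hat\beta_j - \beta_0}{\mathrm{se}(\hat\beta_j)}\Big)^2,$$
and the third one is the score test with test statistic
$$T = -\frac{l_1^2}{l_2}$$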
where
$$l_1 = \frac{\partial \log L(\beta)}{\partial \beta_j}\Big|_{\beta_j = \beta_0}$$
and
$$l_2 = \frac{\partial^2 \log L(\beta)}{\partial \beta_j^2}\Big|_{\beta_j = \beta_0}$$
are the first- and second-order derivatives of the logarithm of the partial
likelihood function, evaluated at the value specified by the null hypothesis. Similar to the other
two tests, the test statistic follows under the null hypothesis asymptotically a
$\chi^2$ distribution with one degree of freedom.
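To illustrate how the three statistics are computed for a scalar parameter, the sketch below uses the exponential log-likelihood log L(λ) = d log λ − λ Σ t_i from Example 2.3 as a simple stand-in for the partial likelihood. This substitution is an assumption made only to keep the example self-contained. Here d is the number of events, the score statistic uses derivatives evaluated at the hypothesized value, and all names are hypothetical.

```python
import math

def exp_loglik(lam, d, total_time):
    return d * math.log(lam) - lam * total_time

def three_tests(d, total_time, lam0):
    """Likelihood ratio, Wald, and score statistics for H0: lam = lam0."""
    lam_hat = d / total_time
    lr = -2.0 * (exp_loglik(lam0, d, total_time) - exp_loglik(lam_hat, d, total_time))
    se = lam_hat / math.sqrt(d)          # sqrt(-1 / l2) evaluated at lam_hat
    wald = ((lam_hat - lam0) / se) ** 2
    l1 = d / lam0 - total_time           # first derivative at lam0
    l2 = -d / lam0 ** 2                  # second derivative at lam0
    score = -l1 ** 2 / l2
    return {"lr": lr, "wald": wald, "score": score}

def chi2_1_pvalue(stat):
    """Upper tail probability of the chi-square distribution with one degree of freedom."""
    return math.erfc(math.sqrt(max(stat, 0.0) / 2.0))
```

The three statistics generally differ in finite samples, although they share the same asymptotic distribution under the null hypothesis.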
Example 2.7
In the following the Cox model is applied to the data from the Halluca study.
In Table 2.2 the parameter estimates of the (semiparametric) Cox model are
given. For comparison, the parameter estimates from the Weibull proportional
hazards model are given as well. Despite the fact that the baseline hazard in
The partial likelihood depends only on the ranking of the observations, that
is, the order in which the events happen, but not on the actual times. This can
be illustrated by varying observations, keeping the ranks unchanged. Then
there is no change in the parameter estimate, but if the ranks are changed,
the estimate is changed. In this sense, the parameter estimator is a step
function of the observed times and not a continuous function. This underlines the importance of
measuring event times as accurately as possible (for example, in days) to get
the ranks right and to prevent tied observations. If this is impossible because
of substantial measurement error in the event times (as with cancer diagnosis),
it might be preferable to use a parametric proportional hazards model to avoid discontinuities
in the likelihood function.
Samuelsen (2003) investigates advantages and limitations of exact inference
in the proportional hazards regression model and compares it with logistic and
conditional logistic regression.
Note that covariate vector X could vary with time, but this is beyond the
scope of this book.
Example 2.8
In the present analysis, PROC LIFEREG from SAS is used to apply Weibull,
exponential, and log-logistic AFT models to the Halluca lung cancer data from
Example 1.3. The results are given in Table 2.3. As already mentioned in
the paragraph about the parametric proportional hazards model, the model
with a parametric Weibull baseline hazard function is also an AFT model,
and parameters can be transformed from one model to the other. Denoting
the regression parameters in the AFT model by $\beta^*$, the parameters of the
proportional hazards model satisfy $\beta = -\nu\beta^*$ with ν from the Weibull baseline
hazard. Parameter λ is given by $\lambda = \exp(-\nu \times \text{intercept})$. The interpretation
of the regression parameters in both models is completely different. Note
the opposite sign of the model parameters in the proportional hazards and
AFT models. The hazard of patients with disease stage IV is around
$e^{1.358} = 3.89$ times higher compared to patients with stage I (reference group)
in the proportional hazards model, interpreted as relative risk. In the AFT
model the expression $e^{-1.584} = 0.21$ indicates a reduction of survival time
by this factor, meaning that patients in stage group IV experience the event
around five times faster than patients in stage group I.
Table 2.3: Parametric AFT models for Halluca data
parameter Weibull exponential log-logistic
Similar to the parametric proportional hazards model, the Weibull model
shows a significant improvement compared to the exponential model. The
exponential AFT model is not nested in the log-logistic AFT model. Hence,
for comparison of the two models the Akaike Information Criterion (AIC)
should be used (Akaike 1974). The AIC statistic is given by the expression
−2 log-likelihood + 2 #parameters, where #parameters denotes the number of
model parameters. Consequently, the AIC value is 8.11 for the comparison of
the exponential and log-logistic model. This favors the log-logistic model,
which is not a proportional hazards model, over the exponential model. Because
the Weibull and log-logistic models have the same number of parameters, they
can be compared directly by their log-likelihoods, and the Weibull model performs better. The
AFT model is more robust with respect to unobserved covariates compared
to the proportional hazards model (Hougaard et al. 1994, Hougaard 1999).
Orbe et al. (2002) compare Cox and AFT models in detail and discuss their
advantages and limitations.
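The parameter transformation between the Weibull AFT and proportional hazards formulations, and the AIC comparison, can be sketched as follows. The function names and the numerical values in the usage note are hypothetical and are not taken from the Halluca analysis.

```python
import math

def aft_to_ph(beta_aft, intercept, nu):
    """Map Weibull AFT parameters to proportional hazards parameters.

    Uses beta = -nu * beta_aft and lam = exp(-nu * intercept), where nu is
    the shape parameter of the Weibull baseline hazard.
    """
    beta_ph = [-nu * b for b in beta_aft]
    lam = math.exp(-nu * intercept)
    return beta_ph, lam

def aic(log_likelihood, n_parameters):
    """Akaike Information Criterion; smaller values indicate the preferred model."""
    return -2.0 * log_likelihood + 2.0 * n_parameters
```

For instance, with hypothetical values ν = 0.5, an AFT coefficient of −2.0, and an intercept of 1.0, the proportional hazards coefficient is 1.0 and λ = e^{−0.5}. The opposite signs reflect that a shorter survival time corresponds to a higher hazard.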
Example 2.9
(Tsiatis 1975) Let T ∗ and C be nonnegative random variables with joint
cumulative distribution function
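$$H(t^*, c) = 1 - e^{-\lambda t^*} - e^{-\mu c} + e^{-\lambda t^* - \mu c - \vartheta t^* c}.$$
$$\begin{aligned}
H_0(t) &= \iint_{\{c < t,\; c < t^*\}} h(t^*, c)\,dt^*\,dc \\
&= \int_0^t \big(H_c(\infty, c) - H_c(c, c)\big)\,dc \\
&= \int_0^t (\mu + \vartheta c)\, e^{-\lambda c - \mu c - \vartheta c^2}\,dc
\end{aligned}$$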
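$$\begin{aligned}
\bar H_0(t) &= P(\bar T < t, \bar\Delta = 0) \\
&= \iint_{\{c < t,\; c < t^*\}} f(t^*)\, g(c)\,dt^*\,dc \\
&= \int_0^t (1 - F(c))\, g(c)\,dc \\
&= \int_0^t (\mu + \vartheta c)\, e^{-\lambda c - \mu c - \vartheta c^2}\,dc
\end{aligned}$$
and
$$\begin{aligned}
\bar H_1(t) &= P(\bar T < t, \bar\Delta = 1) \\
&= \iint_{\{t^* < t,\; t^* \le c\}} f(t^*)\, g(c)\,dt^*\,dc \\
&= \int_0^t (1 - G(t^*))\, f(t^*)\,dt^* \\
&= \int_0^t (\lambda + \vartheta t^*)\, e^{-\lambda t^* - \mu t^* - \vartheta (t^*)^2}\,dt^*.
\end{aligned}$$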
That means $H_0(t) = \bar H_0(t)$ and $H_1(t) = \bar H_1(t)$. Hence, the distributions of
the observable random variables $(T, \Delta)$ and $(\bar T, \bar\Delta)$ are equal. It is impossible
to distinguish between the dependent and the independent model based on
the observed data (T, ∆). One possibility to circumvent this problem is the
inclusion of observed covariates into the model.
Chapter 3
Univariate Frailty Models
This chapter focuses on the analysis of univariate data, for example, event
times of unrelated individuals. Basic survival models deal with the simplest
case of independent and identically distributed data. This is based on the
assumption that the study population is homogeneous up to some observed
covariates. Such models were considered in the last chapter. However,
it is a basic observation that individuals differ greatly, for example, with
respect to the effects of a drug, a treatment, or the influence of various
explanatory variables. This heterogeneity is often referred to as variability,
and it is one of the important sources of variability in medical, epidemiological
and biological applications. The topic of this chapter is unobserved heterogeneity
in survival analysis. This heterogeneity may be difficult to assess, but
it is nevertheless of great importance. In recent decades, a large number of
papers on frailty models have appeared. The key idea of these models is that
individuals have different frailties, and that the most frail will die earlier than
the less frail. Consequently, systematic selection of robust individuals takes
place, which biases what is observed. When mortality rates are estimated,
one may be interested in how they change over time or age. Quite often, they
rise at the beginning of the observation period, reach a maximum, and then
decline (unimodal hazard) or level off. This, for example, is typical for death
rates of cancer patients, meaning that the longer the patient lives beyond
diagnosis and treatment, the better her or his chances of survival are. But
it is often an open question whether this reflects changes in the individual
hazard. It is likely that unimodal hazards are often the result of selection and
that they do not reflect an underlying development on the individual level.
The population hazard starts to decline simply because high-risk individuals
have already died, but the hazard of a given individual might well continue
to increase.
If covariates are observed, then they can be included in the analysis, for
example, by using the proportional hazards model. However, it is nearly
impossible to include all important risk factors, perhaps because the researcher
has little or no information on the individual level. This applies, for example,
to population studies where often the only known variables are sex and age.
Furthermore, we may not know the relevance of the risk factor or even that
the factor exists. In other cases it may be impossible to measure the risk
factor without great financial cost or time effort. In such cases it is useful to
The problem is that what can be observed in a study population is not the
conditional hazard but the net result for all the individuals with different
values of the random variable Z. The individual frailty variable Z is assumed
to be constant over time throughout the book. Approaches that relax this
restriction are discussed in Section 7.7.
It is quite clear that a multiplicative frailty model such as (3.1) represents
a rather simplified view of how heterogeneity might act. Nevertheless, simple
mathematical models represent one way of understanding the consequences
of heterogeneity. The assumptions that the frailty is time-independent and
that it acts multiplicatively on the underlying baseline hazard function are
arbitrary, but they have been taken as the basis for much subsequent work on
unobserved heterogeneity in survival analysis.
For the sake of completeness, we mention other classes
of frailty models that are not based on the proportional hazards assumption.
One example is the additive frailty model, in which frailty acts additively on the
baseline hazard function. For more details of this model, see Rocha (1996),
Silva and Amaral-Turkman (2004), and Tomazella et al. (2006). A more
general model, including the additive as well as the multiplicative frailty model
as special cases, is considered by Gupta and Gupta (2009). Proportional odds
frailty models are covered in detail by Lam et al. (2002) and Lam and Lee
(2004). Murphy et al. (1997) point out the link between proportional odds
models and frailty models. AFT frailty models are dealt with, for example,
by Anderson and Louis (1995), who use the model in bivariate survival with
parametric and nonparametric frailty distributions. Keiding et al. (1997)
focus on the effect of heterogeneity caused by omitted covariates and found the
AFT model more stable than the proportional hazards model in the presence
of heterogeneity. Klein et al. (1999) consider a normal regression model
based on log-normal frailty distribution. Pan (2001) proposes a model for
correlated failure times by modeling the error term of the AFT model with
a frailty approach. Because of instability problems in Pan’s EM algorithm,
Zhang and Peng (2007) suggest a semiparametric estimation procedure based
on M-estimates and the EM algorithm. Xu and Zhang (2010) develop a more
stable estimation procedure in the semiparametric gamma frailty AFT model.
Lambert et al. (2004) study a parametric AFT model with an additive frailty
term. Further interesting papers in this direction are Schnier et al. (2004),
Chang (2004) and Komárek et al. (2007). Sankaran and Gleeja (2008) suggest
the use of proportional reversed hazards frailty models, which are of interest
in the case of left-truncated event time data.
It is natural to introduce observed covariates into model (3.1) similar to the
Cox model by
\[
\mu(t|X, Z) = Z \mu_0(t)\, e^{\beta' X} \tag{3.2}
\]
\[
S(t|Z) = e^{-\int_0^t \mu(s|Z)\, ds} = e^{-Z \int_0^t \mu_0(s)\, ds} = e^{-Z M_0(t)}. \tag{3.4}
\]
Here $M_0(t) = \int_0^t \mu_0(s)\, ds$ denotes the cumulative baseline hazard function,
and equation (3.4) is a generalization of relation (2.2). Up to now, the model
has been described at the individual (conditional) level. However, data for this
individual model are not observable. Consequently, it is necessary to consider
the population level where the frailty term is integrated out. The population
survival function is the weighted mean of the conditional survival functions
with weights given by the density function of the frailty distribution. The
population survival function is obtained from the conditional survival function
S(t|Z) by integrating out the frailty. It can be viewed as the (unconditional)
survival function of an individual randomly drawn from the study population,
and corresponds to what can actually be observed:
example, density and hazard function of the event times and expectation and
variance of the frailty can be characterized by the Laplace transform of the
frailty distribution and its derivatives
assuming the existence of the foregoing expressions. For example, in the case
of a positive stable distribution the moments do not exist. Here, L′ and L′′
denote the first and second derivative of the Laplace transform. Thus, if the
Laplace transform has a simple form, performing this calculation is easy. The
connection with the Laplace transform was first pointed out and exploited
by Hougaard (1984, 1986a,b). It follows that, when seeking distributions for
the frailty variable Z, it is natural to use those which have explicit Laplace
transforms. This simplifies parameter estimation.
The frailty distribution describes the frailty in the population at the start of the
follow-up. Frailty is assumed to be fixed for each individual over time, but the
composition of the population changes as time goes by. On average, the more
frail individuals die earlier. As a result, the frailty distribution in the
population at risk changes over time. The following theorem points out this
fact and establishes the link between the conditional and the unconditional
model.
THEOREM 3.1
(Vaupel et al. 1979) Assume a frailty model given by formula (3.1). The
population hazard $\mu(t) = f(t)/S(t)$ is generally $\mu(t) = E(\mu(t|Z) \mid T > t)$, or more
specifically,
\begin{align*}
\mu(t) &= \int_0^\infty \mu(t|z)\, f(z|T > t)\, dz \\
&= \mu_0(t) \int_0^\infty z\, f(z|T > t)\, dz, \tag{3.7}
\end{align*}
where $f(z|T > t)$ represents the frailty density among the survivors of time
point t.
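The statement of the theorem can be checked numerically. The following Python sketch is only an illustration under stated assumptions: it uses a hypothetical three-point frailty distribution and a constant baseline hazard (neither is taken from the text), computes the population hazard as $\mu_0(t)$ times the mean frailty among survivors, and compares it with the direct definition $\mu(t) = -\frac{d}{dt}\log S(t)$.

```python
import math

def population_survival(t, M0, points, probs):
    """Population survival: weighted mean of S(t | z) = exp(-z * M0(t))."""
    return sum(p * math.exp(-z * M0(t)) for z, p in zip(points, probs))

def population_hazard(t, mu0, M0, points, probs):
    """Population hazard mu0(t) * E(Z | T > t) for a discrete frailty."""
    # Unnormalized weights of each frailty value among survivors of t.
    weights = [p * math.exp(-z * M0(t)) for z, p in zip(points, probs)]
    mean_frailty = sum(z * w for z, w in zip(points, weights)) / sum(weights)
    return mu0(t) * mean_frailty

# Hypothetical example: constant baseline hazard 0.1 and three frailty values.
mu0 = lambda t: 0.1
M0 = lambda t: 0.1 * t
points, probs = [0.5, 1.0, 2.0], [0.3, 0.4, 0.3]

t, h = 10.0, 1e-5
via_theorem = population_hazard(t, mu0, M0, points, probs)
# Direct definition, approximated by a central difference of log S(t).
via_definition = -(math.log(population_survival(t + h, M0, points, probs))
                   - math.log(population_survival(t - h, M0, points, probs))) / (2 * h)
```

Both quantities should agree up to the error of the finite difference, and the population hazard at t = 10 lies below its value at t = 0 because the mean frailty among survivors has declined.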
[Figure: conditional hazards for Z = 2, Z = 1, and Z = 0.5 together with the unconditional hazard; horizontal axis: time (0 to 50), vertical axis: hazard.]
\begin{align*}
EZ &= \pi\eta_1 + (1 - \pi)\eta_2 \\
V(Z) &= \pi\eta_1^2 + (1 - \pi)\eta_2^2 - (\pi\eta_1 + (1 - \pi)\eta_2)^2 \\
&= \pi(1 - \pi)(\eta_1 - \eta_2)^2.
\end{align*}
The survival and density function can now be easily derived from the Laplace
transform by using (3.5)
[Figure: hazards of subpopulation 1, subpopulation 2, and the mixed population; horizontal axis: age (0 to 100), vertical axis: hazard.]
\[
\eta_2 = \frac{1 - \pi\eta_1}{1 - \pi}. \tag{3.11}
\]
Consequently, the model contains, with π and η1 , two additional parameters
compared to the model without binary frailty. Note that the condition EZ = 1
is not necessary in the binary frailty model, but is used here to make models
comparable. Using (2.7) and highlighting the dependence of the baseline
hazard distribution on parameter vector θ, the likelihood function in the
binary frailty model with η = η1 is of the form
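\begin{align*}
L(\theta, \eta, \pi) = \prod_{i=1}^{n} &\left( \frac{\pi\eta\, e^{-\eta M_0(t_i;\theta)} + (1 - \pi\eta)\, e^{-\frac{1-\pi\eta}{1-\pi} M_0(t_i;\theta)}}{\pi e^{-\eta M_0(t_i;\theta)} + (1 - \pi)\, e^{-\frac{1-\pi\eta}{1-\pi} M_0(t_i;\theta)}}\, \mu_0(t_i;\theta) \right)^{\delta_i} \tag{3.12} \\
&\times \left( \pi e^{-\eta M_0(t_i;\theta)} + (1 - \pi)\, e^{-\frac{1-\pi\eta}{1-\pi} M_0(t_i;\theta)} \right).
\end{align*}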
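As an illustration of how (3.12) might be coded, the following Python sketch evaluates the log-likelihood for right-censored data. It is a minimal sketch, not the SAS implementation used in the book. The Weibull parameterization $M_0(t) = \lambda t^{\gamma}$ and the function and argument names are assumptions introduced here for illustration.

```python
import math

def binary_frailty_loglik(times, events, lam, gamma, eta, pi):
    """Log of likelihood (3.12) for the binary frailty model.

    Assumption (not from the text): Weibull baseline with
    M0(t) = lam * t**gamma and mu0(t) = lam * gamma * t**(gamma - 1).
    """
    if not (0.0 < pi < 1.0) or eta <= 0.0:
        return -math.inf
    eta2 = (1.0 - pi * eta) / (1.0 - pi)   # constraint (3.11), so that EZ = 1
    if eta2 < 0.0:
        return -math.inf
    loglik = 0.0
    for t, d in zip(times, events):
        M0 = lam * t ** gamma
        mu0 = lam * gamma * t ** (gamma - 1.0)
        s1, s2 = math.exp(-eta * M0), math.exp(-eta2 * M0)
        surv = pi * s1 + (1.0 - pi) * s2                      # population survival S(t_i)
        dens_factor = pi * eta * s1 + (1.0 - pi * eta) * s2   # f(t_i) / mu0(t_i)
        if d:
            loglik += math.log(mu0 * dens_factor / surv)      # log population hazard
        loglik += math.log(surv)
    return loglik
```

For $\eta = 1$ both mass points coincide and the function reduces to the log-likelihood of the Weibull model without frailty, which gives a simple consistency check. In practice the function would be passed to a numerical optimizer.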
Denote by π(t) the proportion of individuals with the disease genotype (sub-
population 1) at time point t (e.g., π = π(0)). Consequently, 1 − π(t) denotes
the proportion of the second subpopulation without the disease genotype.
Here, special focus is on the proportion π(t). This quantity changes over time
as a result of the selection process. Selection takes place if the lifetimes of
different subpopulations follow different mortality patterns. To calculate π(t),
we introduce the conditional survival function for individuals with the disease
genotype and without the disease genotype. Conditioning is with respect to
the event {J = j}, meaning that the individual belongs to subpopulation j
(j = 1, 2): $P(T > t|J = j) = e^{-\eta_j M_0(t)}$. By using Bayes' formula we
can now write the proportion in the form
\begin{align*}
\pi(t) = P(J = 1|T > t) &= \frac{P(T > t, J = 1)}{P(T > t)} \\
&= \frac{P(T > t|J = 1)P(J = 1)}{P(T > t|J = 1)P(J = 1) + P(T > t|J = 2)P(J = 2)} \\
&= \frac{\pi e^{-\eta_1 M_0(t)}}{\pi e^{-\eta_1 M_0(t)} + (1 - \pi) e^{-\eta_2 M_0(t)}}. \tag{3.13}
\end{align*}
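The selection process described by (3.13) can be made concrete with a few lines of Python. The parameter values below are hypothetical and chosen only to show how the share of the high-risk subpopulation shrinks as the cumulative baseline hazard grows.

```python
import math

def subpopulation_share(M0_t, pi, eta1, eta2):
    """Proportion pi(t) of subpopulation 1 among survivors, as in (3.13)."""
    a = pi * math.exp(-eta1 * M0_t)
    b = (1.0 - pi) * math.exp(-eta2 * M0_t)
    return a / (a + b)

# Hypothetical values: 20% high-risk individuals with eta1 = 3.
pi, eta1 = 0.2, 3.0
eta2 = (1.0 - pi * eta1) / (1.0 - pi)   # from (3.11); approximately 0.5
# Shares evaluated at increasing values of the cumulative baseline hazard M0(t).
shares = [subpopulation_share(M, pi, eta1, eta2) for M in (0.0, 0.5, 1.0, 2.0)]
```

The first entry equals the initial proportion π, and the later entries decrease because individuals with the larger frailty value die earlier on average.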
The binary frailty is sometimes also called two-point frailty. Nickell (1979)
used this model to account for heterogeneity in unemployment spell data. The
population was divided into groups of motivated and nonmotivated job
searchers. Furthermore, this frailty distribution was used by Vaupel and Yashin
(1985) to discuss ideas of the heterogeneity and selection concept in detail,
and by Schumacher et al. (1987) to model heterogeneity in clinical trials. A
special case of binary frailty is dividing the population into a proportion that is
at risk and a proportion that is never at risk. The terminology to describe the
never-at-risk group varies from field to field. It includes "long-term survivors"
(Farewell 1982) or "cured" in epidemiology (Price and Manatunga 2001),
"nonsusceptibles" in toxicology (Pack and Morgan 1990), "stayers" in finite
Markov transition models of occupational mobility (Blumen et al. 1955), the
"nonfecundable" in fertility models (Heckman and Walker 1990), and
"nonrecidivists" among convicted criminals (Schmidt and Witte 1989, Maller and
Zhou 2002). We will use the term cure model to describe such models and
come back to this problem later on in Section 3.12 and Section 5.8.
Example 3.1
In Table 3.1 the results are given for the parametric binary frailty model
with Weibull baseline hazard analyzing the Halluca lung cancer data. For
comparison, the results of the parametric proportional hazards model with
Weibull baseline hazard are given as well. The analysis was performed using
PROC NLMIXED in SAS by specifying the likelihood function given in (3.12),
substituting the expressions $\mu_0(t_i)$ and $M_0(t_i)$ with their regression
counterparts $\mu_0(t_i)e^{\beta' X_i}$ and $M_0(t_i)e^{\beta' X_i}$, respectively. The two-point frailty model
[Figure: proportions of subpopulation 1 and subpopulation 2; horizontal axis: age (0 to 100), vertical axis: proportion (0 to 1).]
with Weibull baseline hazard provides a better fit compared to the Weibull
proportional hazards model based on the comparison of the log-likelihood (see
Table 3.1). The two mass points $\eta_1$ and $\eta_2$ of the frailty differ greatly from each
other. The model detects a very small subgroup of individuals with extremely
high risk of death. Further detailed analysis shows that this group consists of
patients in whom lung carcinoma was detected after death in the pathology
(cluster 4 in Figure 1.1). Hence, for these patients, the event time is zero.
The standard error for η2 is not given because it is not a free parameter in
the model (see (3.11)).
Table 3.1: Parametric binary frailty model for
Halluca data
parameter Weibull PH two-point frailty
Similar to the binary frailty model, denote by $\pi_j(t)$ the size of the fraction of
subpopulation j at time point t (e.g., $\sum_{j=1}^{k} \pi_j(t) = 1$) and $\pi_j = \pi_j(0)$. We are
now interested in the behavior of $\pi_j(t)$. To calculate this function we introduce
the conditional survival function. Again, conditioning is with respect to the
event {J = j}, meaning that the individual belongs to the subpopulation j
(j = 1, 2, . . . , k).
\[
S_j(t) = P(T > t|J = j) = e^{-\eta_j M_0(t)} = e^{-\eta_j \int_0^t \mu_0(s)\, ds}
\]
Example 3.2
In addition to binary frailty, a three-point as well as a four-point frailty
distribution with Weibull baseline hazard function was applied to the Halluca
data. The results are given in the third and fourth columns of Table 3.2. Again,
the small subgroup with extremely high risk seen in the two-point frailty model
can be found. Furthermore, the log-likelihood indicates a further improvement
in the model by additional mass points in the discrete frailty distribution. It
turns out that a five-point frailty model does not provide further significant
improvement in the fit to the Halluca data (results are not shown).
We would like to point out that, in the literature, this model with a fixed
number of discrete mass points is sometimes called a nonparametric frailty
model, which is of course a misleading name. The notion of nonparametric
(discrete) frailty should be reserved for models where the number of mass
points is a random variable, which means the number of mass points is an
additional parameter to be estimated (Heckman and Singer 1982b, dos Santos
Table 3.2: Discrete frailty models for Halluca data
parameter 2-point frailty 3-point frailty 4-point frailty
[Figure: densities of the gamma distribution for σ² = 1, 0.5, 0.25, and 0.125; horizontal axis from 0 to 4, vertical axis: density.]
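\begin{align*}
L(u) &= \int_0^\infty e^{-uz}\, \frac{1}{\Gamma(k)} \lambda^k z^{k-1} e^{-\lambda z}\, dz \\
&= \frac{\lambda^k}{(\lambda + u)^k} \int_0^\infty \frac{1}{\Gamma(k)} (\lambda + u)^k z^{k-1} e^{-(\lambda + u)z}\, dz \\
&= \left(1 + \frac{u}{\lambda}\right)^{-k}.
\end{align*}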
Here, the equivalence of the second and third line is a consequence of the
integration over the density of a gamma distribution with parameters k and
λ + u. The first and second derivatives of the Laplace transform are
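\begin{align*}
L'(u) &= -\frac{k}{\lambda} \left(1 + \frac{u}{\lambda}\right)^{-k-1} \tag{3.14} \\
L''(u) &= \frac{k(k + 1)}{\lambda^2} \left(1 + \frac{u}{\lambda}\right)^{-k-2}. \tag{3.15}
\end{align*}
Evaluating these derivatives at u = 0 implies
\begin{align*}
EZ &= \frac{k}{\lambda} \\
V(Z) &= \frac{k(k + 1)}{\lambda^2} - \frac{k^2}{\lambda^2} = \frac{k}{\lambda^2}
\end{align*}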
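The moment formulas can be verified numerically from the Laplace transform alone, using $EZ = -L'(0)$ and $V(Z) = L''(0) - L'(0)^2$. The Python sketch below approximates the derivatives by central differences; the parameter values and the step size h are hypothetical choices made for this illustration.

```python
def laplace_gamma(u, k, lam):
    """Laplace transform of a gamma distribution: L(u) = (1 + u/lam)**(-k)."""
    return (1.0 + u / lam) ** (-k)

def moments_from_laplace(L, h=1e-4):
    """Approximate EZ = -L'(0) and V(Z) = L''(0) - L'(0)**2 by central differences."""
    d1 = (L(h) - L(-h)) / (2.0 * h)
    d2 = (L(h) - 2.0 * L(0.0) + L(-h)) / h ** 2
    return -d1, d2 - d1 ** 2

k, lam = 2.0, 4.0   # hypothetical shape and rate parameters
mean, var = moments_from_laplace(lambda u: laplace_gamma(u, k, lam))
# Closed forms from the text for comparison: EZ = k/lam and V(Z) = k/lam**2.
```

The same helper works for any frailty distribution with an explicit Laplace transform, provided the moments exist.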
To make sure that the model is identifiable, the restriction $k = \lambda$ is used
for the gamma distribution, which results in EZ = 1. Denote by $\sigma^2 := 1/\lambda$ the
variance of the frailty variable. The density of a gamma-distributed random
variable $Z \sim \Gamma(1/\sigma^2, 1/\sigma^2)$ is given by
\[
f(z) = \frac{1}{\Gamma(1/\sigma^2)} \left(\frac{1}{\sigma^2}\right)^{1/\sigma^2} z^{1/\sigma^2 - 1} \exp\left(-\frac{z}{\sigma^2}\right) \tag{3.16}
\]
and is depicted in Figure 3.4.
\[
S(t) = L(M_0(t)) = \frac{1}{(1 + \sigma^2 M_0(t))^{1/\sigma^2}}. \tag{3.17}
\]
\[
f(t) = \frac{\mu_0(t)}{(1 + \sigma^2 M_0(t))^{1/\sigma^2 + 1}}
\]
\[
\mu(t) = \frac{\mu_0(t)}{1 + \sigma^2 M_0(t)}. \tag{3.18}
\]
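Formulas (3.17) and (3.18) translate directly into code. The Python sketch below also checks (3.17) by integrating the conditional survival function $e^{-zM_0(t)}$ against the gamma density (3.16) with a simple midpoint rule. The integration settings (n, z_max) are ad hoc assumptions that are adequate for the hypothetical values used here, not a general-purpose quadrature.

```python
import math

def gamma_frailty_survival(M0_t, sigma2):
    """Population survival (3.17): S(t) = (1 + sigma2 * M0(t))**(-1/sigma2)."""
    return (1.0 + sigma2 * M0_t) ** (-1.0 / sigma2)

def gamma_frailty_hazard(mu0_t, M0_t, sigma2):
    """Population hazard (3.18): mu(t) = mu0(t) / (1 + sigma2 * M0(t))."""
    return mu0_t / (1.0 + sigma2 * M0_t)

def survival_by_integration(M0_t, sigma2, n=20000, z_max=40.0):
    """Integrate exp(-z * M0(t)) against the gamma density (3.16), midpoint rule."""
    k = 1.0 / sigma2
    log_norm = k * math.log(k) - math.lgamma(k)   # log of the density's constant
    dz = z_max / n
    total = 0.0
    for i in range(n):
        z = (i + 0.5) * dz
        total += math.exp(log_norm + (k - 1.0) * math.log(z) - k * z - z * M0_t) * dz
    return total

# Hypothetical values: frailty variance 0.5 and cumulative baseline hazard 1.3.
closed_form = gamma_frailty_survival(1.3, 0.5)
numerical = survival_by_integration(1.3, 0.5)
```

The two values should agree closely, which illustrates that the population survival function is the frailty-weighted mean of the conditional survival functions.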
Furthermore, it turns out that the assumption about a gamma distribution
of the frailty at the start of the follow-up yields some useful mathematical
results, which will be discussed in detail in the following. Mainly for reasons
of convenience, frailty is assumed to be constant over time for each individual.
But because of the selection process that takes place as time goes by, the
distribution of the frailty in the population still at risk changes over time.
In the following presentation we extend the model by including Cox-type
regression terms in the formulas, that is, by substituting the cumulative
baseline hazard $M_0(t)$ with $M_0(t)e^{\beta' X}$. We will show that the frailty among
survivors of a specific time point is still gamma distributed but now with
new parameters depending on the original parameter $\sigma^2$ and the cumulative
hazard function $M_0(t)e^{\beta' X}$. Furthermore, a similar result can be obtained for
individuals dying at a specific time point. These results are helpful for the
development of estimation procedures especially in semiparametric gamma
frailty models. Using relations (3.4), (3.8), (3.16), and (3.17), the density of
the frailty distribution among the survivors (indicated by the condition T > t)
can be written in the form
\[
f(z|X, T > t) = \frac{\left(\frac{1}{\sigma^2} + M_0(t)e^{\beta' X}\right)^{1/\sigma^2}}{\Gamma(1/\sigma^2)}\, z^{1/\sigma^2 - 1} \exp\left(-z\left(\frac{1}{\sigma^2} + M_0(t)e^{\beta' X}\right)\right),
\]
which is the density of a gamma distribution with the same value of the
shape parameter $1/\sigma^2$ as at the beginning of the follow-up. The value of the second
parameter, however, is now given by $1/\sigma^2 + M_0(t)e^{\beta' X}$. In a similar way (3.9)
can be used to calculate the frailty distribution of individuals who die at time
point t:
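\begin{align*}
f(z|X, T = t) &= \frac{\left(\frac{1}{\sigma^2} + M_0(t)e^{\beta' X}\right)^{1/\sigma^2 + 1}}{\Gamma(1/\sigma^2)\, \frac{1}{\sigma^2}}\, z^{1/\sigma^2} \exp\left(-z\left(\frac{1}{\sigma^2} + M_0(t)e^{\beta' X}\right)\right) \\
&= \frac{\left(\frac{1}{\sigma^2} + M_0(t)e^{\beta' X}\right)^{1/\sigma^2 + 1}}{\Gamma(1/\sigma^2 + 1)}\, z^{1/\sigma^2 + 1 - 1} \exp\left(-z\left(\frac{1}{\sigma^2} + M_0(t)e^{\beta' X}\right)\right).
\end{align*}
This is a gamma density with the same scale parameter $1/\sigma^2 + M_0(t)e^{\beta' X}$ as
among those surviving to t but with shape parameter $1/\sigma^2 + 1$. In particular,
it follows that the mean frailty among the deaths at age t is
\[
E(Z|X, T = t) = \frac{1 + \sigma^2}{1 + \sigma^2 M_0(t)e^{\beta' X}} \tag{3.19}
\]
compared to
\[
E(Z|X, T > t) = \frac{1}{1 + \sigma^2 M_0(t)e^{\beta' X}} \tag{3.20}
\]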
among the survivors at the same age. This demonstrates the selection by
early death of the high-risk individuals. The individuals dying at time t have a
higher mean frailty compared to the survivors of this time point. Furthermore,
it holds that the frailty variance among the individuals dying at time t is
\[
V(Z|X, T = t) = \frac{\sigma^2 (1 + \sigma^2)}{(1 + \sigma^2 M_0(t)e^{\beta' X})^2} \tag{3.21}
\]
and
\[
V(Z|X, T > t) = \frac{\sigma^2}{(1 + \sigma^2 M_0(t)e^{\beta' X})^2} \tag{3.22}
\]
among the survivors. Consequently, the variance of frailty also declines over
time, so the study population becomes more homogeneous in absolute terms.
However, the coefficient of variation stays constant over time, so the population
does not become more homogeneous relative to the mean.
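Relations (3.19) to (3.22) all follow from the mean and variance of a gamma distribution with the shape and second parameter given above. The following Python sketch, with hypothetical parameter values, computes these moments and shows that the coefficient of variation among survivors equals $\sigma$ at every time point.

```python
import math

def frailty_moments(sigma2, cum_hazard, died):
    """Mean and variance of gamma frailty among deaths at t or survivors of t.

    cum_hazard stands for M0(t) * exp(beta'X); died=True selects the deaths.
    """
    shape = 1.0 / sigma2 + (1.0 if died else 0.0)
    rate = 1.0 / sigma2 + cum_hazard
    return shape / rate, shape / rate ** 2

sigma2 = 0.5                      # hypothetical frailty variance
cv_survivors = []
for H in (0.0, 1.0, 4.0):         # hypothetical cumulative hazards
    mean_s, var_s = frailty_moments(sigma2, H, died=False)
    cv_survivors.append(math.sqrt(var_s) / mean_s)
mean_d, var_d = frailty_moments(sigma2, 1.0, died=True)
mean_s, var_s = frailty_moments(sigma2, 1.0, died=False)
```

At the same time point the mean frailty among the deaths exceeds that among the survivors by the factor $1 + \sigma^2$.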
The univariate gamma frailty model (without covariates) was introduced by
Vaupel et al. (1979). In an earlier paper, Beard (1959) used a two-parameter
gamma frailty distribution and a Gompertz–Makeham baseline hazard.
Example 3.3
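Let $\mu_0(t) = \lambda e^{\varphi t}$ be a Gompertz baseline hazard and frailty follows a gamma
distribution $Z \sim \Gamma(1/\sigma^2, 1/\sigma^2)$. Hence, the unconditional hazard and survival
function are given by the expressions
\begin{align*}
\mu(t) &= \frac{\lambda e^{\varphi t}}{1 + \sigma^2 \frac{\lambda}{\varphi} (e^{\varphi t} - 1)} \\
S(t) &= \left(1 + \sigma^2 \frac{\lambda}{\varphi} (e^{\varphi t} - 1)\right)^{-1/\sigma^2}
\end{align*}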
Figures 3.5 and 3.6 deal with an important topic related to univariate frailty
models, the so-called crossing-over effects in the mortality hazards of different
populations. Figure 3.5 shows the logarithms of two proportional baseline
hazards, where the second population has a higher mortality. Assuming a
higher degree of heterogeneity in this population (the variance of the gamma-
distributed frailty is larger in the second population) results in the crossing-
over effect because of selection shown in Figure 3.6. Higher heterogeneity
implies a higher selection pressure in the second population.
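The crossing-over effect can be reproduced with the population hazard from Example 3.3. In the Python sketch below, all parameter values are hypothetical (they are not the values behind the figures): the two baseline hazards are proportional with ratio 2, and the frailty variance is larger in the second population. A bisection search locates the time at which the two population hazards cross.

```python
import math

def gompertz_gamma_hazard(t, lam, phi, sigma2):
    """Population hazard for a Gompertz baseline with gamma frailty (Example 3.3)."""
    growth = math.exp(phi * t)
    return lam * growth / (1.0 + sigma2 * lam / phi * (growth - 1.0))

# Hypothetical parameters: proportional baselines, more heterogeneity in population 2.
pop1 = dict(lam=1e-4, phi=0.1, sigma2=0.1)
pop2 = dict(lam=2e-4, phi=0.1, sigma2=1.0)

def hazard_gap(t):
    """Population hazard of population 2 minus that of population 1."""
    return gompertz_gamma_hazard(t, **pop2) - gompertz_gamma_hazard(t, **pop1)

# Bisection: the gap is positive at t = 0 and negative at t = 100.
lower, upper = 0.0, 100.0
for _ in range(60):
    middle = 0.5 * (lower + upper)
    if hazard_gap(middle) > 0.0:
        lower = middle
    else:
        upper = middle
crossing_time = 0.5 * (lower + upper)
```

Before the crossing time the second population has the higher observed hazard. Afterwards the order is reversed, although the conditional hazard of every individual in the second population remains twice as high for the same frailty value.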
[Figure: log(hazard) versus time (0 to 80) for population 1 and population 2.]
[Figure: log(hazard) versus time (0 to 80) for population 1 and population 2.]
with $M_0(t) = \int_0^t \mu_0(s)\, ds$. Clearly, the average frailty value is a decreasing
function of time. The decrease is faster in situations with larger heterogeneity
\[
\frac{\mu(t|X = 1)}{\mu(t|X = 0)} = \frac{1 + \sigma^2 M_0(t)}{1 + \sigma^2 M_0(t)e^{\beta}}\, e^{\beta}. \tag{3.23}
\]
A critical point here is that, under the proportional hazards assumption for
the conditional model, the ratio of the treatment-specific population hazards
is not generally time-invariant; unless $\sigma^2$ or $\beta$ is zero, the population hazard
ratio (3.23) is a decreasing function of time for $\beta > 0$. Only at time zero is the
population hazard ratio equal to the conditional hazard ratio $e^{\beta}$. As t increases,
the ratio tends to one. Under these conditions, the time-invariant hazard ratio obtained by
applying a simple population model is attenuated from the conditional hazard
ratio $e^{\beta}$. The attenuation is increased in situations with larger values of $\sigma^2$, $\beta$
and $M_0(t)$, representing the three factors that accelerate frailty selection.
Conversely, the attenuation is modest if any of these factors is near zero, in
which case the individual- and population-averaged parameters are relatively
close to each other. Formal expressions and approximations were obtained by
Henderson and Oman (1999) who consider the effects of fitting a population
model to conditional covariate effects. Such selection effects occur in all
univariate frailty models; they are not restricted to the gamma frailty model.
The advantage of the gamma frailty model is that, for all important quantities,
explicit formulas can be derived in an easy way.
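The attenuation described by (3.23) is easy to tabulate. The Python sketch below evaluates the population hazard ratio for a hypothetical conditional hazard ratio of 2 and frailty variance 1 at several values of the cumulative baseline hazard.

```python
import math

def population_hazard_ratio(M0_t, beta, sigma2):
    """Population hazard ratio (3.23) for a binary covariate under gamma frailty."""
    return (1.0 + sigma2 * M0_t) / (1.0 + sigma2 * M0_t * math.exp(beta)) * math.exp(beta)

beta, sigma2 = math.log(2.0), 1.0   # hypothetical: conditional hazard ratio 2
# Ratio at increasing values of the cumulative baseline hazard M0(t).
ratios = [population_hazard_ratio(M, beta, sigma2) for M in (0.0, 0.5, 1.0, 5.0, 50.0)]
```

The ratio starts at the conditional value $e^{\beta}$ and approaches one as $M_0(t)$ grows. Setting sigma2 to zero returns $e^{\beta}$ for every value of $M_0(t)$.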
Example 3.4
Assume a gamma-distributed frailty model with constant baseline hazard
function µ0 (t) = λ (exponential distribution). Consequently, using (3.17),
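\begin{align*}
S(t) &= (1 + \sigma^2 M_0(t))^{-1/\sigma^2} \\
&= (1 + \sigma^2 \lambda t)^{-1/\sigma^2} \\
&= \left(1 + \frac{1}{\zeta} \lambda t\right)^{-\zeta} \\
&= \left(1 + \frac{1}{\omega} t\right)^{-\zeta} \\
&= \left(\frac{\omega}{\omega + t}\right)^{\zeta},
\end{align*}
with $\zeta := 1/\sigma^2$ and $\omega := \zeta/\lambda$, which is a Pareto distribution with parameters ω
and ζ. Clayton and Cuzick (1985a) considered this model.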
Example 3.5
In Table 3.3, results are given for the parametric gamma frailty model with
Weibull baseline hazard function analyzing the Halluca data. For comparison,
the results of the parametric proportional hazards model with Weibull baseline
hazard function are given as well. The analysis was performed using PROC
to the Weibull proportional hazards model based on the value of the log-
likelihood because of the additional parameter σ 2 . The Weibull proportional
hazards model is nested in the parametric gamma frailty model with Weibull
baseline hazard. Consequently, the models can be compared by the likelihood
ratio test, which yields a test statistic $\chi^2 = -2(-4858.672 + 4853.032) = 11.28$.
Because the parameter value under the null hypothesis $\sigma^2 = 0$ lies
on the boundary of the parameter space, standard methods cannot be applied.
It turns out that the approximation of the likelihood ratio test statistic by
a $\chi^2$-distribution with one degree of freedom is too conservative (Self and
Liang, 1987). Rather, a 50:50 mixture of a $\chi^2_0$ and a $\chi^2_1$ distribution,
$\frac{1}{2}\chi^2_0 + \frac{1}{2}\chi^2_1$, should be used (Claeskens et al. 2008). In the present example, this results in
a highly significant p-value. Hence, the additional parameter in the frailty
model causes a significant improvement in the fit to the data. This agrees
with the small standard error of the estimate of the frailty variance, which also
supports the hypothesis $\sigma^2 > 0$. In the frailty model – similar to the Weibull
proportional hazards model – an increasing risk of death is seen for increasing
disease stage. However, the interpretation of the hazard ratios in the frailty
model is slightly different from the interpretation in the Weibull proportional
hazards model. For example, the hazard of patients with disease stage
IV is $e^{1.562} = 4.77$ times that of patients in risk group I given the
same value of frailty. Patients in group I serve as reference group. Hazard
ratios have to be interpreted conditional on the frailty. In the proportional
hazards model with Weibull baseline, the analogous hazard ratio for stage IV
is $e^{1.358} = 3.89$, meaning an increase in risk by this factor comparing patients
with disease stage IV and I with similar values of all other risk factors in the
model. Consequently, in the Weibull proportional hazards model (as well as
in the semiparametric Cox model), comparisons are made based on (more or
less hypothetical) groups where all observed covariate values are similar. In
the frailty model, comparisons are made based on groups where all observed
and unobserved covariate values are assumed to be similar.
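The p-value of the likelihood ratio test in this example can be computed with the mixture distribution mentioned above. The Python sketch below is an illustration using only the two log-likelihood values quoted in the text; the function names are introduced here. Since the $\chi^2_0$ component is a point mass at zero, the mixture p-value for a positive test statistic is half the tail probability of the $\chi^2_1$ distribution, which can be written with the complementary error function.

```python
import math

def chi2_1_sf(x):
    """Upper tail probability of a chi-square distribution with one degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))

def boundary_lrt_pvalue(loglik_null, loglik_alt):
    """Test statistic and p-value under the 50:50 mixture of chi2_0 and chi2_1."""
    stat = -2.0 * (loglik_null - loglik_alt)
    if stat <= 0.0:
        return stat, 1.0
    return stat, 0.5 * chi2_1_sf(stat)

# Log-likelihoods of the Weibull PH model and the gamma frailty model from the text.
stat, p_mixture = boundary_lrt_pvalue(-4858.672, -4853.032)
p_naive = chi2_1_sf(stat)   # p-value if the boundary problem were ignored
```

The mixture p-value is exactly half of the naive one, which is why ignoring the boundary problem makes the test too conservative.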
The obtained estimates of the model parameters depend on the parametric
assumption about the Weibull baseline hazard. Furthermore, in the frailty
model, the parameter estimates depend additionally on the assumption of
gamma distribution of frailty. In the following sections the analysis is repeated
with different assumptions about frailty distribution. Typically, an absolute
increase in regression parameter estimates and their standard errors can be
observed. One exception here is variable type.
Different parametric gamma frailty models are applied to the Halluca data.
The baseline hazard function is assumed to be exponential, Gompertz and
Weibull distributed. The results are given in Table 3.4. It is easy to see that
the results depend on the parametric form of the baseline hazard function.
Nevertheless, in all cases, the estimated frailty variance is different from zero
indicating the presence of unobserved heterogeneity. Careful considerations
are necessary because the null hypothesis σ 2 = 0 is again on the boundary
of the parameter space. For further details about tests for heterogeneity
Table 3.4: Parametric gamma frailty models for Halluca data
parameter Weibull exponential Gompertz
in the proportional hazards models see Section 7.5. The higher flexibility
of the Gompertz and Weibull baseline hazard does not imply a significant
improvement based on the likelihood function. If the models are not nested,
then the Akaike information criterion (AIC) can be used. Consequently, the
exponential model should be preferred among the three parametric models
presented in Table 3.4.
This links to a general problem in the field of frailty modeling. On the
one hand, parametric estimation procedures (parametric specification of the
baseline hazard) are much easier to implement compared to semiparametric
models. Furthermore, standard errors of the frailty variance estimate are easy
to obtain. On the other hand, in many practical applications the investigator
has only limited information about the form of the baseline hazard function,
which makes parametric analysis difficult. Estimations in parametric frailty
models can be performed with an SAS macro described by Liu and Yu (2008).
This approach requires a closed-form expression of the frailty density, which is,
for example, available for the gamma distribution. Alternatively, in the gamma
case, the integrals in the conditional likelihood function can be integrated
out explicitly without using the approximative method of Liu and Yu (2008).
Here the latter method was used.
Example 3.6
In Table 3.5 the results are given for the parametric gamma frailty model
with exponential baseline hazard analyzing the EBCT data. For comparison,
the results of the parametric proportional hazards model with exponential
baseline hazard function are given as well. The analysis was performed using
PROC NLMIXED in SAS by specifying the likelihood function given in (3.25).
The gamma frailty model with exponential baseline hazard provides a better
Table 3.5: Parametric gamma frailty model for the
EBCT data
parameter gamma frailty exponential PH
In Table 3.6 parametric gamma frailty models with Weibull, exponential, and
Gompertz baseline hazard are compared.
Table 3.6: Parametric gamma frailty models for the EBCT data
parameter Weibull exponential Gompertz
The gamma frailty model with
Weibull baseline hazard function shows the best fit to the data based on the
log-likelihood value. The variance of the frailty is significantly larger than
zero in this model, whereas in the models with exponential and Gompertz
baseline no significant unobserved heterogeneity can be detected. This makes
the main drawback of parametric frailty models obvious. The interpretation of
the results (here especially the question of whether unobserved heterogeneity is
present) depends on the choice of the baseline hazard. Usually, the researcher
has no additional prior information about the form of the baseline distribution.
Consequently, the decision for the best-fitting model should be based on the
likelihood ratio test if the models are nested. Here the nonstandard test
situation for testing the hypothesis $\sigma^2 > 0$ versus $\sigma^2 = 0$ has to be taken into
account. If the models are not nested, then the Akaike information criterion
(AIC) should be used.
\[
\hat{Z}_i = \frac{1/\hat{\sigma}^2 + \delta_i}{1/\hat{\sigma}^2 + M_0(t_i; \hat{\theta})\, e^{\hat{\beta}' X_i}}. \tag{3.26}
\]
Here σ̂ 2 denotes the estimate of the frailty variance, θ̂ the vector of parameter
estimates of the cumulative baseline hazard, and β̂ the vector of estimated
regression coefficients. A parametric gamma frailty model with exponential
baseline hazard is considered, for example, by Tomazella et al. (2008).
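Once the model parameters have been estimated, (3.26) gives an individual frailty estimate in closed form. The following Python sketch is a direct transcription of the formula; the input values are hypothetical, and the argument cum_hazard_hat stands for the estimated quantity $M_0(t_i; \hat{\theta})e^{\hat{\beta}' X_i}$.

```python
def frailty_estimate(sigma2_hat, delta, cum_hazard_hat):
    """Individual frailty estimate (3.26) in the gamma frailty model."""
    return (1.0 / sigma2_hat + delta) / (1.0 / sigma2_hat + cum_hazard_hat)

# Hypothetical values: frailty variance 0.5, estimated cumulative hazard 0.4.
z_event = frailty_estimate(0.5, 1, 0.4)      # individual with an observed event
z_censored = frailty_estimate(0.5, 0, 0.4)   # individual censored at the same time
```

An individual with an early event (small cumulative hazard at the event time) receives an estimate above one, whereas a censored individual always receives an estimate of at most one.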
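\begin{align*}
L(\beta, \sigma^2|Z) &= \prod_{i=1}^{n} f(t_i, \delta_i, Z_i; \beta, \sigma^2) \\
&= \prod_{i=1}^{n} f(t_i, \delta_i, Z_i; \beta) \prod_{i=1}^{n} f(Z_i; \sigma^2) \\
&= L_1(\beta|Z)\, L_2(\sigma^2|Z) \tag{3.27}
\end{align*}
\[
L_1(\beta|Z) = \prod_{i=1}^{n} \left(Z_i \mu_0(t_i) e^{\beta' X_i}\right)^{\delta_i} e^{-Z_i M_0(t_i) e^{\beta' X_i}} \tag{3.28}
\]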
from relation (3.25), which is the likelihood function of the observed event
times conditional on the frailties.
The second term is given by the probability density of the frailty variables:
n
Y
L2 (σ 2 |Z) = f (Zi ; σ 2 ).
i=1
The unknown random variables Zi and log(Zi ) are now substituted by their
current expected values (at iteration step k) E(k) (Zi ) and E(k) (log Zi )
n
X h X i
′
log L(β, σ 2 ) = δi β ′ Xi + E(k) (log(Zi )) − log E(k) (Zj )eβ Xj .
i=1 j∈R(ti )
From this expression, new estimates $\beta_{(k)}$ can be obtained using standard
software. A new estimate of the frailty parameter $\sigma^2_{(k)}$ is derived by maximization
of $L_2(\sigma^2 \mid Z)$, also replacing the unknown variables $Z_i$ by their current expected
values at iteration step k.
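The expectation step just described can be sketched in Python. The sketch assumes the standard result that, in the gamma frailty model, the conditional distribution of $Z_i$ given the data is a gamma distribution with shape $1/\sigma^2 + \delta_i$ and rate $1/\sigma^2 + M_0(t_i)e^{\beta' X_i}$; its mean has the same form as (3.26), and the expression for the expected logarithm follows from that assumption. The function names, the input values, and the numerical digamma approximation are illustrative choices only.

```python
import math

def digamma(x, h=1e-5):
    """Numerical digamma: central difference of log-gamma (illustrative only)."""
    return (math.lgamma(x + h) - math.lgamma(x - h)) / (2.0 * h)

def gamma_frailty_e_step(delta, cum_hazard, sigma2):
    """Return lists (E[Z_i], E[log Z_i]) at the current parameter values.

    delta[i]      : event indicator (1 = event, 0 = censored)
    cum_hazard[i] : M0(t_i) * exp(beta'X_i) at the current estimates
    sigma2        : current value of the frailty variance
    """
    e_z, e_log_z = [], []
    for d, m in zip(delta, cum_hazard):
        shape = 1.0 / sigma2 + d   # numerator of the ratio in (3.26)
        rate = 1.0 / sigma2 + m    # denominator of the ratio in (3.26)
        e_z.append(shape / rate)
        e_log_z.append(digamma(shape) - math.log(rate))
    return e_z, e_log_z

# Hypothetical inputs for three individuals.
e_z, e_log_z = gamma_frailty_e_step([1, 0, 1], [0.4, 1.2, 2.5], sigma2=0.5)
```

The two returned lists are the quantities that replace $Z_i$ and $\log(Z_i)$ in the expression for $\log L(\beta, \sigma^2)$ before the next maximization step.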
Example 3.7
To account for heterogeneity within the failure times, a semiparametric shared
gamma frailty model was applied to the EBCT data. Results are given in
Table 3.7. The typical inflation of the effect of the covariates is seen, combined
with larger standard errors. One exception is the covariate risk group II, whose
estimate decreases from 0.23 (0.92) in the Cox model to 0.19 (0.96) in the gamma
frailty model. The variance of the frailty is large but not significantly different
from zero when taking into account the size of the standard error. The simple
Cox model seems to be appropriate to model the data compared to the more
complex gamma frailty model. The SAS macro SPGAM written by Hien Vu
(Vu and Knuiman 2002) was used for analysis. It is based on a modified
version of the EM algorithm. The above-mentioned correction by Barker and
Henderson (2005) is not included in this macro.
\[
\mu(t \mid X, Z) = Z\mu_0(t)e^{\beta' X} = \mu_0(t)e^{\beta' X + \ln(Z)} = \mu_0(t)e^{\beta' X + W}. \qquad (3.31)
\]
\[
\log L_{ppl}(\beta, \sigma^2 \mid W) = \log L_{part}(\beta \mid W) + \log L_{pen}(\sigma^2 \mid W) \qquad (3.33)
\]
with
\[
L_{pen}(\sigma^2 \mid W) = \prod_{i=1}^{n} f(W_i; \sigma^2)
\]
$\log L_{ppl}(\beta, \sigma^2 \mid W)$ based on a provisional value of $\sigma^2$. In the outer loop, $\sigma^2$
is obtained by maximizing a profiled version of the marginal likelihood.
This version is obtained as follows: For a specific value of σ 2 , estimates of
β and W are obtained by maximizing the penalized partial likelihood given
σ 2 . Based on these estimates, other estimates for the cumulative hazard
function can be found using the Nelson–Aalen estimator similar to the EM
algorithm. As in the EM algorithm, this profile marginal likelihood for σ 2 is
used to find an estimate for the frailty variance. The procedure is iterated
until convergence. More details about the EM algorithm and the penalized
partial likelihood approach in the semiparametric gamma frailty can be found
in Duchateau and Janssen (2008). The interested reader is also referred to
this book with respect to a detailed derivation of the fact that the estimates
in both approaches coincide in the gamma frailty model. This is not true in
general for other frailty distributions. In the following example the penalized
partial likelihood approach is applied.
Example 3.8
We reanalyze the Halluca lung cancer data. A semiparametric gamma
frailty model is applied in Table 3.8. Results of the Cox model are given
for comparisons. As expected, the effect of the covariates is smaller (the
only exception is covariate type) in the Cox model when not corrected for
unobserved heterogeneity in the study population. The gamma frailty model
is able to account (at least in part) for this unobserved heterogeneity. Note
that the standard errors also increase in the gamma frailty model. Estimation
is based on the penalized partial likelihood approach and performed with the
R function coxph from the survival package. This procedure (adapted from the
S-Plus macro by Terry Therneau) provides no standard error of the frailty
variance estimate.
where log L(β, σ 2 ) is the full log-likelihood in the gamma frailty model and
κ is a smoothing parameter. The integral represents a trade-off between the
faithfulness to the data, as represented by the term log L, and smoothness
of the solution, as represented by the squared norm of the second derivative
of the baseline hazard in the penalty term. For large κ, the integral will be
forced toward zero, and the baseline hazard will approach a linear function. If
κ is small, then the main contribution to log Lppl will be log L and the curve
estimate will track the data closely, but will be more irregular.
Other approaches such as the Markov Chain Monte Carlo (MCMC) method
have also been suggested for estimation in the semiparametric gamma frailty
model. The new PROC MCMC routine in SAS offers a flexible alternative
to the traditional WinBUGS software often used for MCMC analyses.
Another choice for estimation in frailty models is Gaussian quadrature. It
approximates the integral of a parametric function with respect to a frailty
density by a weighted sum over predefined abscissas for the frailties. However,
all current implementations make a parametric choice of the baseline hazard
function (especially exponential, piecewise constant), see Yamaguchi et al.
(2002), Littell et al. (2006), Nelson et al. (2006), Liu and Huang (2008),
and Liu and Yu (2008). Nevertheless, a piecewise constant hazard function with
an increasing number of pieces can be used to approximate semiparametric models.
The SAS routine PROC NLMIXED provides a powerful tool for adaptive as
well as nonadaptive Gaussian quadrature.
Plugging in the survival function of the gamma frailty model with Gompertz
baseline from Example 3.3 and θ = (λ, ϕ), the likelihood function becomes
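\[
L(\lambda, \varphi, \sigma^2) = \prod_{i=1}^{n} \Big( 1 - \big(1 + \sigma^2 \tfrac{\lambda}{\varphi}(e^{\varphi t_i} - 1)\big)^{-\frac{1}{\sigma^2}} \Big)^{\delta_i} \big(1 + \sigma^2 \tfrac{\lambda}{\varphi}(e^{\varphi t_i} - 1)\big)^{-\frac{1-\delta_i}{\sigma^2}}.
\]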
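As a minimal sketch, the current status likelihood above can be evaluated on the log scale as follows. It assumes that $\delta_i = 1$ marks an individual already infected at age $t_i$ and that all ages are strictly positive; the function names and the data are hypothetical.

```python
import math

def gompertz_gamma_survival(t, lam, phi, sigma2):
    """Unconditional survival with Gompertz baseline and gamma frailty."""
    m0 = lam / phi * (math.exp(phi * t) - 1.0)   # cumulative Gompertz hazard
    return (1.0 + sigma2 * m0) ** (-1.0 / sigma2)

def current_status_loglik(ages, infected, lam, phi, sigma2):
    """Log-likelihood for current status data (ages must be > 0)."""
    ll = 0.0
    for t, d in zip(ages, infected):
        s = gompertz_gamma_survival(t, lam, phi, sigma2)
        ll += d * math.log(1.0 - s) + (1 - d) * math.log(s)
    return ll

# Hypothetical ages at testing and infection indicators.
ages, infected = [10.0, 30.0, 50.0], [0, 1, 1]
ll = current_status_loglik(ages, infected, lam=0.01, phi=0.02, sigma2=0.5)
```

Passing the negative of this function to a general-purpose optimizer would give maximum likelihood estimates; when the likelihood is flat in $\sigma^2$, as reported for hepatitis B in the example that follows, such an optimizer may converge slowly.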
Example 3.9
The hepatitis A and B current status data from Example 1.7 are analyzed
in detail. A univariate parametric gamma frailty model with the Gompertz
baseline hazard function as described earlier is fitted to the hepatitis data.
Age-dependent prevalences of hepatitis A and B are shown in Figures 3.7 and
3.8, respectively. Results of the analysis are given in Table 3.9. The univariate
frailty model treats time to infection by hepatitis A and B, respectively, as
independent times for each individual. It turns out that the estimation of the
frailty variance regarding hepatitis B is difficult, because the likelihood is very
flat with respect to σ 2 . This is also indicated by a large standard error of σ 2 .
A reason for this variability is the smaller prevalence of hepatitis B compared
to hepatitis A.
[Figure 3.7: prevalence of hepatitis A plotted against age (0 to 80 years).]
Figure 3.8: Age-specific prevalence of hepatitis B in Flanders 1994/95. The
circles indicate the size of the respective age group on which the prevalence
estimates are based.
\[
M_{02}(t) = M_{01}(t) + \frac{\lambda}{\varphi}(e^{\varphi t} - 1),
\]
where M02 (t) denotes the cumulative baseline hazard of the treatment group.
Frailty in the treatment group is now gamma distributed with mean r and
variance γσ 2 . Hence, changes in the frailty distribution caused by treatment
are given by parameters r and γ. Parameter r < 1 shows an increase in
the average robustness (decrease in mean frailty), while r > 1 indicates an
accumulation of frail individuals in the treatment group. A parameter value
$\gamma \neq 1$ signals an increase ($\gamma > 1$) or decrease ($\gamma < 1$) in the heterogeneity
caused by the treatment. In the above expression, $\sigma^2 M_{01}(t)$ can be substituted by
$S_1(t)^{-\sigma^2} - 1$, resulting in
\[
S_2(t) = \Big( 1 + r\gamma\big(S_1(t)^{-\sigma^2} - 1\big) + r\gamma\,\frac{\lambda}{\varphi}(e^{\varphi t} - 1) \Big)^{-\frac{1}{\gamma\sigma^2}}.
\]
The nonparametric Kaplan–Meier estimator can now be used to estimate
the unknown survival function $S_1(t)$ of the control group. Consequently, the
suggested model is semiparametric in this sense.
Another extension of the gamma frailty model was recently introduced by
Barker and Henderson (2004). The authors claim that it is likely that only a
part of the covariates used in a Cox regression fulfill the proportional hazards
assumption. To circumvent this problem they introduce a mixed model that
allows for both proportional and converging hazards. Two kinds of covariates
are considered, $X_1$ and $X_2$. The usual gamma frailty model with covariates
$X_2$ has a hazard of the form $\mu(t) = Z\mu_0(t)e^{\beta_2' X_2}$, but the frailty distribution
is assumed to depend on covariates $X_1$, namely, $Z \sim \Gamma\big(\frac{e^{\beta_1' X_1}}{\sigma^2}, \frac{1}{\sigma^2}\big)$. Then the
unconditional survival function is obtained as
\[
\begin{aligned}
S(t \mid X_1, X_2) &= \int_0^\infty e^{-z M_0(t) e^{\beta_2' X_2}} f(z \mid X_1)\, dz \\
&= \big(1 + \sigma^2 M_0(t) e^{\beta_2' X_2}\big)^{-\frac{e^{\beta_1' X_1}}{\sigma^2}}. \qquad (3.35)
\end{aligned}
\]
When the frailty variance σ 2 is zero, the model reduces to the standard Cox
proportional hazards model with covariates {X1 , X2 }. If β 2 = 0, the model
reduces to the proportional hazards model with covariates X1 , whereas if
β 1 = 0 holds, it is clear from equation (3.35) that the suggested mixed
model reduces to the univariate gamma frailty model with covariates X2 .
The baseline hazard function in this model is left unspecified.
In the following, a nontypical extension of the univariate gamma frailty
model is presented to demonstrate the strengths of this approach. Consider a
bivariate outcome (T, L) in the following analysis, where the first component
T is an event time, but the second one L is a nonnegative integer number.
For example, T could be the lifetime of a woman, and L the number of her
children. The aim of such an approach is to analyze both outcomes in parallel,
assuming a dependence between lifetime and number of children caused by
unobserved underlying factors influencing lifetime as well as fecundability.
The difference to a Cox regression model with number of children as covariate
is the causal relationship. If number of children is modeled as an independent
variable, and lifetime as a dependent variable, a causal relationship between
number of children and lifetime is assumed. The model presented in this
paragraph assumes an association between both variables, but no assumption
about the direction of a causal relationship is made. Let the expression
\[
S(t \mid Z) = e^{-Z M_0(t)}
\]
denote the conditional survival function of lifetime T , conditional on the
frailty (random effect) Z. Furthermore, let p > 0, and let the integer number
L be conditionally Poisson distributed, given the random effect Z,
with parameter pZ:
\[
P(L = l \mid Z) = \frac{(pZ)^l}{l!}\, e^{-pZ}.
\]
The dependence between T and L is based on the unobserved random effect
Z. Let Z be gamma distributed with expectation one and variance σ 2 . Then
the model can be derived in the following way:
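\[
\begin{aligned}
P(T > t, L = l) &= E\,P(T > t, L = l \mid Z) \\
&= E\,S(t \mid Z)\,P(L = l \mid Z) \\
&= E\, e^{-Z M_0(t)}\, \frac{(pZ)^l}{l!}\, e^{-pZ} \\
&= \frac{1}{\Gamma(k)\, l!} \int_0^\infty e^{-z M_0(t)}\, e^{-pz} (pz)^l \lambda^k z^{k-1} e^{-\lambda z}\, dz.
\end{aligned}
\]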
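Consequently,
\[
P(T > t, L = l) = \binom{l+k-1}{k-1} \Big(\frac{\lambda}{\lambda + M_0(t) + p}\Big)^{k} \Big(\frac{p}{\lambda + M_0(t) + p}\Big)^{l}.
\]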
Now we consider the two marginal distributions in more detail. The event
time distribution is obtained by
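\[
\begin{aligned}
S(t) &= \sum_{l=0}^{\infty} P(T > t, L = l) \\
&= \sum_{l=0}^{\infty} \binom{l+k-1}{k-1} \Big(\frac{\lambda}{\lambda + M_0(t) + p}\Big)^{k} \Big(\frac{p}{\lambda + M_0(t) + p}\Big)^{l} \\
&= \Big(1 + \frac{1}{\lambda} M_0(t)\Big)^{-k} \sum_{l=0}^{\infty} \binom{l+k-1}{k-1} \Big(\frac{\lambda + M_0(t)}{\lambda + M_0(t) + p}\Big)^{k} \Big(\frac{p}{\lambda + M_0(t) + p}\Big)^{l} \\
&= \Big(1 + \frac{1}{\lambda} M_0(t)\Big)^{-k},
\end{aligned}
\]
which results because the infinite sum is the probability function of a negative
binomial distribution with parameters k and $\frac{\lambda + M_0(t)}{\lambda + M_0(t) + p}$. If now the restriction
$k = \lambda = \frac{1}{\sigma^2}$ is used, we obtain the unconditional survival function with gamma
frailty in (3.17).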
The count data distribution can be derived from the joint distribution by
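\[
P(L = l) = P(T > 0, L = l) = \binom{l+k-1}{k-1} \Big(\frac{\lambda}{\lambda + p}\Big)^{k} \Big(\frac{p}{\lambda + p}\Big)^{l},
\]
which is a negative binomial distribution with parameters k and $\frac{\lambda}{\lambda + p}$. Using
again the parameterization $k = \lambda = \frac{1}{\sigma^2}$ results in
\[
P(L = l) = \binom{l + \frac{1}{\sigma^2} - 1}{\frac{1}{\sigma^2} - 1} \Big(\frac{1}{1 + p\sigma^2}\Big)^{\frac{1}{\sigma^2}} \Big(\frac{p\sigma^2}{1 + p\sigma^2}\Big)^{l},
\]
a negative binomial distribution with parameters $\frac{1}{\sigma^2}$ and $\frac{1}{1 + p\sigma^2}$, implying
expectation p and variance $p(1 + p\sigma^2)$. The parameters p and $\sigma^2$ give the model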
enough flexibility to fit the marginal distribution of the number of children
in the population. Furthermore, σ 2 can be interpreted in some sense as an
association parameter. Large values speak in favor of a strong association
between lifetime and number of children, whereas small values do not.
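The two marginal distributions can be checked numerically from the joint probability function. The following Python sketch uses the restriction $k = \lambda = 1/\sigma^2$; the parameter values, the truncation of the infinite sum at 400 terms, and the function name are assumptions made for illustration.

```python
import math

def joint_prob(l, m0, p, sigma2):
    """P(T > t, L = l) with k = lambda = 1/sigma2 and m0 = M0(t)."""
    k = 1.0 / sigma2
    denom = k + m0 + p
    # Generalized binomial coefficient via log-gamma, valid for non-integer k.
    log_coef = math.lgamma(l + k) - math.lgamma(k) - math.lgamma(l + 1)
    return math.exp(log_coef + k * math.log(k / denom) + l * math.log(p / denom))

p, sigma2, m0 = 2.0, 0.5, 0.8   # hypothetical values

# Marginal distribution of the count L: set M0(t) = 0, i.e., t = 0.
probs = [joint_prob(l, 0.0, p, sigma2) for l in range(400)]
mean = sum(l * q for l, q in enumerate(probs))
var = sum(l * l * q for l, q in enumerate(probs)) - mean ** 2

# Marginal survival function of T: sum the joint probabilities over l.
surv = sum(joint_prob(l, m0, p, sigma2) for l in range(400))
```

Here `mean` and `var` can be compared with $p$ and $p(1 + p\sigma^2)$, and `surv` with $(1 + \sigma^2 M_0(t))^{-1/\sigma^2}$.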
The negative binomial distribution has been widely used to model count data.
One of the first applications was by Greenwood and Yule (1920), who
modeled the number of accidents of women working in munition shell production
during World War I in England. Up to this time the Poisson distribution was
considered as a universal count data distribution based on the book by von
Bortkiewicz (1898). Greenwood and his colleagues observed discrepancies
between the observed and expected numbers of accidents and suggested an
extension of the Poisson distribution to the negative binomial distribution
by assuming the Poisson parameter to be a random mixture variable that
follows a gamma distribution. Consequently, heterogeneity was assumed in
the population. This was the starting point for a series of accident studies
based on the theory of accident proneness. The idea was that it should be
possible to reduce the number of accidents by identifying the most accident-prone
individuals with the help of such heterogeneity models.
There are many applications of the univariate gamma frailty model in the
literature. The model was first used by Beard (1959). Lancaster
(1979) suggested this model for the duration of unemployment spells, and
Vaupel et al. (1979) used it to calculate individual hazard functions and to
adjust life tables in the case of heterogeneous populations. Manton et al.
(1981) compared the mortality experience of heterogeneous populations, and
Manton and Stallard (1981) explained the black/white mortality crossing-over
effects observed in the United States. Manton et al. (1986) compared the
inverse normal and the gamma models, together with Gompertz and Weibull
baseline hazards, in a study of survival at advanced ages, based on data from
U.S. Medicare insurance. Aalen (1987) studied the expulsion of intrauterine
contraceptive devices. Ellermann et al. (1992) used a parametric model to
analyze recidivism among criminals. Andersen et al. (1993) applied the model
to the malignant melanoma data from Example 1.2. Jones (1998) used a
gamma–Gompertz model for analyzing the impact of selective lapsation on
mortality in life insurance. Jeong et al. (2003) used a gamma frailty to model
long-term follow-up survival data from breast cancer clinical trials when the
treatment effect diminishes over time as an alternative to the proportional
hazards model. Solid cancer incidence data from atomic bomb survivors in
Japan were used by Izumi and Ohtaki (2004) to check two different hypotheses
about individual hazard functions. Balakrishnan and Peng (2006) consider an
extension of the gamma frailty model which also includes the log-normal frailty
model considered in the following section. The use of gamma-distributed
frailty in univariate models is supported by the results of Abbring and van
den Berg (2007), who showed that, under some regularity assumptions, frailty
among survivors converges to a gamma distribution even if the original
frailty distribution is not a gamma distribution.
Parametric gamma frailty models are implemented in Stata and in the
SAS macro PGAM by Vu. Furthermore, PROC NLMIXED in SAS can be used
based on Liu and Yu (2008). Semiparametric versions are included in
R/S-Plus (Therneau) and in the SAS macros by Vu (SPGAM) and Klein (GAMFRAIL).
\[
\mu = EZ = e^{m + \frac{s^2}{2}} \qquad (3.36)
\]
\[
\sigma^2 = V(Z) = e^{2m + s^2}\,(e^{s^2} - 1). \qquad (3.37)
\]
\[
m = E \ln(Z) = -\frac{1}{2}\, s^2
\]
\[
s^2 = V(\ln(Z)) = \ln(1 + \sigma^2).
\]
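The relations between the two parameterizations can be sketched in a few lines of Python. The first function evaluates (3.36) and (3.37); the second assumes the standardization EZ = 1 that underlies the last two equations. Function names and numerical values are illustrative.

```python
import math

def lognormal_frailty_moments(m, s2):
    """Mean and variance of Z = exp(W) for W ~ N(m, s2), as in (3.36) and (3.37)."""
    mean = math.exp(m + s2 / 2.0)
    var = math.exp(2.0 * m + s2) * (math.exp(s2) - 1.0)
    return mean, var

def standardized_parameters(sigma2):
    """Parameters (m, s2) of W that give EZ = 1 and V(Z) = sigma2."""
    s2 = math.log(1.0 + sigma2)
    return -0.5 * s2, s2

m, s2 = standardized_parameters(0.46)          # hypothetical frailty variance
mean, var = lognormal_frailty_moments(m, s2)   # recovers mean 1 and variance 0.46
```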
[Figure: density curves for s = 0.5, s = 1.0, and s = 2.0; vertical axis: density.]
\[
L(\beta, \theta, s^2) = \prod_{i=1}^{n} \int \big( \mu_0(t_i; \theta)\, e^{\beta' X_i + w_i} \big)^{\delta_i} e^{-M_0(t_i; \theta)\, e^{\beta' X_i + w_i}}\, d\Phi(w_i) \qquad (3.38)
\]
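The integrals in (3.38) have no closed form, so they are evaluated numerically in practice. A minimal Python sketch using nonadaptive Gauss-Hermite quadrature is given below. It assumes that $\Phi$ is the distribution function of a normal distribution with mean zero and variance $s^2$; the exponential baseline, the data, and all names are hypothetical, and the sketch is not meant to reproduce the NLMIXED implementation.

```python
import numpy as np

def lognormal_frailty_loglik(t, delta, lin_pred, s2, hazard, cum_hazard, n_nodes=40):
    """Log of (3.38), with each integral replaced by a Gauss-Hermite sum.

    t, delta, lin_pred : arrays of event times, event indicators, and beta'X_i
    hazard, cum_hazard : callables returning mu_0(t) and M_0(t) (assumed parametric)
    """
    x, w = np.polynomial.hermite.hermgauss(n_nodes)   # rule for weight exp(-x^2)
    nodes = np.sqrt(2.0 * s2) * x                     # rescale to N(0, s2)
    weights = w / np.sqrt(np.pi)
    eta = lin_pred[:, None] + nodes[None, :]          # beta'X_i + w at every node
    mu0 = hazard(t)[:, None]
    m0 = cum_hazard(t)[:, None]
    integrand = (mu0 * np.exp(eta)) ** delta[:, None] * np.exp(-m0 * np.exp(eta))
    return float(np.sum(np.log(integrand @ weights)))

# Hypothetical data and an exponential baseline with rate 0.1.
t = np.array([2.0, 5.0, 7.5, 11.0])
delta = np.array([1, 0, 1, 1])
lin_pred = np.array([0.3, -0.2, 0.0, 0.5])
rate = 0.1
loglik = lognormal_frailty_loglik(
    t, delta, lin_pred, s2=0.5,
    hazard=lambda u: np.full_like(u, rate),
    cum_hazard=lambda u: rate * u,
)
```

As the random effect variance approaches zero, the quadrature sum collapses to the ordinary parametric log-likelihood without frailty, which gives a simple plausibility check.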
The estimates of the regression parameters are very similar to each other,
indicating robustness of the analysis with respect to the choice of the baseline
hazard. Small differences occur between the estimates of the random effects
variance. The likelihood ratio test indicates that the simple exponential model
is significantly better compared to the more complex Gompertz or Weibull
model. Analysis was performed by SAS procedure NLMIXED. The variance
s2 of the random effect differs significantly from zero in all three models. It
cannot be directly compared with frailty variance σ 2 from the gamma model.
\[
\log L_{ppl}(\beta, s^2 \mid W) = \log L_{part}(\beta \mid W) + \log L_{pen}(s^2 \mid W)
\]
with $W = (W_1, \ldots, W_n)$,
\[
L_{pen}(s^2 \mid W) = \prod_{i=1}^{n} f(W_i; s^2)
\]
and the density of a normal distribution with mean zero and variance $s^2$
\[
f(w; s^2) = \frac{1}{\sqrt{2\pi s^2}}\, e^{-\frac{w^2}{2s^2}}.
\]
This results in a penalty term of the log-likelihood
\[
\log L_{pen}(s^2 \mid W) = -\frac{1}{2} \sum_{i=1}^{n} \Big( \frac{W_i^2}{s^2} + \ln(2\pi s^2) \Big).
\]
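The penalty term is cheap to evaluate, as the following short Python sketch shows; the function name and inputs are illustrative. Large absolute values of the random effects are penalized more heavily when $s^2$ is small.

```python
import math

def log_normal_penalty(w, s2):
    """Penalty term log L_pen(s2 | W) for random effects w with variance s2."""
    return -0.5 * sum(wi * wi / s2 + math.log(2.0 * math.pi * s2) for wi in w)

penalty = log_normal_penalty([0.3, -1.1, 0.8], s2=0.5)   # hypothetical random effects
```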
Example 3.10
The penalized partial likelihood approach is illustrated with an application to
the Halluca data. In Table 3.11 the results of a semiparametric log-normal
frailty model are given with the Cox model for comparison. We see the typical
pattern of stronger covariate effects (estimates of the regression parameters
are further away from zero) with variable type as an exception. Furthermore,
standard errors for the regression coefficients are larger in the frailty model.
Estimation was performed with the R function coxph based on
a penalized partial likelihood approach. This procedure does not provide
a standard error for the frailty variance estimate. Comparing the results
of the log-normal frailty model and the gamma frailty model in Table 3.8,
one has to note that frailty in the log-normal model with a parameterization
based on EW = m = 0 is not standardized to EZ = µ = 1. Despite this
fact the estimates for the regression coefficients are similar in both models.
Further caution is needed when variances in the two models are compared.
The parameter σ 2 describes the variance of the frailty term Z in the gamma
model, whereas s2 denotes the variance of the random effect W = ln(Z) in
the log-normal frailty model. The two quantities cannot be directly compared,
although the estimates are similar here ($\sigma^2 = 0.46$ compared to $s^2 = 0.498$).
In general, by (3.37), the variance of the frailty $Z = e^W$ in a log-normal frailty
model is larger than the variance of the respective random effect W (in the same
model), and the difference becomes more pronounced the larger the frailty
variance is. In situations with reasonably small frailty variance, there is no
big difference between $\sigma^2$ and $s^2$ (Duchateau and Janssen 2008). The results
show robustness regarding the frailty distribution. Similar to the gamma
model, the results from parametric and semiparametric models support the
robustness of parametric frailty analysis regarding the baseline hazard.
\[
f(z) = \frac{\sqrt{\lambda}}{\sqrt{2\pi z^3}} \exp\Big( -\frac{\lambda}{2\mu^2 z}\,(z - \mu)^2 \Big).
\]
The form of the density function for different parameter values is depicted in
Figure 3.10.
[Figure 3.10: density of the inverse Gaussian distribution for different parameter values; vertical axis: density.]
The density given above is the starting point to derive the Laplace transform
of the inverse Gaussian distribution:
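\[
\begin{aligned}
L(u) &= E e^{-uZ} \\
&= \int \frac{\sqrt{\lambda}}{\sqrt{2\pi z^3}}\, e^{-uz} \exp\Big( -\frac{\lambda}{2\mu^2 z}\,(z - \mu)^2 \Big)\, dz \\
&= \int \frac{\sqrt{\lambda}}{\sqrt{2\pi z^3}} \exp\Big( -\frac{(\lambda + 2\mu^2 u) z^2 - 2\mu\lambda z + \lambda\mu^2}{2\mu^2 z} \Big)\, dz \\
&= \int \frac{\sqrt{\lambda}}{\sqrt{2\pi z^3}} \exp\Big( -\frac{z}{2}\,\frac{\lambda + 2\mu^2 u}{\mu^2} + \frac{\lambda}{\mu} - \frac{\lambda}{2z} \Big)\, dz \\
&= \exp\Big( -\frac{\lambda \sqrt{1 + \frac{2\mu^2 u}{\lambda}}}{\mu} + \frac{\lambda}{\mu} \Big) \\
&\quad \times \int \frac{\sqrt{\lambda}}{\sqrt{2\pi z^3}} \exp\Big( -\frac{\lambda z}{2}\,\frac{1 + \frac{2\mu^2 u}{\lambda}}{\mu^2} + \frac{\lambda \sqrt{1 + \frac{2\mu^2 u}{\lambda}}}{\mu} - \frac{\lambda}{2z} \Big)\, dz \qquad (3.39)
\end{aligned}
\]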
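Because of
\[
-\frac{\lambda z}{2}\,\frac{1 + \frac{2\mu^2 u}{\lambda}}{\mu^2} + \frac{\lambda \sqrt{1 + \frac{2\mu^2 u}{\lambda}}}{\mu} - \frac{\lambda}{2z}
= -\frac{\lambda}{2}\,\frac{1 + \frac{2\mu^2 u}{\lambda}}{\mu^2}\,\frac{1}{z} \Big( z - \frac{\mu}{\sqrt{1 + \frac{2\mu^2 u}{\lambda}}} \Big)^2,
\]
the expression
\[
\frac{\sqrt{\lambda}}{\sqrt{2\pi z^3}} \exp\Big( -\frac{\lambda z}{2}\,\frac{1 + \frac{2\mu^2 u}{\lambda}}{\mu^2} + \frac{\lambda \sqrt{1 + \frac{2\mu^2 u}{\lambda}}}{\mu} - \frac{\lambda}{2z} \Big)
= \frac{\sqrt{\lambda}}{\sqrt{2\pi z^3}} \exp\Big( -\frac{\lambda}{2\,\frac{\mu^2}{1 + \frac{2\mu^2 u}{\lambda}}\, z} \Big( z - \frac{\mu}{\sqrt{1 + \frac{2\mu^2 u}{\lambda}}} \Big)^2 \Big)
\]
is the density of an inverse Gaussian distribution with mean $\mu / \sqrt{1 + 2\mu^2 u/\lambda}$ and
therefore integrates to one. Hence, the integral in (3.39) equals one and
\[
L(u) = \exp\Big( -\frac{\lambda \sqrt{1 + \frac{2\mu^2 u}{\lambda}}}{\mu} + \frac{\lambda}{\mu} \Big) = \exp\Big( \frac{\lambda}{\mu} \Big( 1 - \sqrt{1 + \frac{2\mu^2 u}{\lambda}} \Big) \Big). \qquad (3.40)
\]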
The first and second derivatives of the Laplace transform are given by
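\[
L'(u) = -\frac{\mu}{\sqrt{1 + \frac{2\mu^2 u}{\lambda}}} \exp\Big( \frac{\lambda}{\mu} \Big( 1 - \sqrt{1 + \frac{2\mu^2 u}{\lambda}} \Big) \Big)
\]
and
\[
\begin{aligned}
L''(u) &= \frac{\mu^3}{\lambda \big(1 + \frac{2\mu^2 u}{\lambda}\big)^{3/2}} \exp\Big( \frac{\lambda}{\mu} \Big( 1 - \sqrt{1 + \frac{2\mu^2 u}{\lambda}} \Big) \Big) \\
&\quad + \frac{\mu^2}{1 + \frac{2\mu^2 u}{\lambda}} \exp\Big( \frac{\lambda}{\mu} \Big( 1 - \sqrt{1 + \frac{2\mu^2 u}{\lambda}} \Big) \Big).
\end{aligned}
\]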
\[
EZ = -L'(0) = \mu
\]
\[
V(Z) = L''(0) - (L'(0))^2 = \frac{\mu^3}{\lambda}.
\]
Taking $EZ = \mu = 1$ and $V(Z) = \sigma^2 = 1/\lambda$ results in the following simplified
Laplace transform:
\[
L(u) = e^{\frac{1}{\sigma^2}\left(1 - \sqrt{1 + 2\sigma^2 u}\right)}.
\]
Hence, the unconditional survival and hazard function can be written in the
form
\[
S(t) = e^{\frac{1}{\sigma^2}\left(1 - \sqrt{1 + 2\sigma^2 M_0(t)}\right)}
\]
and
\[
\mu(t) = \frac{\mu_0(t)}{(1 + 2\sigma^2 M_0(t))^{1/2}}.
\]
In the following we include regression terms $e^{\beta' X}$ into the model. The aim is
to evaluate the effect of unobserved heterogeneity on the regression parameters
in the inverse Gaussian frailty model. We consider the marginal hazards of
two individuals with only one binary covariate in the model. The hazard
ratio for two individuals with covariate values one (meaning, for example,
experimental treatment) and zero (for example, placebo treatment) is of the
form
\[
\frac{\mu(t \mid X = 1)}{\mu(t \mid X = 0)} = \frac{(1 + 2\sigma^2 M_0(t))^{1/2}}{(1 + 2\sigma^2 M_0(t) e^{\beta})^{1/2}}\, e^{\beta}.
\]
This hazard ratio is $e^{\beta}$ at time point t = 0 and converges towards $e^{\beta/2}$ for
t → ∞. Similar to the gamma frailty model, this marginal model is no longer
a proportional hazards model. The effect of the covariate weakens over time.
The hazard ratio obtained from the marginal model is attenuated relative to
the time-independent conditional hazard ratio $e^{\beta}$. The influence of unobserved
heterogeneity is stronger in situations with larger values of $\sigma^2$, β, or $M_0(t)$,
representing the three factors that accelerate selection.
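The attenuation of the marginal hazard ratio can be illustrated with a few lines of Python. The sketch evaluates the marginal hazard of the inverse Gaussian frailty model with one binary covariate and takes the cumulative baseline hazard $M_0(t)$ as direct input; the function names and parameter values are assumptions for illustration.

```python
import math

def ig_marginal_hazard(mu0, m0, sigma2, beta=0.0, x=0):
    """Marginal hazard mu0*e^(beta*x) / sqrt(1 + 2*sigma2*M0*e^(beta*x))."""
    r = math.exp(beta * x)
    return mu0 * r / math.sqrt(1.0 + 2.0 * sigma2 * m0 * r)

def ig_hazard_ratio(m0, sigma2, beta):
    """Marginal hazard ratio of X = 1 versus X = 0 at cumulative hazard m0."""
    return (ig_marginal_hazard(1.0, m0, sigma2, beta, 1)
            / ig_marginal_hazard(1.0, m0, sigma2, beta, 0))

beta, sigma2 = 0.7, 1.0                                   # hypothetical values
hr_start = ig_hazard_ratio(0.0, sigma2, beta)             # equals exp(beta)
hr_mid = ig_hazard_ratio(1.0, sigma2, beta)               # between the two limits
hr_late = ig_hazard_ratio(1e12, sigma2, beta)             # close to exp(beta / 2)
```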
The density of the frailty distribution among the survivors at time point t
can be written in the form
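\[
\begin{aligned}
f(z \mid X, T > t) &= \frac{\exp\Big( -\frac{z^2 (1 + 2\sigma^2 M_0(t) e^{\beta' X}) - 2z + 1}{2\sigma^2 z} \Big)}{\sqrt{2\pi\sigma^2 z^3}\, \exp\Big( \frac{1}{\sigma^2} \big( 1 - \sqrt{1 + 2\sigma^2 M_0(t) e^{\beta' X}} \big) \Big)} \\
&= \frac{1}{\sqrt{2\pi\sigma^2 z^3}} \exp\Big( -\frac{\big( z - (1 + 2\sigma^2 M_0(t) e^{\beta' X})^{-1/2} \big)^2}{\frac{2\sigma^2 z}{1 + 2\sigma^2 M_0(t) e^{\beta' X}}} \Big).
\end{aligned}
\]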
\[
E(Z \mid X, T > t) = \frac{1}{\sqrt{1 + 2\sigma^2 M_0(t) e^{\beta' X}}}
\]
\[
V(Z \mid X, T > t) = \frac{\sigma^2}{(1 + 2\sigma^2 M_0(t) e^{\beta' X})^{3/2}}.
\]
with z ≥ 0 and 0 < γ ≤ 1. This expression is a power series, converging quickly for
large values of z, and slowly for small values of z. In the special case of γ = 1, the
frailty distribution becomes degenerate with point mass at Z = 1. Although
the probability density function of a random variable with positive stable
distribution can only be represented by infinite series, the Laplace transform
has a very simple form. This makes the distribution especially attractive as a
frailty distribution because many characteristics of the unconditional survival
distribution of the event times follow from the Laplace transform and can
easily be deduced:
\[
L(u) = e^{-u^{\gamma}}. \qquad (3.41)
\]
All moments of this distribution are infinite. Consequently, the expectation
of the frailty is infinite, and the variance does not exist. At first sight this
seems to be a disadvantage because an infinite expectation is more difficult
to work with. However, infinite expectation was one of the main reasons why
this frailty distribution was introduced. The first derivative of the Laplace
transform is used to show this fact for the expectation (Duchateau and Janssen
2008):
\[
\lim_{u \to +0} L'(u) = -\gamma \lim_{u \to +0} \frac{e^{-u^{\gamma}}}{u^{1-\gamma}} = -\infty,
\]
where $\lim_{u \to +0}$ denotes the right limit. Here the comparison of the population
hazard function with the conditional hazard function for a subject with frailty
value one makes no sense because the expectation of the frailty variable does
not exist. Consequently, an individual with frailty equal to one cannot serve
as a standard individual as in other frailty models with finite expectation of
the frailty. This infinite expectation result is also important with respect to
identifiability issues treated by Elbers and Ridder (1982). They found that a
finite mean of the frailty distribution is one condition (among others) for the
identifiability of univariate frailty models. This was the main reason why the
positive stable distribution was introduced as a frailty distribution. Especially
in bivariate/multivariate applications, much attention is given to overcoming
confounding problems in shared (gamma) frailty models, considered later in
this book (for more details see Section 4.9). Using the Laplace transform of
a positive stable frailty variable given earlier, the unconditional survival and
density functions are
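\[
S(t) = e^{-M_0(t)^{\gamma}} \qquad (3.42)
\]
and
\[
f(t) = \gamma \mu_0(t) M_0(t)^{\gamma - 1} e^{-M_0(t)^{\gamma}},
\]
so that the unconditional hazard function is
\[
\mu(t) = \frac{f(t)}{S(t)} = \gamma \mu_0(t) M_0(t)^{\gamma - 1}. \qquad (3.43)
\]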
The positive stable model implies some interesting features, which are given
in the following examples. The most important one is that the positive stable
distribution is the only frailty distribution that preserves the proportional
hazards condition in unconditional hazards after integrating out the frailty.
In the following we introduce observed covariates to the model. In the case of
only one binary 0-1 covariate, the ratio of two hazards with different individual
covariate values is
\[
\frac{\mu(t \mid X = 1)}{\mu(t \mid X = 0)} = e^{\gamma\beta}.
\]
The two marginal hazards are still proportional but with proportionality
factor $e^{\gamma\beta}$ instead of $e^{\beta}$. This shows that parameter estimates are biased
in a proportional hazards model if relevant covariates are not included. Since
0 < γ < 1, the population hazard ratio will typically be closer to one. The
more the parameter γ deviates from one, the more the population hazard ratio
will deviate from the conditional hazard ratio.
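That the marginal hazard ratio is constant in time can be verified numerically. The Python sketch below uses the marginal hazard $\gamma \mu_0(t) M_0(t)^{\gamma - 1}$, obtained as f(t)/S(t) from the survival and density functions given above, and assumes a Weibull baseline $M_0(t) = \lambda t^{\nu}$ with a covariate acting multiplicatively on the conditional hazard. All names and values are illustrative.

```python
import math

def ps_marginal_hazard(t, lam, nu, gamma, beta=0.0, x=0):
    """Marginal hazard under positive stable frailty with Weibull baseline."""
    r = math.exp(beta * x)
    m = lam * t ** nu * r                       # conditional cumulative hazard
    mu = lam * nu * t ** (nu - 1.0) * r         # conditional hazard for Z = 1
    return gamma * mu * m ** (gamma - 1.0)

def ps_hazard_ratio(t, lam, nu, gamma, beta):
    return (ps_marginal_hazard(t, lam, nu, gamma, beta, 1)
            / ps_marginal_hazard(t, lam, nu, gamma, beta, 0))

# Hypothetical parameter values; the ratio is the same at every time point.
ratios = [ps_hazard_ratio(t, lam=0.2, nu=1.5, gamma=0.6, beta=0.8) for t in (0.5, 2.0, 10.0)]
```

Each entry of `ratios` equals $e^{\gamma\beta}$ up to rounding, which is closer to one than the conditional hazard ratio $e^{\beta}$.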
Example 3.11
Let $M_0(t) = \lambda t^{\nu}$ (Weibull) and assume that the frailty variable is positive
stable distributed with parameter γ. Then the resulting survival function is
given by
\[
S(t) = e^{-M_0(t)^{\gamma}} = e^{-\lambda^{\gamma} t^{\gamma\nu}}.
\]
Example 3.12
In the following, the positive stable frailty model is applied to the Halluca
data. The survival and hazard function, respectively, are given by (3.42) and
(3.43) for a parametric analysis. Results can be found in Table 3.12. The
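\[
f(z) = e^{-\lambda(1-\gamma)\left(\frac{z}{\mu} - \frac{1}{\gamma}\right)}
\times \frac{1}{\pi} \sum_{\kappa=1}^{\infty} (-1)^{\kappa+1}\, \frac{(\lambda(1-\gamma))^{\kappa(1-\gamma)}\, \mu^{\kappa\gamma}\, \Gamma(\kappa\gamma + 1)}{\gamma^{\kappa}\, \kappa!}\, z^{-\kappa\gamma - 1} \sin(\kappa\gamma\pi)
\]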
The derivation of the Laplace transform from this density function is difficult
and therefore omitted here. The interested reader is referred to Aalen (1992).
The Laplace transform is
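\[
L(u) = \exp\Big( \frac{\lambda(1-\gamma)}{\gamma} \Big[ 1 - \Big( 1 + \frac{\mu u}{\lambda(1-\gamma)} \Big)^{\gamma} \Big] \Big). \qquad (3.44)
\]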
The first and second derivatives are given by
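\[
L'(u) = -\mu \Big( 1 + \frac{\mu u}{\lambda(1-\gamma)} \Big)^{\gamma - 1} \exp\Big( \frac{\lambda(1-\gamma)}{\gamma} \Big[ 1 - \Big( 1 + \frac{\mu u}{\lambda(1-\gamma)} \Big)^{\gamma} \Big] \Big)
\]
and
\[
\begin{aligned}
L''(u) &= \frac{\mu^2}{\lambda} \Big( 1 + \frac{\mu u}{\lambda(1-\gamma)} \Big)^{\gamma - 2} \exp\Big( \frac{\lambda(1-\gamma)}{\gamma} \Big[ 1 - \Big( 1 + \frac{\mu u}{\lambda(1-\gamma)} \Big)^{\gamma} \Big] \Big) \\
&\quad + \mu^2 \Big( 1 + \frac{\mu u}{\lambda(1-\gamma)} \Big)^{2\gamma - 2} \exp\Big( \frac{\lambda(1-\gamma)}{\gamma} \Big[ 1 - \Big( 1 + \frac{\mu u}{\lambda(1-\gamma)} \Big)^{\gamma} \Big] \Big)
\end{aligned}
\]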
\[
V(Z) = L''(0) - (L'(0))^2 = \frac{\mu^2}{\lambda}. \qquad (3.46)
\]
The positive stable distribution can be obtained as a special case of the PVF
distribution. To show this fact, some asymptotic considerations are necessary.
Following the presentation in the book by Duchateau and Janssen (2008), we
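\[
\begin{aligned}
\log L(u) &= \frac{\lambda(1-\gamma)}{\gamma} \Big[ 1 - \Big( 1 + \frac{\mu u}{\lambda(1-\gamma)} \Big)^{\gamma} \Big] \\
&= \frac{\lambda(1-\gamma)}{\gamma} \Big[ 1 - \Big( \frac{\mu}{\lambda(1-\gamma)} \Big)^{\gamma} \Big( \frac{\lambda(1-\gamma)}{\mu} + u \Big)^{\gamma} \Big] \\
&= \frac{\lambda(1-\gamma)}{\gamma} \Big( \frac{\mu}{\lambda(1-\gamma)} \Big)^{\gamma} \Big[ \Big( \frac{\lambda(1-\gamma)}{\mu} \Big)^{\gamma} - \Big( \frac{\lambda(1-\gamma)}{\mu} + u \Big)^{\gamma} \Big] \qquad (3.47)
\end{aligned}
\]
If
\[
\frac{\mu^{\gamma}}{\lambda^{\gamma - 1}} \xrightarrow[\mu \to \infty,\ \lambda \to 0]{} \frac{\gamma}{(1-\gamma)^{1-\gamma}},
\]
the first factor in the expression (3.47) converges to one, and the second
one goes to $-u^{\gamma}$. This is the logarithm of the Laplace transform of a positive
stable-distributed random variable.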
Using the standard assumption in frailty models that EZ = µ = 1, and
introducing the frailty variance as a model parameter by the relationship
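$V(Z) = \frac{1}{\lambda} := \sigma^2$, the Laplace transform (3.44) of a PVF random variable
becomes
\[
L(u) = \exp\Big( \frac{1-\gamma}{\gamma\sigma^2} \Big[ 1 - \Big( 1 + \frac{\sigma^2 u}{1-\gamma} \Big)^{\gamma} \Big] \Big).
\]
This implies the unconditional survival and hazard function in the PVF frailty
model
\[
S(t) = \exp\Big( \frac{1-\gamma}{\gamma\sigma^2} \Big[ 1 - \Big( 1 + \frac{\sigma^2 M_0(t)}{1-\gamma} \Big)^{\gamma} \Big] \Big)
\]
and
\[
\mu(t) = \frac{\mu_0(t)}{\big( 1 + \frac{\sigma^2}{1-\gamma} M_0(t) \big)^{1-\gamma}}.
\]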
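A short Python sketch can be used to explore how the PVF family behaves for different values of γ. It evaluates the survival and hazard functions above as functions of the cumulative baseline hazard; the function names and parameter values are assumptions. Evaluating the formulas at γ = 1/2 reproduces the inverse Gaussian expression $e^{\frac{1}{\sigma^2}(1 - \sqrt{1 + 2\sigma^2 M_0(t)})}$, and for γ close to zero the values approach the gamma frailty expression $(1 + \sigma^2 M_0(t))^{-1/\sigma^2}$.

```python
import math

def pvf_survival(m0, sigma2, gamma):
    """Unconditional survival in the PVF frailty model (EZ = 1, V(Z) = sigma2)."""
    c = (1.0 - gamma) / (gamma * sigma2)
    return math.exp(c * (1.0 - (1.0 + sigma2 * m0 / (1.0 - gamma)) ** gamma))

def pvf_hazard(mu0, m0, sigma2, gamma):
    """Unconditional hazard in the PVF frailty model."""
    return mu0 / (1.0 + sigma2 * m0 / (1.0 - gamma)) ** (1.0 - gamma)

m0, sigma2 = 1.0, 0.5                                   # hypothetical values
s_half = pvf_survival(m0, sigma2, gamma=0.5)            # inverse Gaussian case
s_small = pvf_survival(m0, sigma2, gamma=1e-6)          # near the gamma case
```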
single binary covariate with possible individual values zero and one in the
model. Then the ratio of the marginal hazards becomes
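\[
\frac{\mu(t \mid X = 1)}{\mu(t \mid X = 0)} = \frac{\big( 1 + \frac{\sigma^2}{1-\gamma} M_0(t) \big)^{1-\gamma}}{\big( 1 + \frac{\sigma^2}{1-\gamma} M_0(t) e^{\beta} \big)^{1-\gamma}}\, e^{\beta},
\]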
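\[
\begin{aligned}
f(z \mid T > t) &= \frac{S(t \mid z)\, f(z)}{S(t)} \\
&= \frac{\exp\big( -z M_0(t) \big)\, e^{-\frac{1-\gamma}{\sigma^2}\left( z - \frac{1}{\gamma} \right)}}{\exp\Big( \frac{1-\gamma}{\gamma\sigma^2} \Big[ 1 - \Big( 1 + \frac{\sigma^2 M_0(t)}{1-\gamma} \Big)^{\gamma} \Big] \Big)} \\
&\quad \times \frac{1}{\pi} \sum_{\kappa=1}^{\infty} (-1)^{\kappa+1}\, \frac{(1-\gamma)^{\kappa(1-\gamma)}\, A^{\kappa\gamma(\gamma-1)}\, \Gamma(\kappa\gamma + 1)}{(\sigma^2 A^{-\gamma})^{\kappa(1-\gamma)}\, \gamma^{\kappa}\, \kappa!}\, z^{-\kappa\gamma - 1} \sin(\kappa\gamma\pi)
\end{aligned}
\]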
with parameters µ > 0, λ > 0 and (different from the PVF distribution) γ < 0.
An interesting property of the model is that it allows for a subgroup of zero
frailty, which never experiences the event of interest. In the case of total
mortality, this is impossible because nobody is immortal. But this model is
relevant to medicine and demography, for example, when considering cause-
specific mortality or the occurrence of a disease. Despite the fact that the
density of the continuous part is only given as an infinite series that has to be
calculated numerically, the distribution is mathematically convenient. It may
also be seen as a natural choice. The distribution can be constructed as the
sum of a Poisson-distributed number of independent and identically gamma-
distributed random variables. This can be viewed as a hit model where each
individual experiences a random number of hits, each of random size. More
formally, the frailty Z of individuals in the study population is
\[
Z = \begin{cases} V_1 + V_2 + \ldots + V_N & \text{if } N > 0, \\ 0 & \text{if } N = 0, \end{cases} \qquad (3.49)
\]
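\[
\begin{aligned}
L(u) &= E e^{-uZ} \\
&= P(N = 0) + \sum_{\kappa=1}^{\infty} E\big( e^{-u(V_1 + \ldots + V_N)} \mid N = \kappa \big)\, P(N = \kappa) \\
&= e^{-\rho} + \sum_{\kappa=1}^{\infty} \prod_{i=1}^{\kappa} \Big( 1 + \frac{u}{\lambda} \Big)^{-k} \frac{\rho^{\kappa}}{\kappa!}\, e^{-\rho} \\
&= e^{-\rho} \sum_{\kappa=0}^{\infty} \frac{\big( \rho (1 + \frac{u}{\lambda})^{-k} \big)^{\kappa}}{\kappa!} \\
&= e^{-\rho \left( 1 - (1 + \frac{u}{\lambda})^{-k} \right)}. \qquad (3.50)
\end{aligned}
\]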
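The construction (3.49) and the Laplace transform (3.50) can be checked by simulation. The Python sketch below assumes that N is Poisson distributed with mean ρ and that the $V_i$ are gamma distributed with shape k and rate λ, as implied by the factors $(1 + u/\lambda)^{-k}$ in the derivation; the parameter values, sample size, and seed are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)
rho, k, lam = 1.5, 2.0, 3.0     # hypothetical parameter values
n = 200_000

counts = rng.poisson(rho, size=n)
# A sum of N i.i.d. gamma(k, rate lam) variables is gamma(N * k, rate lam);
# individuals with N = 0 keep frailty Z = 0.
z = np.zeros(n)
pos = counts > 0
z[pos] = rng.gamma(shape=counts[pos] * k, scale=1.0 / lam)

def cp_laplace(u, rho, k, lam):
    """Laplace transform (3.50) of the compound Poisson frailty."""
    return np.exp(-rho * (1.0 - (1.0 + u / lam) ** (-k)))

u = 0.7
empirical = np.mean(np.exp(-u * z))    # Monte Carlo estimate of E exp(-uZ)
zero_share = np.mean(z == 0.0)         # share of individuals with zero frailty
```

The simulated share of zero frailties should be close to $P(N = 0) = e^{-\rho}$, which is the subgroup that never experiences the event.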
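\[
L'(u) = -k\lambda^{\gamma - 1} \Big( 1 + \frac{u}{\lambda} \Big)^{\gamma - 1} \exp\Big[ \frac{k\lambda^{\gamma}}{\gamma} \Big\{ 1 - \Big( 1 + \frac{u}{\lambda} \Big)^{\gamma} \Big\} \Big]
\]
\[
\begin{aligned}
L''(u) &= -k\lambda^{\gamma - 2} (\gamma - 1) \Big( 1 + \frac{u}{\lambda} \Big)^{\gamma - 2} \exp\Big[ \frac{k\lambda^{\gamma}}{\gamma} \Big\{ 1 - \Big( 1 + \frac{u}{\lambda} \Big)^{\gamma} \Big\} \Big] \\
&\quad + k^2 \lambda^{2\gamma - 2} \Big( 1 + \frac{u}{\lambda} \Big)^{2\gamma - 2} \exp\Big[ \frac{k\lambda^{\gamma}}{\gamma} \Big\{ 1 - \Big( 1 + \frac{u}{\lambda} \Big)^{\gamma} \Big\} \Big].
\end{aligned}
\]
\[
EZ = k\lambda^{\gamma - 1} \qquad (3.51)
\]
\[
V(Z) = k(1 - \gamma)\lambda^{\gamma - 2}. \qquad (3.52)
\]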
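Using the standard relation EZ = 1 and $\sigma^2 = \frac{1-\gamma}{\lambda}$, the Laplace transform
becomes
\[
L(u) = \exp\Big( \frac{1-\gamma}{\gamma\sigma^2} \Big[ 1 - \Big( 1 + \frac{\sigma^2 u}{1-\gamma} \Big)^{\gamma} \Big] \Big). \qquad (3.53)
\]
This Laplace transform implies the marginal survival and hazard function in
the case of a compound Poisson frailty model given by
\[
S(t) = \exp\Big( \frac{1-\gamma}{\gamma\sigma^2} \Big[ 1 - \Big( 1 + \frac{\sigma^2 M_0(t)}{1-\gamma} \Big)^{\gamma} \Big] \Big) \qquad (3.54)
\]
and
\[
\mu(t) = \frac{\mu_0(t)}{\big( 1 + \frac{\sigma^2}{1-\gamma} M_0(t) \big)^{1-\gamma}}.
\]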
Because the Laplace transforms of the PVF distribution and the compound
Poisson distribution are equal (except for the range of parameter γ, which can
be negative in the compound Poisson model), the properties obtained from the
Laplace transform are the same. Consequently, the population survival and
hazard functions of the compound Poisson frailty model coincide with the
respective functions in the PVF frailty model. Also, the ratio of the marginal
hazards is similar to that in the PVF model when a single binary observed
covariate is introduced into the model:
\[
\frac{\mu(t|X = 1)}{\mu(t|X = 0)} = \left(\frac{1 + \frac{\sigma^2}{1-\gamma} M_0(t)}{1 + \frac{\sigma^2}{1-\gamma} M_0(t) e^{\beta}}\right)^{1-\gamma} e^{\beta},
\]
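The time dependence of this ratio is easy to explore numerically. The Python sketch below (function name hypothetical) evaluates the ratio as a function of the cumulative baseline hazard M0(t). From the formula, the ratio equals e^β at M0 = 0 and tends to e^{βγ} as M0 grows, so for γ < 0 and β > 0 it eventually drops below one.

```python
import math

def marginal_hazard_ratio(M0, beta, sigma2, gamma):
    """Ratio mu(t|X=1) / mu(t|X=0) for a single binary covariate,
    written in terms of the cumulative baseline hazard M0 = M0(t)."""
    a = sigma2 / (1.0 - gamma)
    base = (1.0 + a * M0) / (1.0 + a * M0 * math.exp(beta))
    return base ** (1.0 - gamma) * math.exp(beta)
```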
\[
\frac{1}{\Gamma(-\kappa\gamma)} = -\frac{1}{\pi}\,\Gamma(\kappa\gamma + 1) \sin(\kappa\gamma\pi).
\]
This is the density given in (3.48) with $\lambda = 1/\sigma^2$. It should be noted that
the integral of µ(t) over [0, ∞) is finite if γ < 0. Consequently, the survival
function is incomplete because a fraction of individuals has zero frailty and
will never experience the event under study. The size of the nonsusceptible
fraction is given by
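\[
S(t) = \exp\left[\frac{1-\gamma}{\gamma\sigma^2}\left\{1 - \left(1 + \frac{\sigma^2}{1-\gamma} M_0(t)\right)^{\gamma}\right\}\right] \xrightarrow[t \to \infty]{} \exp\left[\frac{1-\gamma}{\gamma\sigma^2}\right].
\]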
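The limit above can be illustrated with a short Python sketch. It assumes γ < 0 (the compound Poisson case) and that the cumulative baseline hazard M0(t) grows without bound; the function names are hypothetical.

```python
import math

def nonsusceptible_fraction(sigma2, gamma):
    """Limit of S(t) as M0(t) -> infinity: exp((1 - gamma) / (gamma * sigma2)).
    Only meaningful for gamma < 0."""
    if gamma >= 0:
        raise ValueError("a nonsusceptible fraction requires gamma < 0")
    return math.exp((1.0 - gamma) / (gamma * sigma2))

def marginal_survival(M0, sigma2, gamma):
    """Marginal survival (3.54) in terms of the cumulative baseline hazard M0."""
    c = (1.0 - gamma) / (gamma * sigma2)
    return math.exp(c * (1.0 - (1.0 + sigma2 * M0 / (1.0 - gamma)) ** gamma))

# Hypothetical parameter values, chosen only to show the plateau.
for M0 in (1.0, 100.0, 1e4, 1e8):
    print(M0, marginal_survival(M0, 0.8, -0.5), nonsusceptible_fraction(0.8, -0.5))
```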
Example 3.13
The proportional hazards model with Gompertz baseline hazard suggests that
women face a lower risk of death caused by malignant melanoma compared
to men. Increasing age and tumor size result in higher mortality. The gamma
frailty model implies an increase in the effect of the covariates gender and
thickness, whereas the effect of age is decreased. Standard errors are also
increased. The heterogeneity estimate $\sigma^2$ is not significantly greater than
zero. The compound Poisson/PVF frailty model extends the gamma frailty
model. Estimates of the regression parameters are different from the estimates
in the gamma frailty model. The compound Poisson model shows only a
nonsignificant improvement in the fit compared to the gamma frailty model
(γ = 0), which is also supported by the large standard error of the estimate
of parameter γ. Comparing all three models, the proportional hazards model
with Gompertz baseline shows the best fit with respect to the likelihood ratio
test. The estimate of γ is negative, implying that this is a compound Poisson
frailty model. Consequently, some patients will never die from this disease.
This is possible because only death by malignant melanoma was studied and
not total mortality. That means a part of the population seems to be protected
against mortality from malignant melanoma. The estimated proportion with
zero risk is around $e^{\frac{1-\gamma}{\gamma\sigma^2}} = 56\%$. A possible conclusion would be that this
is a cure or nonsusceptible fraction. But caution is necessary, especially in
the present case where the compound Poisson frailty model does not show a
significantly better fit compared to the gamma frailty model.
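\begin{align*}
L(u) &= \mathbf{E} \exp(-uW^2) \\
&= \frac{1}{\sqrt{2\pi}s} \int_{-\infty}^{\infty} \exp(-w^2 u) \exp\left(-\frac{(w - m)^2}{2s^2}\right) dw \\
&= \frac{1}{\sqrt{2\pi}s} \int_{-\infty}^{\infty} \exp\left(-\frac{2s^2 w^2 u + w^2 - 2mw + m^2}{2s^2}\right) dw \\
&= \frac{1}{\sqrt{2\pi}s} \int_{-\infty}^{\infty} \exp\left(-\frac{w^2 - \frac{2mw}{1+2s^2u} + \frac{m^2}{(1+2s^2u)^2} + \frac{m^2}{1+2s^2u} - \frac{m^2}{(1+2s^2u)^2}}{2\frac{s^2}{1+2s^2u}}\right) dw \\
&= \frac{1}{\sqrt{1 + 2s^2u}} \exp\left(-\frac{\frac{m^2}{1+2s^2u} - \frac{m^2}{(1+2s^2u)^2}}{2\frac{s^2}{1+2s^2u}}\right) \\
&\quad \times \frac{1}{\sqrt{2\pi}\frac{s}{\sqrt{1+2s^2u}}} \int_{-\infty}^{\infty} \exp\left(-\frac{\left(w - \frac{m}{1+2s^2u}\right)^2}{2\frac{s^2}{1+2s^2u}}\right) dw \\
&= \frac{1}{\sqrt{1 + 2s^2u}} \exp\left(-\frac{m^2 u}{1 + 2s^2u}\right), \tag{3.55}
\end{align*}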
where the last equation holds because the integral equals one, as it is taken
over the density of a normal distribution. It is easy to see that, in the case
$m = 0$, the frailty $Z = W^2$ is gamma distributed with shape parameter $k = 1/2$ and
form parameter $\lambda = 1/(2s^2)$.
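The closed form (3.55) can be compared with a direct numerical evaluation of E exp(−uW²) for W ~ N(m, s²). The Python sketch below uses a plain trapezoidal rule on a wide interval; the function names, grid size, and integration width are assumptions of this illustration, not part of the model.

```python
import math

def laplace_squared_normal(u, m, s):
    """Closed form (3.55) for Z = W**2 with W ~ N(m, s**2)."""
    a = 1.0 + 2.0 * s * s * u
    return math.exp(-m * m * u / a) / math.sqrt(a)

def laplace_by_quadrature(u, m, s, n=20000, width=12.0):
    """Trapezoidal approximation of E exp(-u * W**2) over m +/- width * s."""
    lo = m - width * s
    h = 2.0 * width * s / n
    total = 0.0
    for i in range(n + 1):
        w = lo + i * h
        f = math.exp(-u * w * w - (w - m) ** 2 / (2.0 * s * s))
        total += 0.5 * f if i in (0, n) else f   # half weight at the endpoints
    return total * h / (math.sqrt(2.0 * math.pi) * s)
```

With m = 0 the closed form reduces to (1 + u/λ)^{-k} with k = 1/2 and λ = 1/(2s²), the gamma case noted above.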
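\begin{align*}
L''(u) &= 2m^2 s^2 (1 + 2s^2u)^{-\frac{7}{2}} e^{-\frac{m^2 u}{1+2s^2u}} + 3s^4 (1 + 2s^2u)^{-\frac{5}{2}} e^{-\frac{m^2 u}{1+2s^2u}} \\
&\quad + 3m^2 s^2 (1 + 2s^2u)^{-\frac{7}{2}} e^{-\frac{m^2 u}{1+2s^2u}} + m^2 s^2 (1 + 2s^2u)^{-\frac{7}{2}} e^{-\frac{m^2 u}{1+2s^2u}} \\
&\quad + m^4 (1 + 2s^2u)^{-\frac{9}{2}} e^{-\frac{m^2 u}{1+2s^2u}}
\end{align*}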
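\[
\mathbf{E}Z = -L'(0) = m^2 + s^2
\]
and
\[
S(t) = \frac{1}{\sqrt{1 + 2s^2 M_0(t)}} \exp\left(-\frac{m^2 M_0(t)}{1 + 2s^2 M_0(t)}\right),
\]
resulting in the unconditional hazard function
\[
\mu(t) = \mu_0(t) \left(\frac{m^2}{(1 + 2s^2 M_0(t))^2} + \frac{s^2}{1 + 2s^2 M_0(t)}\right).
\]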
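One nice feature of this model is the Gaussian property of the conditional
distribution $\mathbf{P}(W \le w \mid T > t)$:
\begin{align*}
\mathbf{P}(W \le w \mid T > t) &= \frac{\mathbf{P}(T > t, W \le w)}{S(t)} \\
&= \frac{1}{S(t)} \int_{-\infty}^{w} \mathbf{P}(T > t \mid W = v) f_W(v)\, dv \\
&= \frac{1}{S(t)} \frac{1}{\sqrt{2\pi}s} \int_{-\infty}^{w} \exp(-v^2 M_0(t)) \exp\left(-\frac{(v - m)^2}{2s^2}\right) dv \\
&= \frac{1}{S(t)} \frac{1}{\sqrt{2\pi}s} \exp\left(-\frac{m^2 M_0(t)}{1 + 2s^2 M_0(t)}\right) \int_{-\infty}^{w} \exp\left(-\frac{v^2 - \frac{2mv}{1+2s^2M_0(t)} + \frac{m^2}{(1+2s^2M_0(t))^2}}{2\frac{s^2}{1+2s^2M_0(t)}}\right) dv \\
&= \frac{1}{\sqrt{2\pi}\frac{s}{\sqrt{1+2s^2M_0(t)}}} \int_{-\infty}^{w} \exp\left(-\frac{\left(v - \frac{m}{1+2s^2M_0(t)}\right)^2}{2\frac{s^2}{1+2s^2M_0(t)}}\right) dv.
\end{align*}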
\[
W \mid (T > t) \sim N\left(\frac{m}{1 + 2s^2 M_0(t)},\ \frac{s^2}{1 + 2s^2 M_0(t)}\right).
\]
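The conditional mean and variance among survivors can be confirmed numerically by weighting the N(m, s²) density with the conditional survival probability exp(−v²M0(t)). The following Python sketch does this with a trapezoidal rule; the function names, grid size, and integration width are assumptions of the illustration.

```python
import math

def survivor_moments(M0, m, s):
    """Mean and variance of W given T > t, as stated above,
    with M0 = M0(t) the cumulative baseline hazard."""
    a = 1.0 + 2.0 * s * s * M0
    return m / a, s * s / a

def survivor_moments_numeric(M0, m, s, n=20000, width=12.0):
    """Same moments from the unnormalized weight
    exp(-v**2 * M0) * exp(-(v - m)**2 / (2 * s**2)) on m +/- width * s."""
    lo = m - width * s
    h = 2.0 * width * s / n
    w0 = w1 = w2 = 0.0
    for i in range(n + 1):
        v = lo + i * h
        weight = math.exp(-v * v * M0 - (v - m) ** 2 / (2.0 * s * s))
        if i == 0 or i == n:
            weight *= 0.5              # trapezoidal end-point weights
        w0 += weight
        w1 += weight * v
        w2 += weight * v * v
    mean = w1 / w0
    return mean, w2 / w0 - mean * mean
```

Both the mean and the variance shrink toward zero as M0(t) grows: survivors are increasingly those with W close to zero.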
The common constraint $\mathbf{E}Z = \mathbf{E}W^2 = 1$ implies the relation $s^2 + m^2 = 1$,
which restricts possible values of $m$ and $s^2$ and limits the applicability of the
model to real-life problems. Substituting $m^2$ by the expression $1 - s^2$ (and
keeping in mind $s^2 \le 1$) results in the following unconditional survival and
hazard functions:
\[
S(t) = \frac{1}{\sqrt{1 + 2s^2 M_0(t)}} \exp\left(-\frac{(1 - s^2) M_0(t)}{1 + 2s^2 M_0(t)}\right) \tag{3.56}
\]
and
\[
\mu(t) = \mu_0(t)\, \frac{1 + 2s^4 M_0(t)}{(1 + 2s^2 M_0(t))^2}. \tag{3.57}
\]
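Because (3.56) and (3.57) are explicit, they are straightforward to code for use in a parametric likelihood. The Python sketch below implements both and checks that the hazard equals −d/dt log S(t). The constant baseline hazard used in the check is a hypothetical choice, and the function names are illustrative.

```python
import math

def qh_survival(M0, s2):
    """Marginal survival (3.56); s2 stands for s**2 with 0 < s2 <= 1."""
    a = 1.0 + 2.0 * s2 * M0
    return math.exp(-(1.0 - s2) * M0 / a) / math.sqrt(a)

def qh_hazard(mu0, M0, s2):
    """Marginal hazard (3.57)."""
    a = 1.0 + 2.0 * s2 * M0
    return mu0 * (1.0 + 2.0 * s2 * s2 * M0) / (a * a)

def qh_numeric_hazard(t, rate, s2, h=1e-5):
    """Central-difference -d/dt log S(t), assuming M0(t) = rate * t."""
    upper = math.log(qh_survival(rate * (t + h), s2))
    lower = math.log(qh_survival(rate * (t - h), s2))
    return -(upper - lower) / (2.0 * h)
```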
To quantify the effect of unobserved heterogeneity in the quadratic hazard
frailty model, observed covariates are introduced into the model. Considering
only one single binary 0-1 covariate in the model, the ratio of the marginal
hazards becomes
\begin{align*}
\mathbf{V}(Z) &= \mathbf{E}W^4 - (\mathbf{E}W^2)^2 \\
&= 2s^4 + 4m^2 s^2 \\
&= 2s^2 (2 - s^2). \tag{3.58}
\end{align*}
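Relation (3.58) maps the random-effects variance s² to the frailty variance, and it can be inverted on the admissible range 0 ≤ s² ≤ 1. A minimal Python sketch, with hypothetical function names, assuming the constraint s² + m² = 1:

```python
import math

def frailty_variance(s2):
    """V(Z) = 2 * s2 * (2 - s2) from (3.58), with s2 = s**2 in [0, 1]."""
    return 2.0 * s2 * (2.0 - s2)

def s2_from_frailty_variance(sigma2):
    """Invert (3.58): the root of s2**2 - 2*s2 + sigma2/2 = 0 that lies in [0, 1]."""
    if not 0.0 <= sigma2 <= 2.0:
        raise ValueError("the frailty variance must lie in [0, 2] in this model")
    return 1.0 - math.sqrt(1.0 - sigma2 / 2.0)
```

The inversion makes explicit that the frailty variance cannot exceed 2 in this model, which is reached at s² = 1 (m = 0).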
Example 3.14
The quadratic hazard frailty model is applied to the Halluca lung cancer data
from Example 1.3. The hazard and survival functions have an explicit form in
this model (see (3.56) and (3.57)) and can be used in the likelihood function
for parametric analysis. As with the other frailty models considered before,
Weibull, exponential, and Gompertz baseline hazard functions are used, and the
results of the analysis are given in Table 3.14.
Estimates of the regression parameters are very similar in the three models,
indicating robustness of the analysis with respect to the choice of the baseline
hazard function. The values of the log-likelihood function indicate that the
Weibull and Gompertz models do not fit the data significantly better than the
exponential model. Only small differences occur between the estimates of the
random effects variance. In general, the variance of the random effects is small
(but significantly different from zero in all three models). Using formula (3.58),
the variance of the frailty term $\sigma^2$ is around 0.3. Analysis was performed by
maximizing the parametric log-likelihood function using the SAS procedure
NLMIXED.
\[
L(u; s) = e^{-s\psi(u)}.
\]
This is a valid Laplace transform for all nonnegative s when function ψ is a
characteristic exponent of the Lévy process. Here s denotes the time which
the frailty generating Lévy process has been running prior to the follow-up.
Larger values of s indicate higher risks because of a longer running damage
process. More details about Laplace exponents and Lévy processes can be
found in Bertoin (1996). The family of Lévy processes contains a number of
interesting special cases such as compound Poisson processes, PVF processes,
gamma processes, stable processes, etc. The survival function in this model
is given by
\[
S(t; s) = L(M_0(t); s) = e^{-s\psi(M_0(t))}.
\]
One can see that s is a proportionality parameter and there is a natural link to
the proportional hazards model. Of course, the special Laplace transform of a
Lévy process produces this result. Aalen and Hjort (2002) present a number
of examples of frailty models based on Lévy processes. The majority of the
\[
L(u; s) = e^{-s\rho\left(1 - \left(1 + \frac{u}{\lambda}\right)^{-k}\right)}.
\]
Here it is clear that the additional parameter s influences only ρ from
the original model, which is the parameter of the Poisson distribution of the
number of damaging hits. The parameters k and λ describing the gamma
distribution of the hits are not influenced by s. It is necessary to note that
the compound Poisson frailty distribution in this model depends on s and is
therefore no longer restricted to expectation one. In the gamma frailty model,
as another example, the characteristic exponent is of the form
\[
\psi(u) = k \ln\left(1 + \frac{u}{\lambda}\right) \tag{3.59}
\]
which gives
\[
L(u; s) = \left(1 + \frac{u}{\lambda}\right)^{-sk}.
\]
This is the Laplace transform of a gamma distribution Γ(sk, λ). Consequently,
s influences only the shape parameter of the gamma distribution, but not the
form parameter. Again, the restriction to expectation one no longer holds
for different values of s if the other two parameters are fixed. For different
values of s proportional hazards are obtained, which is clearly a nice and
interesting feature of this approach, especially when event times of different
groups need to be compared.
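The proportionality in s can be made concrete. With Laplace transform e^{−sψ(u)}, the population cumulative hazard is −log L(M0(t); s) = sψ(M0(t)), so two groups with running times s1 and s2 have cumulative hazards in the constant ratio s1/s2. The Python sketch below illustrates this for the gamma and compound Poisson exponents given above; the function names and parameter values are hypothetical.

```python
import math

def gamma_exponent(k, lam):
    """psi(u) = k * ln(1 + u/lam), equation (3.59)."""
    return lambda u: k * math.log(1.0 + u / lam)

def compound_poisson_exponent(rho, k, lam):
    """psi(u) = rho * (1 - (1 + u/lam)**(-k))."""
    return lambda u: rho * (1.0 - (1.0 + u / lam) ** (-k))

def levy_survival(M0, s, psi):
    """Population survival exp(-s * psi(M0)) at cumulative baseline hazard M0."""
    return math.exp(-s * psi(M0))

def cumulative_hazard_ratio(M0, s1, s2, psi):
    """Ratio of population cumulative hazards for running times s1 and s2."""
    return math.log(levy_survival(M0, s1, psi)) / math.log(levy_survival(M0, s2, psi))
```

For any M0 > 0 the ratio returned by `cumulative_hazard_ratio` is s1/s2, whichever exponent is supplied.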
times that incorporates a cured fraction is given by Berkson and Gage (1952):
\[
S^*(t) = (1 - \phi) + \phi S(t).
\]
Longini and Halloran (1996) have proposed frailty cure (cure-mixture) models
that extend a model by Farewell (1977). In their model the frailty variable
has point mass at zero with probability $1 - \phi$ while heterogeneity among those
experiencing the event of interest is modeled via a continuous distribution with
probability $\phi$. In the gamma frailty cure model, the survival function S of the
susceptible individuals is substituted by the marginal survival function in the
gamma frailty model (3.17):
\[
S^*(t) = (1 - \phi) + \phi (1 + \sigma^2 M_0(t))^{-1/\sigma^2}. \tag{3.60}
\]
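A minimal Python sketch of (3.60) follows. The function name is hypothetical, and the cumulative baseline hazard M0(t) is passed in directly so that any parametric baseline can be used.

```python
def gamma_frailty_cure_survival(M0, phi, sigma2):
    """S*(t) = (1 - phi) + phi * (1 + sigma2 * M0)**(-1/sigma2), equation (3.60).
    M0 is the cumulative baseline hazard M0(t); phi is the susceptible fraction."""
    return (1.0 - phi) + phi * (1.0 + sigma2 * M0) ** (-1.0 / sigma2)
```

The function starts at 1 for M0 = 0 and levels off at the cure fraction 1 − φ as M0 grows; with φ = 1 it reduces to the marginal survival function of the gamma frailty model.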
The idea behind this model is similar but not equivalent to the compound
Poisson frailty model suggested by Aalen (1988, 1992). In the model by
Longini and Halloran (1996), the shape of the frailty distribution is assumed
to be independent of the size of the cure fraction, while in the compound
Poisson frailty model there exists a connection between the shape of the frailty
density and the probability of susceptibility.
An interesting work about cure models is that of Maller and Zhou (1996).
Price and Manatunga (2001) give a comprehensive introduction to the area
and apply different cure, frailty, and frailty cure models to leukemia remission
data. They conclude that frailty models are useful in modeling data with a
cured fraction and found that the gamma frailty cure model provides a better
fit to their remission data compared to the standard cure model.
The next example provides an extension of the foregoing model to include
censored observations. Consider two types of expressions for a disease: the
incidence and the age of disease onset. Risk models for overall susceptibility
(lifetime risk) that consider only the first expression by treating the disease
as a binary trait of being affected or not can give wrong results. The reason
is that, for individuals without the disease, it is often not known, due to
censoring, whether they will eventually develop the disease. On the other hand,
models from survival analysis typically assume that everyone has the same
susceptibility to the disease and will eventually be affected if followed up for
a sufficiently long period of time. It is possible that these models do not
accurately describe the disease risk factors. In models dealing with both
types of expressions, the effect of a covariate can act on either the overall
susceptibility or the age at onset or both. Such a model is considered later in
(3.63). The following model is a special case of the binary frailty model. Here,
one of the two point masses is located at zero, resulting in a cure fraction with
hazard zero with respect to the event of interest.
The application of mixture models for joint modeling of the overall risk of a
disease and the age-at-onset distribution of the individuals at risk is popular
(Farewell 1977, Kuk and Chen 1992, Lam et al. 2005). We define an individual
and let T denote the age at onset when Y = 1. With the foregoing concept,
let φ = P(Y = 1) and S(t) = P(T > t|Y = 1) describe the distribution of Y
and of the failure time T . It is impossible to observe Y , but it is possible to
observe whether or not a subject has experienced the event during the period
of its follow-up time.
• For the other observations, a failure is not observed (∆ = 0). This may
occur either because Y = 0 or because the observation is really censored.
Therefore, P(Y = 0) + P(Y = 1)P(T > C|Y = 1) = (1 − φ) + φS(C).
This is a generalization of (2.6) and (2.7) because these relations are easily
obtained with φ = 1.
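The contributions described above translate directly into a log-likelihood. The Python sketch below is a minimal illustration: it assumes the usual mixture-cure contribution φ f(t) for an observed failure (∆ = 1) and uses (1 − φ) + φ S(C) for an observation without failure (∆ = 0). The function names and the exponential example distribution for the susceptibles are hypothetical.

```python
import math

def cure_log_likelihood(times, deltas, phi, surv, dens):
    """Log-likelihood of a cure model with susceptible fraction phi.
    surv(t) and dens(t) are the survival and density functions of T given Y = 1."""
    total = 0.0
    for t, delta in zip(times, deltas):
        if delta == 1:
            # failure observed: susceptible (prob. phi) and failing at time t
            total += math.log(phi * dens(t))
        else:
            # no failure observed: nonsusceptible, or susceptible and still event-free
            total += math.log((1.0 - phi) + phi * surv(t))
    return total

def exponential_survival(rate):
    """Illustrative choice for S(t): exp(-rate * t)."""
    return lambda t: math.exp(-rate * t)

def exponential_density(rate):
    """Matching density: rate * exp(-rate * t)."""
    return lambda t: rate * math.exp(-rate * t)
```

Setting `phi=1.0` gives the ordinary censored-data log-likelihood, in line with the remark that the standard relations are recovered for φ = 1.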
Example 3.15
Three different frailty models are applied to age at onset of breast cancer in
11,714 Swedish female twins described in Example 1.6. The results are given
in Table 3.15.
Table 3.15: Frailty and frailty cure models for the breast cancer
data of Swedish twins
parameter gamma gamma cure compound Poisson
In the first column of Table 3.15, the gamma frailty model is applied to account
for heterogeneity in the study population. This model is described in detail
in Section 3.3. The model ignores the existence of a cure fraction, and the
parameter φ = 1 indicates that all individuals are (more or less) susceptible.
This model is extended to the gamma frailty cure model (second column, see
(3.60)), which allows for a fraction of nonsusceptible individuals, for example,
patients who are not at risk of suffering from breast cancer. The third model is
the compound Poisson frailty model (Section 3.8). The size of the susceptible
fraction is calculated by $\phi = 1 - \exp\left(\frac{1-\gamma}{\gamma\sigma^2}\right)$. That is why no standard error is
given for this quantity. All models are parametric models using a Gompertz
baseline hazard function with parameters λ and ϕ.
In the gamma frailty model, the size of the susceptible fraction is 100%
by definition, as in classical survival analysis. The gamma frailty cure model
gives an estimate of 22.1% for the size of the susceptible fraction; that is,
only 22.1% of all women are at risk for breast cancer. The respective number
calculated from the compound Poisson frailty model is similar, at 23.7%.
The gamma frailty model is a special case of the gamma frailty cure model
(when φ = 1) and the compound Poisson frailty model (when γ = 0). These
estimates are in good agreement with results found by Chatterjee and Shih
(2001) and Wienke et al. (2003a) in bivariate analyses and estimates of a
lifetime risk of breast cancer of around 8–12% (Feuer et al. 1993, Rosenthal
and Puck 1999, Ries et al. 1999) in current Western populations. The latter
numbers give a lower boundary for the size of the susceptible fraction. A
huge estimate of $\sigma^2$ in the gamma model indicates the existence of large
heterogeneity in the study population, which is at least partially accounted
for by the introduction of a nonsusceptible fraction in the gamma frailty cure
model, where the heterogeneity is smaller, but still large. In the compound
Poisson frailty model, the estimate of γ is negative, indicating the existence of
a nonsusceptible fraction. However, differences between the three models are
not significant in terms of the log-likelihood function. This speaks in favor of
the simplest model, the gamma frailty model. To compare the nested gamma model
(γ = 0) with the compound Poisson model, standard test theory can be applied
because the value γ = 0 is no longer on the boundary of the parameter space.
The fit of the compound Poisson frailty model to the Swedish breast cancer
data is demonstrated in Figure 3.11.
It is necessary to mention that the correlation between the ages at onset of
breast cancer in the twin pairs is neglected in these univariate models. We
will reanalyze the data using a bivariate model later on to account for the
correlated observation times.
Model (3.60) was extended to the mixture cure frailty model with gamma
frailty by Peng and Zhang (2008a):
\[
S^*(t|X_1, X_2) = (1 - \phi(X_2)) + \phi(X_2)\left(1 + \sigma^2 M_0(t) e^{\beta_1' X_1}\right)^{-1/\sigma^2}. \tag{3.63}
\]
[Figure: survivor function (vertical axis, 0.90 to 0.98) plotted against age (horizontal axis, 35 to 95).]
Here X2 denotes the vector of covariates that may have effects on the cure
rate. The vector X1 represents the covariates that influence the event times
of those individuals who are susceptible to the event. Without frailty, the
model reduces to the proportional hazards model with cure fraction. Without
cure fraction, the model becomes the gamma frailty model considered earlier.
Without both the cure fraction and the frailty, the common proportional hazards
model is obtained. If the probability of cure depends on observable covariates,
a common way to introduce this into the model is logistic regression (Farewell
1982)
\[
\phi(X_2) = \frac{e^{\beta_2' X_2}}{1 + e^{\beta_2' X_2}}
\]
with β2 as the vector of regression parameters to be estimated. Other link
functions such as the probit link and the log–log link can also be used.
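Model (3.63) with the logistic link can be sketched in a few lines of Python. The function names are hypothetical; covariates and coefficients are passed as plain sequences, and the cumulative baseline hazard M0(t) is supplied by the caller.

```python
import math

def susceptible_probability(beta2, x2):
    """Logistic link: phi(X2) = exp(beta2'X2) / (1 + exp(beta2'X2)),
    evaluated in a numerically stable way."""
    eta = sum(b * x for b, x in zip(beta2, x2))
    if eta >= 0:
        return 1.0 / (1.0 + math.exp(-eta))
    e = math.exp(eta)
    return e / (1.0 + e)

def mixture_cure_survival(M0, x1, x2, beta1, beta2, sigma2):
    """Equation (3.63): X2 acts on the susceptible fraction, X1 on the event
    times of the susceptibles; M0 is the cumulative baseline hazard M0(t)."""
    phi = susceptible_probability(beta2, x2)
    risk = math.exp(sum(b * x for b, x in zip(beta1, x1)))
    return (1.0 - phi) + phi * (1.0 + sigma2 * M0 * risk) ** (-1.0 / sigma2)
```

Separating the two covariate vectors in the signature mirrors the model: a covariate may enter through `x2`, through `x1`, or through both.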
Cure models suffer from an inherent identifiability problem due to right-
censored observations. The event under study has not occurred at the end of
the follow-up either because the person is nonsusceptible or because the person
is susceptible, but follow-up was not long enough to observe the event. A key
component of identifiability is the presence of covariates in both components of
the mixture model. In fact, Li et al. (2001) show that, if the baseline hazard
is nonparametric and independent of covariates, and if the cure fraction is
\[
\beta = (1 + \sigma^2)\beta^*, \tag{3.65}
\]
where $\beta$, $\beta^*$ denote the unbiased and biased parameters, respectively, in the
Cox model, assuming that the covariates are centered. Consequently, the
bias due to omitted covariates becomes greater with increasing heterogeneity
(frailty variance), and relation (3.65) could be used to compensate for the
frailty effect.
Congdon (1995) investigates the influence of different frailty distributions
(gamma, inverse Gaussian, stable, and binary) on the analysis of cause-specific
and total mortality data from the London area during the years 1988 – 1990
by using Weibull and Gompertz baseline hazard functions.
Chapter 4
Shared Frailty Models
So far we have focused on the frailty model as a way of dealing with possible
heterogeneity due to unobserved covariates. This is the main interpretation
of frailty in the application to univariate time-to-event data. As discussed in
the last chapter, this results in selection over time; for example, this is shown
as leveling-off or crossing-over effects in population hazards. The concept of
frailty introduced by Vaupel et al. (1979) to biostatistics and by Lancaster
(1979) to the econometric literature originates from this kind of model.
Another, completely different, aspect of this approach is to use the frailty
term to model associations between event times, which goes back to the work
of Clayton (1978). Implicitly, most of the statistical models and methods for
failure time data, and here especially the Cox proportional hazards model,
were developed under the assumption that the observations from different
subjects are statistically independent of each other. While this is sensible in
many applications, it has become obvious that this assumption does not hold
in certain situations that are more common than originally thought.
In the following section the two main approaches in multivariate survival
analysis – marginal and frailty models – are outlined and compared, followed
by a description and discussion of the shared frailty model in the subsequent
sections. The shared frailty model is a mixture model because the common
risk in each cluster (the frailty Z) is assumed to be random. The model
assumes that all event times in a cluster are independent given the frailty
variables. In other words, it is a conditional independence model where the
frailty is common to all individuals in a cluster and therefore responsible for
creating dependence between event times. This is the reason for the concept
of shared frailty. A shared frailty model can be considered as a mixed (random
effects) model in survival analysis with group variation (frailty) and individual
variation described by the hazard function. In contrast, mixed models show
a more symmetric handling of these two sources of variation. Because of the
censored observations the Cox model and the frailty models do not belong
to the class of generalized linear mixed models. It is assumed that there is
independence between the observations from different clusters. If the variation
of the frailty variable is zero, this implies independence between event times
in the clusters; otherwise, there is positive dependence between event times.
A more detailed presentation of shared frailty models can be found in the
excellent book by Duchateau and Janssen (2008).
• Family studies are critical in assessing the role which genetics play in the
disease process. Statistical methods such as variance component models
and path analysis have been developed and adopted to analyze family
data with quantitative traits such as cholesterol. Variance component
models aim to measure the extent to which the total variation in the
quantitative trait is due to a correlation between relatives, as well as the
degree of correlation among full siblings and other relatives. When the
considered trait is onset age of a disease, conventional methods are not
appropriate. This is mainly due to censoring/truncation of this variable
but also to the need for a measure of within-family correlation that
incorporates time, a feature not shared by conventional measures such
as the correlation coefficient.
These three examples share an important feature, namely, that the failure
times for observations from the same cluster correlate with one another. In
examples one and three the cluster is an individual; in example two it is a
family. In these examples, the cluster sizes are small relative to the number
of clusters. These three examples, however, differ from one another in terms
of their scientific objectives. The main objective of examples one and three
is to examine the effectiveness of a new treatment, which presumably can be
characterized through regression modeling. The within-cluster correlation in
these examples is usually of secondary interest, although ignoring it could lead
to erroneous conclusions. In example two, the within-family association is
of primary interest, although regression adjustment for each related subject
is critical in order to minimize the potential that the observed association is
mainly due to environmental factors shared by family members. Furthermore,
the mechanisms behind within-cluster associations may vary so that different
statistical models to describe the associations may be needed. It is clear that
the mechanism leading to the correlation between two fellow eyes is different
from that attributed to the correlation of observations measured over time
from the same eye.
Another typical example of clustered event times is provided by multicenter
clinical trials with an event-time outcome. Here the treatment centers are the
clusters. Event times of patients in a cluster are assumed to be correlated
because of center-specific differences in treatment, care, and diagnosis of the
patients, or in the composition of the center-specific study populations. The
lung cancer data from the Halluca study provide such an example. Depending on
the prevalence of the disorder of interest, there are sometimes many centers
with only a few patients (rare diseases) or a few centers with a large number of
patients each (common diseases).
Available statistical models fall into two broad classes – marginal and frailty
models (Wei and Glidden 1997). Marginal methods specify models for the
effect of covariates on the hazards of the individuals (the margins) under the
working independence assumption. That is, for the point estimates of the
parameters, independence between the event times in a cluster is assumed.
To take into account the fact that observed event times are correlated, the
variance estimates need to be adjusted but without the need for explicitly
modeling the correlation (Wei et al. 1989, Lee et al. 1992, Cai and Prentice
1995). The association between the events is considered a nuisance parameter.
The marginal baseline hazards can be modeled differently (Wei et al. 1989)
or with a common functional form (Lee et al. 1992). As with the analysis
of longitudinal data, regression parameters are estimated from generalized
estimating equations, and the corresponding variance–covariance estimators
are corrected properly to account for the dependence structure. Yu and Peng
(2008) suggested a marginal model with cure fraction where both the margins
as well as the cure fraction may depend on observed covariates. An excellent
and detailed review of the robust and well-developed marginal approach is
given in Lin (1994).
The other commonly used and general approach to the problem of modeling
multivariate data is the specification of independence among observed data
items conditional on a set of unobserved or latent variables (random effects).
A multivariate model for the observed data is then induced by averaging over
an assumed distribution for the latent variables. The dependence structure
in the multivariate model arises when dependent latent variables enter into
the conditional models for multiple observed data items, and the dependence
parameters may often be interpreted as variance components. Frailty models
for multivariate survival data are derived under a conditional independence
assumption by specifying latent variables that act multiplicatively on the
baseline hazard function. Different assumptions about the distribution of
the random effects can create different dependence structures in the cluster
as discussed in detail for the shared frailty model in the book by Duchateau
and Janssen (2008). This popular approach with its strong link to the field of
generalized linear mixed models is one of the building blocks for the present
monograph.
Two main differences between the two approaches should be kept in mind.
First, the interpretation of regression parameters in both models is different.
In the marginal model, the parameters describe the population level relative
risk, whereas in the frailty model, the parameters have a cluster-level relative
risk interpretation. That is, in the latter approach, the hazard ratio refers to
comparisons within clusters where individuals share the same frailty. In the
marginal approach, the hazard ratio refers to comparisons between individuals
randomly drawn from the total study population, independent of which cluster
the individuals belong to. As such, the estimates are not expected to be the
same in marginal and frailty models since they estimate different quantities,
unless the within-cluster correlation is zero (meaning that clustering does not
play any role) and both models comply with the Cox model for independent
data.
Second, the marginal approach, although effective, can predict only marginal
survival probabilities of single individuals. In contrast, in the frailty approach
it is possible to perform joint prediction of survival for individuals from the
same cluster. Frailty models also allow the prediction of survival based on the
current status of the other individuals in the cluster. This kind of analysis
is of special interest in small clusters, for example, in event times of sibships.
Practical comparisons of both methods show that the confidence intervals for
regression parameter estimates are smaller in frailty models than
in the marginal approach (Chuang and Cai 2006). This is based on the fact
that the frailty approach makes stronger model assumptions by specifying the
correlation structure.
An interesting approach occupying an intermediate position between the
two aforementioned models, which allows estimation of both regression
coefficients with the traditional interpretation and correlations between event
times, is described by Mahé and Chevret (1999).
\[
\mu(t|X_{ij}, Z_i) = Z_i \mu_0(t) e^{\beta' X_{ij}},
\]
where µ0 (t) denotes the baseline hazard function, and β is a vector of fixed-
effect parameters to be estimated. The frailties Zi (i = 1, . . . , n) are assumed
to be independently and identically distributed random variables with density
function f (z). The frailty density depends on unknown parameters to be
estimated. Similar to the univariate case a semiparametric shared frailty
model is one with a nonparametric baseline hazard µ0 (t). Furthermore,
please note that, with the foregoing notation, the case of event times from
clustered individuals as well as the case of recurrent event times in individuals
is covered. However, in the analysis of recurrent event-time data, there may be
specific information about the ordering of the events, and the time scale used
needs careful consideration. For specific aspects with shared frailty models in
recurrent event times, see Chapter 9 of Hougaard (2000) and Duchateau et
al. (2003).
The main assumption of a shared frailty model is that all individuals in
cluster i share the same value of frailty Zi (i = 1, . . . , n), and this is why
the model is called the shared frailty model. The lifetimes are assumed to be
conditionally independent with respect to the shared (common) frailty. This
shared frailty is the cause of dependence between lifetimes within the clusters.
We derive in the following paragraphs the quantities based on this conditional
formulation. Independence of the lifetimes within the clusters corresponds to
a degenerate frailty distribution (no variability in Zi ). In all other cases, the
dependence is positive. It is assumed that there is independence between event
times from different clusters. If condition P(Zi > 0) = 1 holds, the shared
frailty model leads to absolutely continuous distributions and thus cannot model
dependence due to common events. Consequently, it is not appropriate for
event-related dependence (shock models) because an event in one individual
is not relevant to the partner; it only changes the information available on
the frailty. One exception to this general assumption is the shared compound
Poisson frailty model, which does not fulfill the condition P(Zi > 0) = 1. In
this specific model, it is possible that all individuals in a cluster share the
frailty value zero (with positive probability) and are nonsusceptible to the
event of interest (cure model).
Using an argument similar to that in equation (3.5), we can derive the
joint conditional multivariate survival function for the individuals in the ith
cluster. Conditional on frailty Zi which is shared by all individuals in cluster
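i, it holds that

\[
S(t_{i1}, \ldots, t_{in_i} \mid X_i, Z_i) = \prod_{j=1}^{n_i} S(t_{ij} \mid X_{ij}, Z_i)
= \exp\Big(-Z_i \sum_{j=1}^{n_i} M_0(t_{ij})\, e^{\beta' X_{ij}}\Big), \tag{4.1}
\]

where $M_0(t) = \int_0^t \mu_0(s)\,ds$ denotes the cumulative baseline hazard function
and $X_i = (X_{i1}, \ldots, X_{in_i})$ is the covariate matrix of the individuals in the ith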
cluster. This is the starting point to derive the unconditional joint survival
function. Averaging expression (4.1) with respect to the frailty Zi gives the
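marginal survival function:

\[
S(t_{i1}, \ldots, t_{in_i} \mid X_i) = L\Big(\sum_{j=1}^{n_i} M_0(t_{ij})\, e^{\beta' X_{ij}}\Big),
\]

where L denotes the Laplace transform of the frailty variable. Thus, the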
multivariate survival function is expressed as the Laplace transform of the
frailty distribution, evaluated at the cumulative baseline hazard. The joint
survival function for all event-time data is now the product of the survival
functions of all the clusters because of the assumption about independence
between clusters

\[
S(t_{11}, \ldots, t_{n n_n} \mid X_1, \ldots, X_n) = \prod_{i=1}^{n} L\Big(\sum_{j=1}^{n_i} M_0(t_{ij})\, e^{\beta' X_{ij}}\Big).
\]
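
\[
\begin{aligned}
S(t_{i1}, \ldots, t_{in_i} \mid X_i)
&= L\Big(\sum_{j=1}^{n_i} M_0(t_{ij})\, e^{\beta' X_{ij}}\Big) \\
&= \Big(1 + \sigma^2 \sum_{j=1}^{n_i} M_0(t_{ij})\, e^{\beta' X_{ij}}\Big)^{-1/\sigma^2} \\
&= \Big(\sum_{j=1}^{n_i} S(t_{ij} \mid X_{ij})^{-\sigma^2} - (n_i - 1)\Big)^{-1/\sigma^2},
\end{aligned}
\tag{4.2}
\]

\[
S(t_1, t_2) = \Big(S_1(t_1)^{-\sigma^2} + S_2(t_2)^{-\sigma^2} - 1\Big)^{-1/\sigma^2}, \tag{4.3}
\]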
where we drop the cluster index i in the following for ease of presentation
and assume a single binary covariate X representing two strata with
stratum-specific baseline hazard functions. This results in two
different marginal survival functions for the partners in each pair. In this
situation it holds that S(t|X) = S1 (t) if X = 1 (stratum 1) and S(t|X) = S2 (t)
if X = 2 (stratum 2).
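
Relation (4.3) can be evaluated directly once the two marginal survival probabilities are known. The following Python lines are a minimal sketch, not part of the original text; the function name and the treatment of a zero variance as independence are illustrative assumptions.

```python
def shared_gamma_joint_survival(s1: float, s2: float, sigma2: float) -> float:
    """Bivariate survival (4.3) from marginal survival probabilities s1, s2.

    sigma2 is the frailty variance. The name of this helper is hypothetical.
    """
    if sigma2 <= 0.0:
        # Degenerate frailty (no variability): lifetimes are independent.
        return s1 * s2
    return (s1 ** (-sigma2) + s2 ** (-sigma2) - 1.0) ** (-1.0 / sigma2)
```

For any positive frailty variance the joint survival probability is at least the product of the marginals, which reflects the positive dependence discussed above.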
The correlations between lifetimes of randomly selected pairs are always
the same in the shared frailty model, implying a symmetric situation. This
symmetry (which is equivalent to the compound symmetry assumption in
linear mixed models) makes the model less useful for modeling correlations
in family studies with groups of different relatives (mother–father, mother–
daughter, grandfather–son, brother–brother, etc.), which are, for example, of
special interest in genetic studies. In the following we will distinguish between
the parametric and the semiparametric shared gamma frailty models.
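
\[
L(\beta, \theta, \sigma^2) = \prod_{i=1}^{n} \int_0^\infty \prod_{j=1}^{n_i}
\Big(z_i\, \mu_0(t_{ij}; \theta)\, e^{\beta' X_{ij}}\Big)^{\delta_{ij}}
e^{-z_i M_0(t_{ij}; \theta)\, e^{\beta' X_{ij}}}\, f(z_i; \sigma^2)\, dz_i
\]

with

\[
f(z_i; \sigma^2) = \frac{z_i^{1/\sigma^2 - 1}\, e^{-z_i/\sigma^2}}{\sigma^{2/\sigma^2}\, \Gamma(1/\sigma^2)}
\]

denoting the density function of a gamma distribution with expectation one
and variance σ². A closed form solution to this integral exists. Using the
simplification

\[
y_i = 1/\sigma^2 + \sum_{j=1}^{n_i} M_0(t_{ij}; \theta)\, e^{\beta' X_{ij}}
\]
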
the foregoing likelihood can be written in the form
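
\[
L(\beta, \theta, \sigma^2) = \prod_{i=1}^{n}
\frac{\prod_{j=1}^{n_i} \big(\mu_0(t_{ij}; \theta)\, e^{\beta' X_{ij}}\big)^{\delta_{ij}}}
{y_i^{1/\sigma^2 + d_i}\, \sigma^{2/\sigma^2}\, \Gamma(1/\sigma^2)}
\int_0^\infty (y_i z_i)^{1/\sigma^2 + d_i - 1}\, e^{-y_i z_i}\, y_i\, dz_i,
\]
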
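where the expression $d_i = \sum_{j=1}^{n_i} \delta_{ij}$ denotes the number of observed events in
cluster i (i = 1, . . . , n). Now the integral can be solved giving

\[
L(\beta, \theta, \sigma^2) = \prod_{i=1}^{n}
\frac{\Gamma(1/\sigma^2 + d_i) \prod_{j=1}^{n_i} \big(\mu_0(t_{ij}; \theta)\, e^{\beta' X_{ij}}\big)^{\delta_{ij}}}
{\Big(1/\sigma^2 + \sum_{j=1}^{n_i} M_0(t_{ij}; \theta)\, e^{\beta' X_{ij}}\Big)^{1/\sigma^2 + d_i} \sigma^{2/\sigma^2}\, \Gamma(1/\sigma^2)}.
\]

Taking logarithms yields the log-likelihood

\[
\begin{aligned}
\log L(\beta, \theta, \sigma^2) = \sum_{i=1}^{n} \Big[\,
& d_i \log \sigma^2 + \log \Gamma(1/\sigma^2 + d_i) - \log \Gamma(1/\sigma^2) \\
& - (1/\sigma^2 + d_i) \log\Big(1 + \sigma^2 \sum_{j=1}^{n_i} M_0(t_{ij}; \theta)\, e^{\beta' X_{ij}}\Big) \\
& + \sum_{j=1}^{n_i} \delta_{ij} \big(\beta' X_{ij} + \log \mu_0(t_{ij}; \theta)\big) \Big].
\end{aligned}
\tag{4.4}
\]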
In the parametric shared gamma frailty model with observed covariates, the
unobserved frailty Zi (i = 1, . . . , n) in each cluster can be estimated (Nielsen
et al. 1992) by the expression

\[
\hat{Z}_i = \frac{1/\hat{\sigma}^2 + \sum_{j=1}^{n_i} \delta_{ij}}
{1/\hat{\sigma}^2 + \sum_{j=1}^{n_i} M_0(t_{ij}; \hat{\theta})\, e^{\hat{\beta}' X_{ij}}}. \tag{4.5}
\]

Here σ̂ 2 is the estimate of the frailty variance, θ̂ the vector of the estimated
parameters of the cumulative baseline hazard function, and β̂ the vector of
estimated regression coefficients. This is a natural extension of formula (3.26)
in the univariate case.
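
Expression (4.5) is a simple ratio per cluster. A minimal Python sketch follows, assuming that the estimated cumulative hazards M0(tij; θ̂)exp(β̂′Xij) have already been computed for each observation; the function and argument names are hypothetical.

```python
import numpy as np


def predict_frailty(delta, cum_hazard, cluster, sigma2):
    """Cluster-level frailty estimates following (4.5).

    cum_hazard holds M0(t_ij) * exp(beta' X_ij) evaluated at the estimates.
    Returns a dict mapping each cluster label to its estimated frailty.
    """
    delta = np.asarray(delta, dtype=float)
    cum_hazard = np.asarray(cum_hazard, dtype=float)
    labels, idx = np.unique(np.asarray(cluster), return_inverse=True)
    d = np.bincount(idx, weights=delta)        # observed events per cluster
    H = np.bincount(idx, weights=cum_hazard)   # summed cumulative hazards
    z_hat = (1.0 / sigma2 + d) / (1.0 / sigma2 + H)
    return dict(zip(labels.tolist(), z_hat.tolist()))
```

Clusters with more events than expected under the fitted model receive estimates above one, clusters with fewer events receive estimates below one.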
Example 4.1
We consider the Halluca study from Example 1.3 and apply three different
parametric shared gamma frailty models with exponential, Gompertz, and
Weibull baseline hazards (Table 4.1). It turns out that the exponential hazard
function is not flexible enough: the Gompertz and Weibull models show a
significantly better fit to the data with respect to the likelihood ratio test (the
exponential model is nested in the Gompertz as well as the Weibull model).
The Weibull model shows a slightly better fit compared to the Gompertz
model, but parameter estimates and standard errors are similar in both models
and also close to the values obtained in the semiparametric shared gamma
frailty model (Table 4.2) considered in the next section.
Table 4.1: Parametric shared gamma frailty models of the
Halluca lung cancer data
parameter Weibull exponential Gompertz
In all three models we see an increase in risk with higher age, ECOG status 3 or
4 (compared to ECOG status 0 – 2), and higher stage of the disease (reference
group is stage I). Females and patients with a non-small cell lung carcinoma
experience a lower risk. The estimates of the regression and frailty parameters
depend on the parametric assumption regarding the baseline hazard. This
limits the applicability of the parametric shared gamma frailty model because
more detailed investigations about the shape of the baseline hazard function
are necessary. In most practical cases, there is no additional information
about the form of the baseline hazard function, which makes the parametric
approach questionable. Analysis is performed using SAS PROC NLMIXED
based on the suggestion by Liu and Yu (2008).
Here the difference from relation (4.5) lies in the expression M̂0 , which is now
a nonparametric estimator of the cumulative baseline hazard (for example,
the Nelson–Aalen estimator). This approach was used by Carvalho et al.
(2003) to investigate the quality of dialysis centers in Brazil represented
by the unobserved frailty values assigned to each center. This underlines
the possibility of using frailty models to rank treatment centers or other
institutions with respect to their estimated frailties.
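
As a reminder of what such a nonparametric estimator looks like, here is a minimal Python sketch of the plain Nelson–Aalen estimator. It is a simplification for illustration: it ignores covariates and frailties, whereas in the semiparametric frailty model the risk set contributions would be weighted accordingly. The function name is hypothetical.

```python
import numpy as np


def nelson_aalen(times, events):
    """Nelson-Aalen estimate of the cumulative hazard.

    times: observation times; events: 1 = event observed, 0 = censored.
    Returns the distinct event times and the estimate at those times.
    """
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=int)
    event_times = np.unique(times[events == 1])
    cumulative = 0.0
    estimate = []
    for s in event_times:
        at_risk = np.sum(times >= s)                     # still under observation
        n_events = np.sum((times == s) & (events == 1))  # events at time s
        cumulative += n_events / at_risk
        estimate.append(cumulative)
    return event_times, np.array(estimate)
```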
Example 4.2
We consider the Halluca data again and apply the semiparametric shared
gamma frailty model. Patients in the study are clustered by the center where
the lung carcinoma was diagnosed. Altogether, there were 53 diagnosing
centers representing the clusters. A univariate and a shared gamma frailty
model are applied to the data in Table 4.2, where in the univariate approach
the clustering of the data is not taken into account.
Table 4.2: Univariate and shared gamma frailty
model of the Halluca lung cancer data
parameter univariate frailty shared frailty
In general, the effect of the covariates is weakened (an exception from this
general rule is the covariate type) when applying the shared gamma frailty
model instead of the univariate gamma frailty model. Please note that the
interpretation of the parameter estimates is different in both models. In the
univariate model the estimates of the regression parameters are adjusted for
the unobserved heterogeneity in the population and are conditional on
the same frailty. The frailty variance is interpreted as a measure of unobserved
heterogeneity in the study population. In the shared gamma frailty model
the parameters are adjusted for the correlation in the clusters, and the frailty
variance is interpreted as a measure of the correlation between the lifetimes in
the clusters. Again, the parameter estimates are conditional given the same
frailty (or, which is equivalent in the shared frailty model, conditional on
being from the same cluster). It is important to note these subtle differences
between the models. Unfortunately, in the literature about frailty models,
sometimes this difference is overlooked and a clear distinction between these
two classes of models is missed, leading to wrong interpretations of the results.
In both models, estimates are obtained by means of the coxph function
of the statistical package R, which does not provide estimates for the standard
error of the frailty variance.
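
\[
\begin{aligned}
L_i(\theta) ={}& \delta_1 \delta_2 \big(1 - S_1(t_1; \theta) - S_2(t_2; \theta) + S(t_1, t_2; \theta)\big) \\
& + \delta_1 (1 - \delta_2) \big(S_2(t_2; \theta) - S(t_1, t_2; \theta)\big) \\
& + (1 - \delta_1) \delta_2 \big(S_1(t_1; \theta) - S(t_1, t_2; \theta)\big) \\
& + (1 - \delta_1)(1 - \delta_2)\, S(t_1, t_2; \theta)
\end{aligned}
\]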
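
The four terms of this likelihood contribution are the probabilities of the four possible status combinations of a pair at the monitoring times. A minimal Python sketch, with a hypothetical function name and with the survival probabilities supplied by the caller (for example from relation (4.3)):

```python
def current_status_contribution(delta1, delta2, s1, s2, s12):
    """Likelihood contribution of one pair with bivariate current status data.

    delta1, delta2: 1 if the event has occurred by the monitoring time, else 0
    s1, s2: marginal survival probabilities at the monitoring times
    s12: joint survival probability at the two monitoring times
    """
    return (delta1 * delta2 * (1.0 - s1 - s2 + s12)
            + delta1 * (1 - delta2) * (s2 - s12)
            + (1 - delta1) * delta2 * (s1 - s12)
            + (1 - delta1) * (1 - delta2) * s12)
```

Summing the contribution over the four status combinations gives one, which is a quick consistency check on the signs of the terms.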
performed based on the bivariate shared gamma frailty model under different
censoring assumptions.
Data are generated following the bivariate shared gamma frailty model with
frailty variance V(Z) = σ 2 = 1, expectation EZ = 1, and Gompertz baseline.
One thousand samples of sample size 1000 are generated. We consider two
scenarios as follows. In the first scenario, the censoring/monitoring times
are assumed to be independent. Consequently, current status data contain
information about the two-dimensional distribution of the event times. In the
second scenario, both censoring/monitoring times are assumed to be equal.
This is a typical situation in current status data, where blood samples are
used to test for the presence of different infections.
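
Data of this kind can be generated by inverting the conditional survival function. The Python sketch below is an illustration only: it assumes the Gompertz parameterization µ0(t) = λe^(ϕt), so that M0(t) = (λ/ϕ)(e^(ϕt) − 1), and all function names and parameter values are hypothetical rather than those of the simulation study.

```python
import numpy as np


def simulate_shared_gamma_pairs(n_pairs, sigma2, lam, phi, rng):
    """Event times for pairs sharing a gamma frailty (mean 1, variance sigma2).

    Given Z, S(t | Z) = exp(-Z * M0(t)), so T solves M0(T) = E / Z with E ~ Exp(1).
    """
    z = rng.gamma(shape=1.0 / sigma2, scale=sigma2, size=n_pairs)
    e = rng.exponential(size=(n_pairs, 2))
    m = e / z[:, None]                       # required cumulative baseline hazard
    return np.log1p(phi * m / lam) / phi     # inverse of the Gompertz M0


def current_status(t, c):
    """Current status indicator: 1 if the event occurred by monitoring time c."""
    return (t <= c).astype(int)
```

For the second scenario a single monitoring time per pair would be used for both partners, for example `current_status(t, c[:, None])` with one value of `c` per pair; for the first scenario `c` would have the same shape as `t` with independently drawn entries.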
Table 4.4: Simulation (Scenario 2) of the shared
gamma frailty model with current status data
parameter true current status right censored
Example 4.3
The hepatitis A and B current status data from Example 1.7 are analyzed in a
bivariate model. In the second column of Table 4.5, the results from a shared
gamma frailty model, and in the third column from the univariate frailty
model (compare Table 3.9), are presented. The univariate frailty models treat
Example 4.4
Analogous to the last section with gamma-distributed frailty, the Halluca
data set is analyzed, starting with parametric frailty models based on three
different baseline hazard functions.
Example 4.5
We perform a semiparametric analysis similar to Example 4.2 but with log-
normal-distributed frailty. The results given in Table 4.7 are very close to the
estimates obtained in the semiparametric shared gamma frailty model. This
underlines the robustness of the method against misspecification of the
The shared log-normal model was applied by McGilchrist and Aisbett (1991),
McGilchrist (1993), Gustafson (1997), and Bellamy et al. (2004). In the latter
paper, the case of interval-censored data is considered with underlying Weibull
baseline hazard function. The shared log-normal frailty model was extended
to allow for heterogeneity in the frailty distribution (dispersed frailty) by Lee
and Lee (2003).
This model allows for a stronger association between the lifetimes of the twins
compared to the association between a twin and the third sibling in the family.
This kind of nonsymmetric correlation structure is considered in more detail
in Chapter 5.
A shared frailty model with log-skew-t distribution of the frailty (including
the log-normal distribution along with many other heavy-tailed distributions,
such as log-Cauchy or log t-distribution) is considered by Sahu and Dey (2004).
A shared gamma frailty model which is based on a Box–Cox transformation
is investigated by Yin and Ibrahim (2005).
A completely different approach to bivariate frailty modeling was used by
Bandyopadhyay and Basu (1990), and Gupta and Gupta (1990). The key
idea of their (more specific) model is a bivariate hazard model in the form
Then, the uncensored data are given by (T11 , T12 ), (T21 , T22 ), . . . , (Tn1 , Tn2 ),
where the first index denotes the cluster (patient), and the second index
denotes the eye. Here τ is defined as

\[
\tau = E\big[\operatorname{sign}\{(T_{i1} - T_{j1})(T_{i2} - T_{j2})\}\big], \tag{4.9}
\]

with sign{t} equal to -1 for negative values, 1 for positive values, and 0 if t = 0
(i, j = 1, . . . , n; i ≠ j). Another formula is based on the Laplace transform:

\[
\tau = 4 \int s L''(s) L(s)\, ds - 1. \tag{4.10}
\]

To keep the notation simple, here and in the following all integrals are defined
over the interval [0, ∞) because lifetimes as well as frailties are nonnegative
variables. In the next paragraph we will derive relation (4.10). Starting from
definition (4.9), the conditional probability of Ti1 < Tj1 (i, j = 1, . . . , n; i ≠ j)
given the frailty vector (Zi , Zj ) is calculated as

\[
\begin{aligned}
P(T_{i1} < T_{j1} \mid Z_i, Z_j)
&= \frac{\iint_{t_i < t_j} f(t_i, t_j, Z_i, Z_j)\, dt_i\, dt_j}{f(Z_i, Z_j)} \\
&= \frac{\iint_{t_i < t_j} f(t_i, Z_i) f(t_j, Z_j)\, dt_i\, dt_j}{f(Z_i) f(Z_j)} \\
&= \frac{\int f(t, Z_i) S(t, Z_j)\, dt}{f(Z_i) f(Z_j)} \\
&= \int f(t \mid Z_i) S(t \mid Z_j)\, dt \\
&= \int \mu(t \mid Z_i) S(t \mid Z_i) S(t \mid Z_j)\, dt \\
&= \int Z_i\, \mu_0(t)\, e^{-(Z_i + Z_j) M_0(t)}\, dt \\
&= \frac{Z_i}{Z_i + Z_j}.
\end{aligned}
\tag{4.11}
\]

For the last relation it is necessary that M0 (t) be a nondegenerate cumulative
hazard converging toward infinity as time approaches infinity. Hence, the
difference of the event times Ti1 − Tj1 is negative with probability Zi /(Zi + Zj )
and positive with probability Zj /(Zi + Zj ) (conditional on (Zi , Zj )). Because the
function sign{·} can take on only the two values minus and plus one (because
of the condition i ≠ j zero can only occur with probability zero), it holds that

\[
E\big(\operatorname{sign}\{T_{i1} - T_{j1}\} \mid Z_i, Z_j\big) = \frac{Z_j - Z_i}{Z_i + Z_j},
\]

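
\[
E\big(\operatorname{sign}\{T_{i2} - T_{j2}\} \mid Z_i, Z_j\big) = \frac{Z_j - Z_i}{Z_i + Z_j}.
\]

By conditional independence of the event times given the frailties, it follows that

\[
\begin{aligned}
\tau &= E\big[\operatorname{sign}\{(T_{i1} - T_{j1})(T_{i2} - T_{j2})\}\big] \\
&= E\Big[E\big(\operatorname{sign}\{T_{i1} - T_{j1}\} \mid Z_i, Z_j\big)\,
E\big(\operatorname{sign}\{T_{i2} - T_{j2}\} \mid Z_i, Z_j\big)\Big] \\
&= E\Big[\Big(\frac{Z_j - Z_i}{Z_i + Z_j}\Big)^2\Big].
\end{aligned}
\]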
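
Both relation (4.11) and the last expression for τ can be checked by simulation. The Python sketch below is an illustration under simplifying assumptions: a unit exponential baseline (M0(t) = t) is used for (4.11), the frailties are drawn from a gamma distribution with mean one, and the function names are hypothetical. The comparison value σ²/(σ² + 2) for gamma frailty is the one derived later in this section.

```python
import numpy as np


def prob_first_before_second(z_i, z_j, n_draws, rng):
    """Monte Carlo estimate of P(T_i < T_j | Z_i, Z_j) for M0(t) = t.

    With a unit exponential baseline, T | Z is exponential with rate Z.
    """
    t_i = rng.exponential(scale=1.0 / z_i, size=n_draws)
    t_j = rng.exponential(scale=1.0 / z_j, size=n_draws)
    return float(np.mean(t_i < t_j))


def tau_from_frailty_draws(z_i, z_j):
    """Monte Carlo version of tau = E[((Z_j - Z_i) / (Z_i + Z_j))^2]."""
    z_i = np.asarray(z_i, dtype=float)
    z_j = np.asarray(z_j, dtype=float)
    return float(np.mean(((z_j - z_i) / (z_i + z_j)) ** 2))
```

In the second function the two arguments are independent draws of the frailties of two different clusters, as required by the definition of τ.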
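In the following we consider the integral $\int_0^\infty s e^{-sx}\, ds$. Using integration by
parts it holds that

\[
\begin{aligned}
\int_0^\infty s e^{-sx}\, ds
&= \Big[-\frac{s}{x} e^{-sx}\Big]_0^\infty + \frac{1}{x} \int e^{-sx}\, ds \\
&= \frac{1}{x} \int e^{-sx}\, ds \\
&= \Big[-\frac{1}{x^2} e^{-sx}\Big]_0^\infty \\
&= \frac{1}{x^2}.
\end{aligned}
\]

Applying this relation yields

\[
\tau = E\Big[\Big(\frac{Z_j - Z_i}{Z_i + Z_j}\Big)^2\Big]
= \iiint s e^{-s(z_i + z_j)} (z_j - z_i)^2 f(z_i) f(z_j)\, dz_j\, dz_i\, ds.
\]

Using

\[
\begin{aligned}
\int e^{-s z_j} (z_j - z_i)^2 f(z_j)\, dz_j
={}& \int z_j^2 e^{-s z_j} f(z_j)\, dz_j \\
& - \int 2 z_i z_j e^{-s z_j} f(z_j)\, dz_j \\
& + \int z_i^2 e^{-s z_j} f(z_j)\, dz_j
\end{aligned}
\]
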
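we obtain (recall that $L(s) = \int e^{-sz} f(z)\, dz$, $L'(s) = -\int z e^{-sz} f(z)\, dz$, and
$L''(s) = \int z^2 e^{-sz} f(z)\, dz$)

\[
\begin{aligned}
\tau &= \iiint s e^{-s(z_i + z_j)} (z_j - z_i)^2 f(z_i) f(z_j)\, dz_j\, dz_i\, ds \\
&= \iint s L''(s) e^{-s z_i} f(z_i)\, dz_i\, ds
 + 2 \iint s L'(s) z_i e^{-s z_i} f(z_i)\, dz_i\, ds
 + \iint s L(s) z_i^2 e^{-s z_i} f(z_i)\, dz_i\, ds \\
&= \int s L''(s) L(s)\, ds - 2 \int s (L'(s))^2\, ds + \int s L(s) L''(s)\, ds \\
&= 2 \int s L''(s) L(s)\, ds - 2 \int s (L'(s))^2\, ds.
\end{aligned}
\tag{4.12}
\]
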
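Integration by parts gives

\[
\begin{aligned}
\int s L''(s) L(s)\, ds
&= \Big[L'(s)\, s L(s)\Big]_0^\infty - \int L'(s) \big(L(s) + s L'(s)\big)\, ds \\
&= -\int s (L'(s))^2\, ds - \int L'(s) L(s)\, ds.
\end{aligned}
\tag{4.13}
\]

Similarly,

\[
\int L'(s) L(s)\, ds = \Big[(L(s))^2\Big]_0^\infty - \int L'(s) L(s)\, ds,
\]

resulting in

\[
2 \int L'(s) L(s)\, ds = -1. \tag{4.14}
\]

Combining (4.13) and (4.14) yields

\[
\int s (L'(s))^2\, ds = -\int s L''(s) L(s)\, ds + \frac{1}{2},
\]

and inserting this into (4.12) gives relation (4.10).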
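
\[
\begin{aligned}
\int s L''(s) L(s)\, ds
&= (1 + \sigma^2) \int s (1 + \sigma^2 s)^{-\frac{2}{\sigma^2} - 2}\, ds \\
&= \frac{1 + \sigma^2}{2 + \sigma^2} \int (1 + \sigma^2 s)^{-\frac{2}{\sigma^2} - 1}\, ds \\
&= \frac{1 + \sigma^2}{-2(2 + \sigma^2)} \Big[(1 + \sigma^2 s)^{-\frac{2}{\sigma^2}}\Big]_0^\infty \\
&= \frac{1 + \sigma^2}{2(2 + \sigma^2)}.
\end{aligned}
\]

Consequently,

\[
\tau = 4 \int s L''(s) L(s)\, ds - 1 = \frac{\sigma^2}{\sigma^2 + 2}.
\]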
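
The closed form can be cross-checked numerically against relation (4.10). The following Python sketch is illustrative only; it assumes the gamma Laplace transform L(s) = (1 + σ²s)^(−1/σ²) from (4.2), and the function name is hypothetical.

```python
import numpy as np
from scipy.integrate import quad


def kendall_tau_gamma_numeric(sigma2):
    """Evaluate tau = 4 * int s L''(s) L(s) ds - 1 for gamma frailty by quadrature."""
    def integrand(s):
        lap = (1.0 + sigma2 * s) ** (-1.0 / sigma2)                        # L(s)
        lap2 = (1.0 + sigma2) * (1.0 + sigma2 * s) ** (-1.0 / sigma2 - 2)  # L''(s)
        return s * lap2 * lap

    value, _ = quad(integrand, 0.0, np.inf)
    return 4.0 * value - 1.0
```

For moderate values of σ² the numerical result agrees with σ²/(σ² + 2); for very large variances the integrand decays slowly and the quadrature becomes less reliable.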
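For the shared positive stable frailty model, Kendall's τ can also be easily
calculated using (3.41):

\[
\int s L''(s) L(s)\, ds = \int \Big(-\gamma(\gamma - 1) s^{\gamma - 1} e^{-2 s^\gamma}
+ \gamma^2 s^{2\gamma - 1} e^{-2 s^\gamma}\Big)\, ds. \tag{4.15}
\]

For the first expression of (4.15) it holds that

\[
\int -\gamma(\gamma - 1) s^{\gamma - 1} e^{-2 s^\gamma}\, ds = -\frac{\gamma - 1}{2},
\]

and using integration by parts it holds for the second expression of (4.15) that

\[
\int \gamma^2 s^{2\gamma - 1} e^{-2 s^\gamma}\, ds
= \Big[-\frac{1}{2} \gamma s^\gamma e^{-2 s^\gamma}\Big]_0^\infty
+ \frac{1}{2} \gamma^2 \int s^{\gamma - 1} e^{-2 s^\gamma}\, ds = \frac{\gamma}{4}.
\]

Finally,

\[
\begin{aligned}
\tau &= 4 \int s L''(s) L(s)\, ds - 1 \\
&= 4 \Big(-\frac{\gamma - 1}{2} + \frac{\gamma}{4}\Big) - 1 \\
&= 1 - \gamma.
\end{aligned}
\]
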
Unfortunately, for most other frailty distributions, Kendall’s τ does not have
such an explicit and simple form.
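
The positive stable result can be verified numerically as well. The sketch below is an illustration, not part of the original derivation: it uses the substitution u = s^γ, under which the integrand of (4.15) becomes (γu − (γ − 1))e^(−2u) and the singularity at zero disappears. The function name is hypothetical.

```python
import numpy as np
from scipy.integrate import quad


def kendall_tau_positive_stable_numeric(gamma):
    """Evaluate tau = 4 * int s L''(s) L(s) ds - 1 for L(s) = exp(-s**gamma).

    After substituting u = s**gamma, the integrand of (4.15) becomes
    (gamma * u - (gamma - 1)) * exp(-2 * u) on [0, inf).
    """
    def integrand(u):
        return (gamma * u - (gamma - 1.0)) * np.exp(-2.0 * u)

    value, _ = quad(integrand, 0.0, np.inf)
    return 4.0 * value - 1.0
```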
grandsons, and grandfathers and their sons are different. The use of other
association measures such as Kendall’s τ does not really help in this situation
because it is a function of the frailty variance.
This last, and maybe most important, limitation of shared frailty models is
a consequence of identifiability of the univariate frailty model with observed
covariates. Hence, it is an inherent feature to all shared frailty models with
a finite mean of frailty distribution. To overcome this problem, Hougaard
(1986a, 1987) suggests the shared positive stable frailty model. In this case
the univariate model with observed covariates is not identifiable because the
mean of the positive stable distribution is infinite. So one can expect more
flexibility from a shared frailty model with positive stable distribution than
from models with gamma frailty. The bivariate survival function in the shared
positive stable frailty model is given by (4.7)

\[
S(t_1, t_2) = \exp\Big\{-\Big[\big(-\ln S_1(t_1)\big)^{1/\gamma} + \big(-\ln S_2(t_2)\big)^{1/\gamma}\Big]^{\gamma}\Big\}. \tag{4.16}
\]

When single covariates Xi (i = 1, 2) are observed, and the conditional hazard
is (3.2), then the univariate survival functions are

\[
S_i(t) = \exp\big(-M_{0i}(t)^{\gamma}\, e^{\gamma \beta_i X_i}\big) = \exp\big(-M_i^*(t)\, e^{\beta_i^* X_i}\big),
\]

where $\beta_i^* = \gamma \beta_i$ and $M_i^*(t) = M_{0i}(t)^{\gamma}$. Using a shared positive stable frailty
model it is possible to estimate both association parameter γ and regression
coefficients βi from data on related individuals. One problem remains yet
unsolved: the interpretation of regression parameters βi . To illustrate this
problem, let β = β1 = β2 for two groups of relatives, for example, MZ and
DZ twins. It is clear that the association parameter γ is different for MZ and
DZ twins. Hence the values of parameter β ∗ = γβ and M ∗ (t) in equation
(4.16) are also different for MZ and DZ twins, which contradicts the natural
assumption that the survival of these individuals conditioned on observed
covariates follows the same survival model. If the parameters β ∗ = γβ are
assumed to be the same for MZ and DZ twins, then the parameters β and
the baseline hazards µ0 (t) should be different for these pairs of individuals,
which creates a problem for the interpretation of the conditional hazard (3.2)
for this model.
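
For reference, the bivariate survival function (4.16) is easy to evaluate from the marginal survival probabilities. A minimal Python sketch with a hypothetical function name; it assumes survival probabilities strictly between zero and one.

```python
import math


def positive_stable_joint_survival(s1: float, s2: float, gamma: float) -> float:
    """Bivariate survival (4.16) for marginal survival probabilities s1, s2.

    gamma in (0, 1] is the association parameter; gamma = 1 gives independence.
    """
    a = (-math.log(s1)) ** (1.0 / gamma)
    b = (-math.log(s2)) ** (1.0 / gamma)
    return math.exp(-((a + b) ** gamma))
```

Evaluating the function for two groups with different γ but identical marginals shows how only the dependence changes, which is the situation described for MZ and DZ twins above.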
The problem can be described in a more general way. The main feature of
the shared frailty approach is its symmetry. In multicenter clinical trials it
is reasonable to assume a symmetric relationship between all possible pairs
of patients in a study center because patients in a center are exchangeable.
It makes no sense to assume a nonsymmetric correlation structure inside the
centers. The situation changes dramatically when considering family studies
with relatives of different relationship. Applying a shared frailty model would
imply the same relationship (correlation) between any pairs of relatives in
the family. This is, of course, not a reasonable assumption. Furthermore, it
contradicts the assumption of different relationships in families, which is the
basis for genetic studies.
Chapter 5
Correlated Frailty Models
In the last chapter we focused on the shared frailty model as a way of modeling
associations between event times. The concept of shared frailty goes back to
Clayton (1978), and was extensively studied in books by Hougaard (2000),
and Duchateau and Janssen (2008). The shared frailty approach has proven
to be a useful and popular extension of the Cox model when observations
from subjects are not statistically independent of each other. As discussed in
Section 4.9, the shared frailty model is especially useful for clustered event
time data when the correlation structure in the clusters is symmetric and not
of special interest as, for example, in multicenter clinical trials.
Similar to the univariate and shared frailty model, the correlated frailty
model is a mixture model because frailty for each individual is assumed to be
random. This is also known as the concept of random hazards. The model is
based on the assumption that event times in a cluster are independent, given
the vector of frailties. In other words, it is a conditional independence model
where the frailty variables are correlated but not necessarily common for all
individuals in a cluster, and therefore, responsible for dependence between
event times. The shared frailty model is a special case of the correlated
frailty model with correlations between the frailties equal to one. A correlated
frailty model can be considered as a mixed (random effects) model in survival
analysis, with group and individual variation both included in the distribution
of the frailty vector. Therefore, correlated frailty models contain association
characteristics of frailty (correlation coefficients) among other parameters,
which makes them especially convenient for genetic studies of relatives with
event-time outcome. In particular, questions about the role of genes and
environment in susceptibility to diseases and death can be addressed. Here,
the correlation between frailties is of key interest. This is different from many
other applications where correlation is treated as a nuisance to be adjusted
for in the model, and not of special interest. Consequently, we will focus
the applications of the correlated frailty model mainly on twin and family
event-time data. In such applications, the researcher is usually faced
with small cluster sizes. Special focus is on the bivariate case.
It is assumed that there is independence between observations from different
clusters. If variances of the frailty variables are zero, this implies independence
between event times in the clusters. The structure of the model makes it
convenient for application of the EM algorithm, allowing for semiparametric

\[
S(t_1, t_2 \mid Z_1, Z_2) = S_1(t_1 \mid Z_1)\, S_2(t_2 \mid Z_2) = e^{-Z_1 M_{01}(t_1)}\, e^{-Z_2 M_{02}(t_2)},
\]

where Z1 and Z2 are two correlated frailties. The distribution of the random
vector (Z1 , Z2 ) needs to be specified and determines the association structure
of the event times in the model.
Consider some bivariate event times – for example, the lifetimes of twins, or
age at onset of a disease in spouses, time to blindness in the left and right eye,
or time to failure in the left and right kidney of patients. In the (bivariate)
correlated frailty model, the frailty of each individual in a pair is defined by
a measure of relative risk, that is, exactly as it was defined in the univariate
case. For two individuals in a pair, frailties are not necessarily the same, as
they are in the shared frailty model. We are assuming that the frailties are
acting multiplicatively on the baseline hazard function (proportional hazards
model) and that the observations in a pair are conditionally independent,
given the frailties. Hence, the hazard of the individual j (j = 1, 2) in pair
i (i = 1, . . . , n) has the form

\[
\mu(t \mid X_{ij}, Z_{ij}) = Z_{ij}\, \mu_{0j}(t)\, e^{\beta' X_{ij}}, \tag{5.1}
\]

\[
S(t \mid X_{ij}, Z_{ij}) = e^{-Z_{ij} M_{0j}(t)\, e^{\beta' X_{ij}}},
\]

with M0j (t) denoting the cumulative baseline hazard function. Here and
in the following, S is used as a generic symbol for a survival function. The
contribution of individual j (j = 1, 2) in pair i (i = 1, . . . , n) to the conditional
likelihood is given by

\[
\Big(Z_{ij}\, \mu_{0j}(t_{ij})\, e^{\beta' X_{ij}}\Big)^{\delta_{ij}} e^{-Z_{ij} M_{0j}(t_{ij})\, e^{\beta' X_{ij}}},
\]

where tij stands for observation time of individual j from pair i. Assuming
the conditional independence of lifespans, given the frailty, and integrating
out the frailty, we obtain the marginal likelihood function

\[
\begin{aligned}
\prod_{i=1}^{n} \iint_{R_+ \times R_+}
& \Big(z_{i1}\, \mu_{01}(t_{i1})\, e^{\beta' X_{i1}}\Big)^{\delta_{i1}} e^{-z_{i1} M_{01}(t_{i1})\, e^{\beta' X_{i1}}} \\
& \times \Big(z_{i2}\, \mu_{02}(t_{i2})\, e^{\beta' X_{i2}}\Big)^{\delta_{i2}} e^{-z_{i2} M_{02}(t_{i2})\, e^{\beta' X_{i2}}}
f(z_{i1}, z_{i2})\, dz_{i1}\, dz_{i2},
\end{aligned}
\tag{5.2}
\]
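
\[
Z_1 = \frac{\lambda_0}{\lambda_1} Y_0 + Y_1 \sim \Gamma(k_0 + k_1, \lambda_1) \tag{5.3}
\]

\[
Z_2 = \frac{\lambda_0}{\lambda_2} Y_0 + Y_2 \sim \Gamma(k_0 + k_2, \lambda_2) \tag{5.4}
\]

and $EZ_1 = EZ_2 = 1$, $V(Z_1) = \frac{1}{\lambda_1} := \sigma_1^2$, $V(Z_2) = \frac{1}{\lambda_2} := \sigma_2^2$. Then the
following relation holds

\[
\begin{aligned}
\operatorname{cov}(Z_1, Z_2)
&= \operatorname{cov}\Big(\frac{\lambda_0}{\lambda_1} Y_0 + Y_1,\; \frac{\lambda_0}{\lambda_2} Y_0 + Y_2\Big) \\
&= \frac{\lambda_0^2}{\lambda_1 \lambda_2} V(Y_0) \\
&= \frac{\lambda_0^2}{\lambda_1 \lambda_2} \frac{k_0}{\lambda_0^2} \\
&= \frac{k_0}{(k_0 + k_1)(k_0 + k_2)}
\end{aligned}
\]

\[
\rho = \frac{\operatorname{cov}(Z_1, Z_2)}{\sqrt{V(Z_1) V(Z_2)}} = \frac{k_0}{\sqrt{(k_0 + k_1)(k_0 + k_2)}}.
\]

Consequently, because of the relation $k_0 + k_i = \lambda_i = \frac{1}{\sigma_i^2}$ (i = 1, 2), it holds
that $k_0 = \frac{\rho}{\sigma_1 \sigma_2}$ and $k_i = \frac{1}{\sigma_i^2} - k_0 = \frac{1 - \frac{\sigma_i}{\sigma_j}\rho}{\sigma_i^2}$ (i, j = 1, 2; i ≠ j). Now we can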
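
\[
\begin{aligned}
S(t_1, t_2)
&= \Big(1 + \frac{1}{\lambda_0}\Big(\frac{\lambda_0}{\lambda_1} M_{01}(t_1) + \frac{\lambda_0}{\lambda_2} M_{02}(t_2)\Big)\Big)^{-k_0} \\
&\quad \times \Big(1 + \frac{1}{\lambda_1} M_{01}(t_1)\Big)^{-k_1} \Big(1 + \frac{1}{\lambda_2} M_{02}(t_2)\Big)^{-k_2} \\
&= \big(1 + \sigma_1^2 M_{01}(t_1) + \sigma_2^2 M_{02}(t_2)\big)^{-\frac{\rho}{\sigma_1 \sigma_2}} \\
&\quad \times \big(1 + \sigma_1^2 M_{01}(t_1)\big)^{\frac{-1 + \frac{\sigma_1}{\sigma_2}\rho}{\sigma_1^2}}
\big(1 + \sigma_2^2 M_{02}(t_2)\big)^{\frac{-1 + \frac{\sigma_2}{\sigma_1}\rho}{\sigma_2^2}}
\end{aligned}
\]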
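
The additive construction (5.3) and (5.4) translates directly into a simulation recipe, which can be used to check the moments derived above. The Python sketch below is illustrative: it fixes λ0 = 1 (an arbitrary choice, since only the scaled variable (λ0/λi)Y0 enters), takes Y0, Y1, Y2 as independent gamma variables with shapes k0, k1, k2 and rates λ0, λ1, λ2, and uses a hypothetical function name.

```python
import numpy as np


def simulate_correlated_gamma_frailties(n, sigma1, sigma2, rho, rng):
    """Draw (Z1, Z2) with mean 1, variances sigma1^2, sigma2^2, correlation rho."""
    k0 = rho / (sigma1 * sigma2)
    k1 = 1.0 / sigma1 ** 2 - k0
    k2 = 1.0 / sigma2 ** 2 - k0
    if min(k0, k1, k2) < 0.0:
        raise ValueError("rho must satisfy 0 <= rho <= min(sigma1/sigma2, sigma2/sigma1)")
    lam1, lam2 = 1.0 / sigma1 ** 2, 1.0 / sigma2 ** 2

    def draw(shape, rate):
        # A gamma variable with shape 0 is taken to be identically zero.
        if shape == 0.0:
            return np.zeros(n)
        return rng.gamma(shape=shape, scale=1.0 / rate, size=n)

    y0 = draw(k0, 1.0)          # lambda_0 = 1 (assumed)
    y1 = draw(k1, lam1)
    y2 = draw(k2, lam2)
    z1 = y0 / lam1 + y1         # (lambda_0 / lambda_1) * Y0 + Y1
    z2 = y0 / lam2 + y2         # (lambda_0 / lambda_2) * Y0 + Y2
    return z1, z2
```

The check on the shape parameters makes explicit that this construction restricts the admissible range of the correlation when the two variances differ.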
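of the ith pair in the parametric case without censoring, truncation, and
observed covariates is given by

\[
\begin{aligned}
\log L_i(\theta, \sigma^2, \rho) ={}&
\frac{\rho}{\sigma_1 \sigma_2} \ln\big(S_1(t_{i1}; \theta)^{-\sigma_1^2}\big)
+ \frac{\rho}{\sigma_1 \sigma_2} \ln\big(S_2(t_{i2}; \theta)^{-\sigma_2^2}\big) \\
& - \Big(\frac{\rho}{\sigma_1 \sigma_2} + 2\Big) \ln\big(S_1(t_{i1}; \theta)^{-\sigma_1^2} + S_2(t_{i2}; \theta)^{-\sigma_2^2} - 1\big) \\
& + \ln\Big[\, 1 - \frac{\sigma_1}{\sigma_2}\rho - \frac{\sigma_2}{\sigma_1}\rho + \rho^2
+ \Big(1 - \frac{\sigma_2}{\sigma_1}\rho\Big) S_1(t_{i1}; \theta)^{-2\sigma_1^2} \\
& \qquad + \Big(1 - \frac{\sigma_1}{\sigma_2}\rho\Big) S_2(t_{i2}; \theta)^{-2\sigma_2^2} \\
& \qquad + \Big(2 - \frac{\sigma_1}{\sigma_2}\rho - \frac{\sigma_2}{\sigma_1}\rho + \sigma_1 \sigma_2 \rho + \rho^2\Big)
S_1(t_{i1}; \theta)^{-\sigma_1^2} S_2(t_{i2}; \theta)^{-\sigma_2^2} \\
& \qquad + \Big(-2 + \frac{\sigma_1}{\sigma_2}\rho + 2\frac{\sigma_2}{\sigma_1}\rho - \rho^2\Big) S_1(t_{i1}; \theta)^{-\sigma_1^2} \\
& \qquad + \Big(-2 + \frac{\sigma_2}{\sigma_1}\rho + 2\frac{\sigma_1}{\sigma_2}\rho - \rho^2\Big) S_2(t_{i2}; \theta)^{-\sigma_2^2} \Big].
\end{aligned}
\]
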
Example 5.1
Age at death by cancer (all types of cancer combined) and age at death by
stroke in twin pairs are the event times of interest. Because of the symmetric
structure of the twin data, we use the simplifications S(t) = S1 (t) = S2 (t)
and σ² = σ1² = σ2². A parametric approach with a Gompertz baseline hazard
µ0 (t) = λe^{ϕt} is fitted, resulting in the univariate marginal survival function

\[
S(t) = \Big(1 + \sigma^2 \frac{\lambda}{\varphi} \big(e^{\varphi t} - 1\big)\Big)^{-\frac{1}{\sigma^2}}.
\]

The data are right censored and left truncated, which has to be accounted for in
the likelihood. The results of the maximum likelihood parameter estimation
procedure are given in Table 5.1.
For each cause of death, a separate analysis of the same data was performed,
treating all other causes of death as independent censored observations. To
allow a combined analysis of monozygotic (MZ) and dizygotic (DZ) twins, we
include two correlation coefficients into the model, ρMZ and ρDZ , respectively.
These correlations between MZ and DZ twins provide important information
about genetic and environmental influences on frailty (susceptibility) within
Table 5.1: Parameter estimates in the correlated gamma frailty model
applied to Danish twin pairs
males females
cancer stroke cancer stroke
individuals (see Appendix A.6). For both causes of death and both sexes, the
estimates of correlation of frailty for MZ twins (ρMZ ) are higher than those for
DZ twins (ρDZ ). These higher correlations in MZ twins compared to DZ twins
indicate the importance of genetic factors involved in susceptibility (frailty) to
cause-specific mortality. Estimates of correlations range between 0.08 (cancer,
DZ female twins) and 0.71 (stroke, MZ male twins). The differences in the
estimates of the frailty variance σ² indicate different levels of heterogeneity.
The shared gamma frailty model (ρ = 1) is sometimes applied to twin data, but
results in different estimates of σ² for MZ and DZ twins (Hougaard 2000).
This contradicts the observation that univariate (marginal) lifetimes of MZ
and DZ twins are similar.
Parameter estimates were obtained using a self-written GAUSS routine. The
routine is based on constrained maximum likelihood estimation, maximizing
the parametric likelihood function under user-specified constraints on the
parameters. The likelihood function for left-truncated and right-censored
bivariate data is given in Appendix A.1 and A.2. We will reanalyze the data
in the following example with the emphasis on obtaining quantitative
results about the heritability of cause-specific mortality.
Example 5.2
This example deals with an extension of the above correlated gamma frailty
model which allows the integration of genetic models (see Appendix A.6).
In this approach, the correlations between the frailties of family members are
substituted by their respective variance components. In the present case of MZ
and DZ twins, relation (A.8) can be used. In ACE models containing additive
genetic factors (A), common environment (C), and unique environment (E),
the corresponding variance components are denoted by a², c², and e². Using
this reparameterization of the model, the data from Example 5.1 are analyzed
again in Table 5.2. The main interest is in the estimation of the heritability
(a²), which quantifies the importance of genetic factors on a trait. Here,
the trait of interest is susceptibility to death caused by cancer and stroke,
respectively. Again, a Gompertz baseline hazard is used.
Table 5.2: Heritability of cause of death. Parameter estimates in the
correlated gamma frailty model applied to Danish twin pairs
males females
cancer stroke cancer stroke
Heritability estimates differ by gender and cause of death, ranging from 0.21
(cancer in females) to 0.63 (stroke in males). At least moderate heritability
is usually the basis for further linkage and association studies to identify risk
genes.
This approach from event history analysis permits accounting for censoring
and truncation present in the data. The method is combined with variance
component models from genetic epidemiology. For such a combined analysis,
we apply the correlated gamma frailty model, which takes into account the
dependence of lifespans of relatives. This enables the estimation of the effect
of genetic factors in susceptibility to cancer and stroke. The approach allows
data on age at death and cause of death to be combined and overcomes
the drawbacks of the traditional concordance analysis usually applied in twin
studies with time-to-event data. For each twin partner two competing risks of
latent times (with respect to death due to cancer or stroke, and with respect to
death due to all other diseases including censoring) are modeled. In addition,
we assume that these competing risks are independent.
\[ Z_F = V_1 + V_2, \qquad Z_M = V_3 + V_4, \qquad Z_C = V_1 + V_3. \]
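\[
\mathbf{V}(Z_F, Z_M, Z_C) = \frac{1}{\lambda}
\begin{pmatrix}
1 & 0 & \tfrac{1}{2} \\
0 & 1 & \tfrac{1}{2} \\
\tfrac{1}{2} & \tfrac{1}{2} & 1
\end{pmatrix}.
\]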
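The covariance structure can be reproduced from the additive decomposition of the three frailties. The sketch below assumes that $V_1, \dots, V_4$ are independent with common variance $1/(2\lambda)$, which is consistent with the matrix above but is an assumption here; the function name and the value of $\lambda$ are hypothetical.

```python
def frailty_covariance(loadings, var_v):
    """Covariance matrix of Z = A V for independent V_k with common variance var_v."""
    n = len(loadings)
    return [[var_v * sum(a * b for a, b in zip(loadings[i], loadings[j]))
             for j in range(n)] for i in range(n)]


# Rows: Z_F, Z_M, Z_C; columns: V_1, ..., V_4.
A = [[1, 1, 0, 0],
     [0, 0, 1, 1],
     [1, 0, 1, 0]]

lam = 2.0                                        # hypothetical value of lambda
cov = frailty_covariance(A, 1.0 / (2.0 * lam))   # assumed variance of each V_k
corr_father_child = cov[0][2] / (cov[0][0] * cov[2][2]) ** 0.5
print(cov)
print(corr_father_child)
```

Each frailty has variance $1/\lambda$, the parents are uncorrelated, and each parent shares exactly one component with the child.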
The correlation between the frailty of the father $Z_F$ or the mother $Z_M$ and
the frailty of the child $Z_C$ is one half because the two individuals share
half of their genes. The frailties of the father and mother in each family are
assumed to be independent. The baseline hazard functions are assumed to be
unknown (semiparametric model). In the bivariate cases (father – child
or mother – child), the genetic frailty model is a correlated frailty model with
correlation equal to one half. This can be seen from the trivariate survival
function of the model (with reparameterization $\sigma^2 = 1/\lambda$)
The first equation reflects the fact that the frailties of the parents are assumed
to be uncorrelated. This results in independent event times. The second and
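\[
S(t_1, t_2 | X_1, X_2) = \frac{S_1(t_1|X_1)^{1 - \frac{\sigma_1}{\sigma_2}\rho} \, S_2(t_2|X_2)^{1 - \frac{\sigma_2}{\sigma_1}\rho}}{\left( S_1(t_1|X_1)^{-\sigma_1^2} + S_2(t_2|X_2)^{-\sigma_2^2} - 1 \right)^{\frac{\rho}{\sigma_1 \sigma_2}}},
\]
where $X_1$ and $X_2$ are the covariate vectors of the partners in the pair. Again,
the marginal survival functions are assumed to be identical in the following
applications to twin data, $S(t|X) = S_1(t|X) = S_2(t|X)$ and $\sigma^2 = \sigma_1^2 = \sigma_2^2$,
yielding:
\[
S(t|X) = \left( 1 + \sigma^2 M_0(t) e^{\beta' X} \right)^{-\frac{1}{\sigma^2}}. \qquad (5.7)
\]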
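The following sketch evaluates the marginal survival function (5.7) and the symmetric version of the bivariate survival function ($\sigma_1^2 = \sigma_2^2 = \sigma^2$). A Gompertz cumulative baseline hazard $M_0(t) = \frac{\lambda}{\phi}(e^{\phi t} - 1)$ is assumed, as in the examples of this chapter; all function names and parameter values are hypothetical.

```python
import math


def gompertz_cumhaz(t, lam, phi):
    """Cumulative Gompertz baseline hazard M0(t) = (lam / phi) * (exp(phi * t) - 1)."""
    return lam / phi * (math.exp(phi * t) - 1.0)


def marginal_survival(t, lam, phi, sigma2, lin_pred=0.0):
    """Marginal survival (5.7); lin_pred stands for the linear predictor beta'X."""
    m0 = gompertz_cumhaz(t, lam, phi)
    return (1.0 + sigma2 * m0 * math.exp(lin_pred)) ** (-1.0 / sigma2)


def bivariate_survival(s1, s2, sigma2, rho):
    """Symmetric correlated gamma frailty model in terms of the margins s1, s2."""
    joint_part = (s1 ** (-sigma2) + s2 ** (-sigma2) - 1.0) ** (-rho / sigma2)
    return (s1 * s2) ** (1.0 - rho) * joint_part


# Hypothetical parameter values, for illustration only.
s_a = marginal_survival(70.0, 1e-4, 0.1, 1.5)
s_b = marginal_survival(80.0, 1e-4, 0.1, 1.5)
print(bivariate_survival(s_a, s_b, 1.5, 0.4))
```

Two quick sanity checks follow directly from the formula: with $\rho = 0$ the bivariate survival function is the product of the margins, and setting one margin to 1 returns the other margin.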
Example 5.3
In the following we present an application of the bivariate correlated gamma
frailty model with observed covariates to the lifetimes of MZ and DZ Danish
twins with respect to coronary heart disease (CHD). We analyze the influence
of smoking, body mass index (BMI), gender, and birth year (using the variable
transformation birth year minus 1890 because the oldest twins were born in
1890) on the susceptibility to death by CHD.
The bivariate correlated gamma frailty model can be extended to the multivariate
case with $p$ related lifespans. For gamma-distributed frailties, this results
in the multivariate survival function
\[
S(t_1, \dots, t_p) = \prod_{i=1}^{p} S_i(t_i)^{1-\rho} \left( \sum_{i=1}^{p} S_i(t_i)^{-\sigma^2} - p + 1 \right)^{-\rho/\sigma^2}.
\]
In this model, identical distributions are used for the frailties of the subjects
in a cluster, and all pairs of frailties in a cluster share the same correlation
$\rho$. Further extensions of this model with different frailty variances
can be found in Yashin and Iachine (1999a), but they need additional requirements,
which are rather restrictive for real applications. A major disadvantage of the
model is that it becomes very complex with increasing cluster size. In the foregoing
twin example, the likelihood consists of four terms caused by the different
possibilities for censoring plus one term for truncation (see Appendix (A.2)).
With cluster size three, the number of terms increases to nine; with cluster
size four, 17 terms have to be calculated. These terms include derivatives of
the survival function of the order of the cluster size. This limits multivariate
extensions of the correlated gamma frailty model to cluster sizes of three or
at most four. This inflexibility was the main reason why this model was
not considered further in the literature, for example, to model family data.
For such applications (cluster size larger than two), the correlated
log-normal frailty model considered in the following chapter is much more
appropriate because of the flexibility of the multivariate normal distribution.
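The term counts quoted above (5, 9, and 17) follow the pattern $2^p + 1$: one likelihood contribution per censoring pattern of the $p$ cluster members plus one term for truncation. A minimal sketch that enumerates the patterns, with a hypothetical function name:

```python
from itertools import product


def likelihood_terms(cluster_size, truncated=True):
    """Count likelihood contributions: one per censoring pattern,
    plus one extra term if the data are truncated."""
    # Each member is either an observed event (1) or censored (0).
    patterns = list(product((0, 1), repeat=cluster_size))
    return len(patterns) + (1 if truncated else 0)


for p in (2, 3, 4):
    print(p, likelihood_terms(p))
```

The count alone understates the difficulty, because the pattern with all events observed requires a mixed derivative of the survival function of order $p$.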
In the applications of the correlated gamma frailty model presented here, a
parametric approach is used. Similar to shared frailty models, semiparametric
approaches are sometimes preferable to avoid assumptions about the form
of the baseline hazard function. For the bivariate correlated gamma frailty
model, an EM algorithm for a semiparametric approach was suggested by
Iachine (1995), but software is not publicly available for this model. Further
developments in this direction are restricted by the problems with extensions
of the bivariate correlated gamma frailty model to the larger cluster sizes
discussed earlier. The main problem in comparison with the shared frailty
model is that the number of frailties per cluster to be estimated in the EM
algorithm is determined by the number of observations in the cluster. This
causes severe problems with larger cluster size.
Parner (1998) extends results of Murphy (1994, 1995) for the shared
gamma frailty model and shows consistency and asymptotic normality of the
nonparametric maximum likelihood estimator in the multivariate correlated
gamma frailty model with observed covariates.
The correlated gamma frailty model can also be applied to current status
data. In the case of bivariate event times the likelihood function is given by
(4.7). Information about identifiability, consistency, and asymptotic normality
in the correlated gamma frailty model with current status data is given in the
paper by Chang et al. (2007). In the following results of a small simulation
Each data set is analyzed under two different censoring mechanisms – once
as right-censored lifetimes, and once as current status data. The correlated
gamma frailty model is much more complex than the shared gamma frailty
model because of the additional correlation parameter $\rho$ in the model. In this
model the parameter estimates based on right-censored data are much closer
to the true parameter values and have much smaller standard errors than
the estimates based on the current status data. In particular, the parameter
estimates for $\phi_1$, $\phi_2$ are strongly biased upward with the current status data,
as are the estimates of $\sigma^2$ and $\rho$. Consequently, switching from right-censored
lifetime data to current status data causes a substantial loss of information.
However, further simulations with increasing sample size (results are not
shown) suggest that parameters can be estimated consistently based on event-time
data as well as current status data.
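The information loss can be made concrete by looking at what each observation scheme records for the same latent event time. The sketch below is only an illustration; the function names and the numerical values are hypothetical.

```python
def right_censored(event_time, censor_time):
    """Right-censored observation: (observed time, event indicator)."""
    if event_time <= censor_time:
        return (event_time, 1)
    return (censor_time, 0)


def current_status(event_time, monitor_time):
    """Current status observation: (monitoring time, indicator that the
    event has already happened by the monitoring time)."""
    return (monitor_time, 1 if event_time <= monitor_time else 0)


# Hypothetical latent event times with a common censoring/monitoring time of 60.
latent = [42.5, 71.0, 58.3]
rc = [right_censored(t, 60.0) for t in latent]
cs = [current_status(t, 60.0) for t in latent]
print(rc)
print(cs)
```

Under right censoring the exact event time is kept whenever the event occurs before the censoring time, whereas current status data retain only the indicator, so the first and third subjects become indistinguishable.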
Scenario 2: Both censoring/monitoring times are equal.
We consider this situation in detail here because it holds for the real data
example considered later. Correlation coefficients of $\rho$ = 0.25, 0.5, 0.75, and 1
are simulated. The results are similar to those for independent
censoring. For the current status data, the upward bias in the parameter
estimates of $\phi_1$, $\phi_2$, $\sigma^2$ is more pronounced; also, the standard errors are larger.
The upward bias becomes less pronounced for larger values of the
correlation $\rho$ (see Tables 5.5-5.8). The correlated gamma frailty model with
parametric baseline hazard is identifiable with current status data and gives
better results with larger sample size (results are not shown).
Example 5.4
Here, the hepatitis A and B current status data from Example 1.7 are analyzed
in detail. In the second column of Table 5.9, a correlated gamma frailty model
with different frailty variances $\sigma_1^2$ and $\sigma_2^2$ for hepatitis A and B, respectively,
is presented (correlated I). In the third column, a model with equal variances
$\sigma_1^2 = \sigma_2^2$ is used (correlated II). The fourth column contains a shared gamma
frailty model (ρ = 1), and column five contains the univariate frailty model
(ρ = 0). There is no reason to assume per se that the frailty variance with
Table 5.9: Analysis of hepatitis A and B current status data with the
correlated gamma frailty model
parameter correlated I correlated II shared univariate
\[ \mu = \mathbf{E}Z_j = e^{m + \frac{s^2}{2}} \]
\[ \sigma^2 = \mathbf{V}(Z_j) = e^{2m + s^2} \left( e^{s^2} - 1 \right) \]
\[ \rho = \mathrm{corr}(Z_1, Z_2) = \frac{e^{r s^2} - 1}{e^{s^2} - 1}. \]
Again, the restriction $m = 0$ is used to guarantee identifiability of the model
parameters. This means that the random effects $W_j$ $(j = 1, 2)$ (log frailties)
have an expectation of zero.
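The three relations above translate the parameters $(m, s^2, r)$ of the normal random effects into the mean, variance, and correlation of the log-normal frailties. A minimal sketch with a hypothetical function name and illustrative values:

```python
import math


def lognormal_frailty_moments(m, s2, r):
    """Mean, variance, and correlation of Z_j = exp(W_j), where (W_1, W_2) is
    bivariate normal with mean m, variance s2, and correlation r."""
    mean = math.exp(m + s2 / 2.0)
    var = math.exp(2.0 * m + s2) * (math.exp(s2) - 1.0)
    rho = (math.exp(r * s2) - 1.0) / (math.exp(s2) - 1.0)
    return mean, var, rho


# Illustrative values with the identifiability restriction m = 0.
mean, var, rho = lognormal_frailty_moments(0.0, 1.0, 0.5)
print(mean, var, rho)
```

For $0 < r < 1$ the frailty correlation $\rho$ is smaller than the correlation $r$ of the random effects, which is one reason why correlations from the gamma and log-normal models are not directly comparable.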
Xue and Brookmeyer (1996) were the first authors to consider the correlated
log-normal frailty model and applied it to mental health data to evaluate
the health policy effects for inpatient psychiatric care. Yau and McGilchrist
(1997) used a generalized linear mixed model approach to analyze data from
litter-matched tumorigenesis experiments in rats with the help of the correlated
log-normal model. Cook et al. (1999) used a correlated log-normal frailty
in two-state mixed renewal processes for chronic diseases. Other examples
can be found in Ripatti and Palmgren (2000) and Ripatti et al. (2002).
Based on correlated log-normal frailty models, Pankratz et al. (2005) perform
genetic analysis of age at onset in breast cancer in a large familial cohort.
Their method is based on a Laplace approximation similar to the approach
by Ripatti and Palmgren (2000).
Example 5.5
The correlated log-normal frailty model is applied to the mortality of Danish
twins. The duration of interest is lifetime with respect to CHD; that means
lifetimes with respect to other causes of death are considered as censoring
times.
Table 5.10: Correlated log-normal frailty model with covariates
parameter without covariates with covariates
The results in Table 5.10 are difficult to compare with Example 5.3,
where a correlated gamma frailty model was applied. The variances $\sigma^2$ of the
gamma and $s^2$ of the log-normal model are not directly comparable. The same
problem holds for the correlations in both models. The analysis indicates a
nonsignificant mortality increase for twins with BMI less than 22 kg/m² with
β = 0.312 (0.184) and BMI more than 28 kg/m² with β = 0.149 (0.161),
respectively. The reference group consists of twins with BMI between 22 kg/m² and
28 kg/m². Cigarette smoking shows a significant influence on CHD mortality
with β = 0.556 (0.192), whereas pipe/cigar smokers, β = 0.32 (0.23), and former
smokers, β = 0.48 (0.26), experience only a nonsignificant increase in risk. The
reference group consists of the nonsmokers. Females have significantly better survival,
β = −0.966 (0.194), compared to males. The birth year shows no significant
association with CHD death in this sample. The analysis was performed with
SAS NLMIXED.
In twin studies, the main focus is on the correlations between MZ and
DZ twins and not on the covariates. The effect of covariates may be better
evaluated in studies of singletons. Correlations are of special interest for
the quantification of the influence of genetic and environmental factors. A
comparison between models without and with covariates allows conclusions
about the nature of the covariates (genetic vs. environment). The inclusion
of the covariates reduces the frailty variance. This is expected because these
covariates are absorbed into the frailty in the model without covariates. Once they
are modeled explicitly as observed covariates, they are no
longer part of the frailty (which acts as a proxy for unobserved covariates). This
explains the reduction in the frailty variance from 6.520 (3.263) to 2.394 (1.210). In
contrast, the correlations change only slightly between the two models. This
point is further discussed in the next example.
Example 5.6
In this example, an extension of the correlated log-normal frailty model is
presented, which allows the integration of genetic models (see Appendix A.6).
In this approach the correlations between family members are substituted by
the respective variance components. In the case of MZ and DZ twins relation
(A.8) can be used. In an ACE model including additive genetic factors (A),
common environment (C), and unique environment (E), the corresponding
variance components are denoted by $a^2$, $c^2$, and $e^2$. This reparameterization
is used to analyze the data from Example 5.3 again. The main interest is
in the estimation of the heritability ($a^2$), which quantifies the importance of
genetic factors on a trait. Here, the trait of interest is lifetime with respect to
CHD. Again, a Gompertz baseline hazard is used (parametric approach).
Former studies from the Danish (Harvald and Hauge 1970) and Swedish (de
Faire 1975, Marenberg et al. 1994) twin registries found a genetic component
in the susceptibility to death from coronary heart disease. The advantage
of the approach presented here is the combination of methods from survival
analysis with methods from genetic epidemiology. This allows the estimation
of the effect of genetic factors in susceptibility to CHD and the evaluation as
to what extent smoking, BMI, and susceptibility to CHD are all influenced by
common genetic factors. For each individual, two independent, underlying,
competing risks of latent times – lifetime with respect to death due to CHD
and lifetime with respect to death due to all other causes including censoring
events – are assumed.
Both smoking and BMI are influenced by genes, with heritability estimates
in the range 0.35 to 0.75 (smoking) and 0.5 to 0.8 (BMI) (Bouchard 1994,
Heath and Madden 1995, Herskind et al. 1996b). However, whether common
genes influence these phenotypic traits as well as susceptibility to CHD is
an open question. In a different situation Fisher (1958) suggested that the
association between smoking and lung cancer is spurious and reflects only the
circumstance that the same genes influence both smoking habits and lung
cancer. This was the starting point for a long debate on genetic confounding.
The main result of the present analysis is that the inclusion of smoking
and BMI does not cause any substantial decrease in the heritability estimate.
Consequently, no evidence was found for common genetic factors acting on
smoking and susceptibility to CHD, or on BMI and susceptibility to CHD. This
study confirms the earlier finding that the genetic influence on susceptibility to
CHD is not mediated through genetic influence on smoking and BMI. Similar
results were found by Zdravkovic et al. (2004) in Swedish twins. The approach
here is different from the common study design in many medical applications.
Usually the investigator is interested in the effect of covariates on the outcome,
adjusted for the correlation in clustered observations, treating the correlation
as a nuisance parameter. In genetic epidemiology, the main interest is in the
correlations, adjusted for the effect of covariates.
As expected, the inclusion of covariates decreases the heterogeneity in the
study population, which can be seen in the decline of the frailty variance from
$s^2$ = 7.875 (3.505) to $s^2$ = 2.500 (1.182). This underlines that frailty depends
on the model, and it describes factors not included in the model.
When covariates are included in the model, the relative importance of
environmental factors is reduced, leading to the increase in the heritability
estimate observed in the present analysis. This is a consequence of the heritability
coefficient being a variance proportion. The variance of the random effects can
be decomposed as follows: $s^2 = s^2_{genes} + s^2_{environment}$. Focusing on heritability
only, one does not know whether the heritability increases due to an increase
in the genetic variance or due to a decrease in the environmental variance.
In the present case, the increase was largely due to the latter. By including
smoking and BMI, the genetic variance was reduced from $s^2_{genes}$ = 2.32 to
$s^2_{genes}$ = 1.16. The reduction of the environmental heterogeneity is more
pronounced, namely from $s^2_{environment}$ = 5.55 to $s^2_{environment}$ = 1.34.
This result underlines that both factors primarily represent environmental
sources of variation for age at CHD death, despite the role that genetics
may play for the specific factors. A comprehensive discussion of variance
components can be found in Hopper (1993). Genetic confounding would lead
to a decrease in heritability estimates after the inclusion of observed covariates
because, in that case, genetic factors contribute predominantly to the observed
covariates rather than to the unobserved covariates included in the frailty.
Our results are similar to the results obtained in a study of Swedish twins
(Zdravkovic et al. 2004). More details in a slightly different analysis of the
same data are given in Wienke et al. (2005a).
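The arithmetic behind this argument can be retraced from the variance components quoted above. The sketch below treats heritability as the genetic share of the total random-effect variance; the function name is hypothetical, and the computed proportions are derived here from the quoted components rather than taken from a table.

```python
def heritability(s2_genes, s2_environment):
    """Heritability as the genetic proportion of the random-effect variance."""
    return s2_genes / (s2_genes + s2_environment)


# Variance components quoted in the text (without and with smoking and BMI).
h_without = heritability(2.32, 5.55)
h_with = heritability(1.16, 1.34)
print(round(h_without, 3), round(h_with, 3))
```

Both components shrink when the covariates are added, but the environmental component shrinks by a much larger factor, so the genetic proportion rises even though the genetic variance itself falls.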
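1. Likelihood function:
\[
L(\lambda, \phi | W) = \prod_{i=1}^{n} \prod_{j=1}^{2} \left( e^{W_{ij}} \lambda e^{\phi t_{ij}} \right)^{\delta_{ij}} e^{-e^{W_{ij}} \frac{\lambda}{\phi} \left( e^{\phi t_{ij}} - 1 \right)}
\]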
2. Priors:
3. Hyperpriors:
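with $W = (W_{11}, W_{12}, W_{21}, W_{22}, \dots, W_{n1}, W_{n2})$. $\Gamma$ and $U$ denote the gamma
and uniform distribution, respectively, and $\lambda$ and $\phi$ are parameters of the
Gompertz baseline hazard. The prior (i), assigned to the vector $(W_{i1}, W_{i2})$, is
chosen in order to have a vector of log-normal frailties $(Z_{i1}, Z_{i2}) = (e^{W_{i1}}, e^{W_{i2}})$
with mean equal to one:
\[
\begin{pmatrix} Z_{i1} \\ Z_{i2} \end{pmatrix} \sim \ln N \left( \begin{pmatrix} 1 \\ 1 \end{pmatrix}, \begin{pmatrix} \sigma^2 & \rho\sigma^2 \\ \rho\sigma^2 & \sigma^2 \end{pmatrix} \right).
\]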
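For orientation, the conditional likelihood of step 1 can be evaluated on the log scale as in the sketch below. It is only an illustration of the formula, not the MCMC implementation used in the analysis; the function name, the data layout, and the numerical values are hypothetical.

```python
import math


def log_likelihood(lam, phi, times, events, effects):
    """Log of L(lam, phi | W) for a Gompertz baseline hazard.

    times, events, effects are lists of pairs (t_i1, t_i2),
    (delta_i1, delta_i2), and (W_i1, W_i2), one entry per twin pair."""
    total = 0.0
    for t_pair, d_pair, w_pair in zip(times, events, effects):
        for t, d, w in zip(t_pair, d_pair, w_pair):
            # log of the conditional hazard exp(W) * lam * exp(phi * t)
            log_hazard = w + math.log(lam) + phi * t
            # conditional cumulative hazard exp(W) * (lam / phi) * (exp(phi * t) - 1)
            cum_hazard = math.exp(w) * lam / phi * (math.exp(phi * t) - 1.0)
            total += d * log_hazard - cum_hazard
    return total


# One hypothetical pair: first twin has an event at 10, second is censored at 20.
print(log_likelihood(0.01, 0.1, [(10.0, 20.0)], [(1, 0)], [(0.0, 0.0)]))
```

In a Gibbs or Metropolis sampler this quantity would be combined with the log densities of the priors and hyperpriors to obtain the full conditionals.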
Example 5.7
The results of applying the correlated log-normal frailty model to the Swedish
breast cancer data using MCMC methods as described before are presented
in Table 5.12. The study population consists of 12,568 female twin pairs
of the old and middle cohort. Estimated parameters include the Gompertz
parameters λ and ϕ and the variance of the frailty distribution σ 2 , which
can be seen as the extent of population heterogeneity with respect to age at
onset of breast cancer, and estimates of the correlation coefficient for both
MZ twins (ρMZ ) and DZ twins (ρDZ ). Two estimates for each parameter
are given in terms of the mean and the median of the corresponding Markov
chain. In all cases, both values are very close to each other. This means
Table 5.12: Correlated log-normal frailty model
parameter mean median sdv MC error CSRF
\[ Z_1 = Y_0 + Y_1 \sim cP(\gamma, k_0 + k_1, \lambda) \]
\[ Z_2 = Y_0 + Y_2 \sim cP(\gamma, k_0 + k_1, \lambda). \]
For simplicity, we shall present only the symmetric case here, where the two
lifetimes are interchangeable. Consequently, the following assumptions are
made in the model:
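\[ (k_0 + k_1) \lambda^{\gamma} = \lambda = \frac{1 - \gamma}{\sigma^2}. \qquad (5.8) \]
It holds that
\[
\rho = \frac{\mathrm{cov}(Z_1, Z_2)}{\sqrt{\mathbf{V}(Z_1) \mathbf{V}(Z_2)}}
= \frac{k_0 (1 - \gamma) \lambda^{\gamma - 2}}{(k_0 + k_1)(1 - \gamma) \lambda^{\gamma - 2}}
= \frac{k_0}{k_0 + k_1}. \qquad (5.9)
\]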
\[ k_0 \lambda^{\gamma} = \frac{k_0}{k_0 + k_1} (k_0 + k_1) \lambda^{\gamma} = \rho \, \frac{1 - \gamma}{\sigma^2}. \qquad (5.10) \]
Now we can derive the unconditional model, applying the Laplace transform
of compound Poisson-distributed random variables (3.53). Hence,
The three terms are considered in detail. For the marginal survival function,
it holds that
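\[ S(t) = e^{-\frac{k_0 + k_1}{\gamma} \left( (\lambda + M_0(t))^{\gamma} - \lambda^{\gamma} \right)}, \qquad (5.12) \]
which implies
\[ \lambda + M_0(t) = \left( \lambda^{\gamma} - \frac{\gamma}{k_0 + k_1} \ln S(t) \right)^{1/\gamma}. \qquad (5.13) \]
Hence, using (5.9) and (5.12),
\begin{align*}
e^{-\frac{k_1}{\gamma} \left( (\lambda + M_0(t))^{\gamma} - \lambda^{\gamma} \right)}
&= e^{-\frac{k_1}{k_0 + k_1} \frac{k_0 + k_1}{\gamma} \left( (\lambda + M_0(t))^{\gamma} - \lambda^{\gamma} \right)} \\
&= e^{-(1 - \rho) \frac{k_0 + k_1}{\gamma} \left( (\lambda + M_0(t))^{\gamma} - \lambda^{\gamma} \right)} \\
&= \left( e^{-\frac{k_0 + k_1}{\gamma} \left( (\lambda + M_0(t))^{\gamma} - \lambda^{\gamma} \right)} \right)^{1 - \rho} \\
&= S(t)^{1 - \rho}.
\end{align*}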
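The identity just derived can be checked numerically. The sketch below uses hypothetical parameter values chosen so that the normalization (5.8) holds, that is, $(k_0 + k_1)\lambda^{\gamma} = \lambda$; the function names are likewise hypothetical.

```python
import math


def cp_marginal_survival(m0, k0, k1, lam, gamma):
    """Marginal survival (5.12) for a cumulative baseline hazard m0 = M0(t)."""
    return math.exp(-(k0 + k1) / gamma * ((lam + m0) ** gamma - lam ** gamma))


def cp_individual_term(m0, k1, lam, gamma):
    """Contribution of the individual frailty component Y_1 (or Y_2)."""
    return math.exp(-k1 / gamma * ((lam + m0) ** gamma - lam ** gamma))


# Hypothetical values satisfying (5.8): (k0 + k1) * lam**gamma == lam.
k0, k1, lam, gamma, m0 = 0.8, 1.2, 4.0, 0.5, 0.8
rho = k0 / (k0 + k1)                       # relation (5.9)
sigma2 = (1.0 - gamma) / lam               # frailty variance implied by (5.8)
lhs = cp_individual_term(m0, k1, lam, gamma)
rhs = cp_marginal_survival(m0, k0, k1, lam, gamma) ** (1.0 - rho)
print(rho, sigma2, lhs, rhs)
```

The two sides agree up to floating-point error, as the algebra requires for any admissible parameter values.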
The first of the three terms in (5.11) can be rewritten because of (5.13):
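\begin{align*}
& e^{-\frac{k_0}{\gamma} \left( (\lambda + M_0(t_1) + M_0(t_2))^{\gamma} - \lambda^{\gamma} \right)} \\
&= e^{-\frac{k_0}{\gamma} \left( (\lambda + M_0(t_1) + \lambda + M_0(t_2) - \lambda)^{\gamma} - \lambda^{\gamma} \right)} \\
&= e^{-\frac{k_0}{\gamma} \left( \left( \left( \lambda^{\gamma} - \frac{\gamma}{k_0 + k_1} \ln S(t_1) \right)^{1/\gamma} + \left( \lambda^{\gamma} - \frac{\gamma}{k_0 + k_1} \ln S(t_2) \right)^{1/\gamma} - \lambda \right)^{\gamma} - \lambda^{\gamma} \right)} \\
&= e^{\frac{k_0 \lambda^{\gamma}}{\gamma} \left( 1 - \left( \left( 1 - \frac{\gamma}{(k_0 + k_1) \lambda^{\gamma}} \ln S(t_1) \right)^{1/\gamma} + \left( 1 - \frac{\gamma}{(k_0 + k_1) \lambda^{\gamma}} \ln S(t_2) \right)^{1/\gamma} - 1 \right)^{\gamma} \right)},
\end{align*}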
which results because of (5.8) and (5.10) in the following representation of the
correlated model
Example 5.8
The model is applied to breast cancer incidence of MZ and DZ female Swedish
twins born 1886 – 1925 (Wienke et al. 2006a, 2010), described in detail in
Example 1.6. A parametric model with Gompertz baseline hazard function
is used with parameters λ and ϕ. The data is left truncated, because only
breast cancer cases are included that occurred after the foundation of the
cancer register. This truncation is adjusted for in the analysis. The results
are given in Table 5.13.
The model in the first column is the correlated gamma frailty model (5.2).
Parameter $\sigma^2$ is a measure of heterogeneity, which is estimated to be large in
this data, and all individuals of the population are assumed to be susceptible
to breast cancer. The correlated gamma frailty model is a special case of the
correlated compound Poisson/PVF frailty model with $\gamma = 0$.
\begin{align*}
S(t_1, t_2) = {} & \frac{1}{\sqrt{1 + 2 s_1^2 M_0(t_1) + 2 s_2^2 M_0(t_2) + 4 (1 - r^2) s_1^2 s_2^2 M_0(t_1) M_0(t_2)}} \\
& \times \exp \left\{ - \frac{m_1^2 M_0(t_1) (1 + 2 s_2^2 M_0(t_2)) + m_2^2 M_0(t_2) (1 + 2 s_1^2 M_0(t_1))}{1 + 2 s_1^2 M_0(t_1) + 2 s_2^2 M_0(t_2) + 4 (1 - r^2) s_1^2 s_2^2 M_0(t_1) M_0(t_2)} \right\} \\
& \times \exp \left\{ \frac{4 r m_1 m_2 s_1 s_2 M_0(t_1) M_0(t_2)}{1 + 2 s_1^2 M_0(t_1) + 2 s_2^2 M_0(t_2) + 4 (1 - r^2) s_1^2 s_2^2 M_0(t_1) M_0(t_2)} \right\}. \qquad (5.15)
\end{align*}
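We are now interested in the correlation between the frailties $Z_1 = W_1^2$ and
$Z_2 = W_2^2$. First, we need the mixed second moment $\mathbf{E}W_1^2 W_2^2$. To calculate
this moment, we rewrite the density of the two-dimensional normal distribution
of the random effects:
\begin{align*}
\mathbf{E}W_1^2 W_2^2
&= \iint w_1^2 w_2^2 \frac{1}{2\pi s_1 s_2 \sqrt{1 - r^2}} e^{-\frac{1}{1 - r^2} \left( \frac{w_1^2}{2 s_1^2} - r \frac{w_1 w_2}{s_1 s_2} + \frac{w_2^2}{2 s_2^2} \right)} \, dw_1 \, dw_2 \\
&= \iint w_1^2 w_2^2 \frac{1}{2\pi s_1 s_2 \sqrt{1 - r^2}} e^{-\frac{1}{2(1 - r^2)} \left( \frac{w_1^2}{s_1^2} - 2 \frac{r w_2}{s_1 s_2} w_1 \right)} e^{-\frac{w_2^2}{2(1 - r^2) s_2^2}} \, dw_1 \, dw_2 \\
&= \int w_2^2 \left( \int w_1^2 \frac{1}{2\pi s_1 s_2 \sqrt{1 - r^2}} e^{-\frac{1}{2(1 - r^2)} \left( \frac{w_1^2}{s_1^2} - 2 \frac{r w_2}{s_1 s_2} w_1 \right)} \, dw_1 \right) e^{-\frac{w_2^2}{2(1 - r^2) s_2^2}} \, dw_2 \\
&= \int w_2^2 I(w_2) e^{-\frac{w_2^2}{2(1 - r^2) s_2^2}} \, dw_2 \qquad (5.16)
\end{align*}
with
\[
I(w_2) = \int w_1^2 \frac{1}{2\pi s_1 s_2 \sqrt{1 - r^2}} e^{-\frac{1}{2(1 - r^2)} \left( \frac{w_1^2}{s_1^2} - 2 \frac{r w_2}{s_1 s_2} w_1 \right)} \, dw_1.
\]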
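In the following, we first calculate the inner integral $I(w_2)$ and then the outer
integral in a second step:
\begin{align*}
I(w_2)
&= \frac{1}{\sqrt{2\pi} s_2} \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi} s_1 \sqrt{1 - r^2}} w_1^2 e^{-\frac{1}{2(1 - r^2)} \left( \frac{w_1^2}{s_1^2} - 2 \frac{r w_2}{s_1 s_2} w_1 \right)} \, dw_1 \\
&= \frac{1}{\sqrt{2\pi} s_2} \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi} s_1 \sqrt{1 - r^2}} w_1^2 e^{-\frac{w_1^2 - 2 r \frac{s_1}{s_2} w_2 w_1}{2(1 - r^2) s_1^2}} \, dw_1 \\
&= \frac{1}{\sqrt{2\pi} s_2} \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi} s_1 \sqrt{1 - r^2}} w_1^2 e^{-\frac{\left( w_1 - r \frac{s_1}{s_2} w_2 \right)^2}{2(1 - r^2) s_1^2}} \, dw_1 \; e^{-\frac{-r^2 \frac{s_1^2}{s_2^2} w_2^2}{2(1 - r^2) s_1^2}} \\
&= \frac{1}{\sqrt{2\pi} s_2} \left( (1 - r^2) s_1^2 + r^2 \frac{s_1^2}{s_2^2} w_2^2 \right) e^{-\frac{-r^2 \frac{s_1^2}{s_2^2} w_2^2}{2(1 - r^2) s_1^2}}.
\end{align*}
Here, the last equation holds because the integral is the expectation of a
random variable $W^2$, with $W$ following a normal distribution with expectation
$r \frac{s_1}{s_2} w_2$ and variance $(1 - r^2) s_1^2$. Now we solve the second integral with
respect to $w_2$:
\begin{align*}
\mathbf{E}W_1^2 W_2^2
&= \int_{-\infty}^{\infty} \frac{(1 - r^2) s_1^2}{\sqrt{2\pi} s_2} w_2^2 e^{-\frac{w_2^2}{2(1 - r^2) s_2^2}} e^{-\frac{-r^2 \frac{s_1^2}{s_2^2} w_2^2}{2(1 - r^2) s_1^2}} \, dw_2 \\
&\quad + \frac{r^2 \frac{s_1^2}{s_2^2}}{\sqrt{2\pi} s_2} \int_{-\infty}^{\infty} w_2^4 e^{-\frac{w_2^2}{2(1 - r^2) s_2^2}} e^{-\frac{-r^2 \frac{s_1^2}{s_2^2} w_2^2}{2(1 - r^2) s_1^2}} \, dw_2 \\
&= \int_{-\infty}^{\infty} \frac{(1 - r^2) s_1^2}{\sqrt{2\pi} s_2} w_2^2 e^{-\frac{w_2^2}{2 s_2^2}} \, dw_2 + \frac{r^2 \frac{s_1^2}{s_2^2}}{\sqrt{2\pi} s_2} \int_{-\infty}^{\infty} w_2^4 e^{-\frac{w_2^2}{2 s_2^2}} \, dw_2 \\
&= (1 - r^2) s_1^2 \, \mathbf{E}W_2^2 + r^2 \frac{s_1^2}{s_2^2} \, \mathbf{E}W_2^4 \\
&= (1 - r^2) s_1^2 s_2^2 + r^2 \frac{s_1^2}{s_2^2} 3 s_2^4 \\
&= (1 + 2 r^2) s_1^2 s_2^2.
\end{align*}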
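The closed form $\mathbf{E}W_1^2 W_2^2 = (1 + 2r^2) s_1^2 s_2^2$ can be verified by brute-force numerical integration of the centered bivariate normal density. The sketch below uses a plain grid sum; the function names, grid settings, and parameter values are hypothetical. The second function combines the moment with $\mathbf{V}(W_j^2) = 2 s_j^4$ to obtain the correlation of the squared random effects.

```python
import math


def mixed_second_moment(s1, s2, r, n=241, width=10.0):
    """Approximate E[W1^2 W2^2] for a centered bivariate normal (standard
    deviations s1, s2, correlation r) by a grid sum over +/- width std devs."""
    h1 = 2.0 * width * s1 / (n - 1)
    h2 = 2.0 * width * s2 / (n - 1)
    norm = 1.0 / (2.0 * math.pi * s1 * s2 * math.sqrt(1.0 - r * r))
    total = 0.0
    for i in range(n):
        w1 = -width * s1 + i * h1
        for j in range(n):
            w2 = -width * s2 + j * h2
            q = (w1 / s1) ** 2 - 2.0 * r * w1 * w2 / (s1 * s2) + (w2 / s2) ** 2
            total += w1 * w1 * w2 * w2 * math.exp(-q / (2.0 * (1.0 - r * r)))
    return norm * total * h1 * h2


def squared_effect_correlation(s1, s2, r):
    """corr(W1^2, W2^2), using E[Wj^2] = sj^2 and V(Wj^2) = 2 * sj^4."""
    cov = mixed_second_moment(s1, s2, r) - s1 ** 2 * s2 ** 2
    return cov / math.sqrt(2.0 * s1 ** 4 * 2.0 * s2 ** 4)


s1, s2, r = 1.0, 1.5, 0.6                         # hypothetical values
print(mixed_second_moment(s1, s2, r), (1.0 + 2.0 * r * r) * s1 ** 2 * s2 ** 2)
print(squared_effect_correlation(s1, s2, r), r * r)
```

With these settings the grid sum reproduces the closed form, and the resulting correlation of the frailties equals $r^2$, independent of $s_1$ and $s_2$.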
Using the relations $\mathbf{V}(W_j^2) = \mathbf{E}W_j^4 - (\mathbf{E}W_j^2)^2 = 3 s_j^4 - s_j^4 = 2 s_j^4$ $(j = 1, 2)$, we
can now calculate the correlation between $Z_1 = W_1^2$ and $Z_2 = W_2^2$
\[
S(t_1, t_2) = \frac{1}{\sqrt{1 + 2 s_1^2 M_0(t_1) + 2 s_2^2 M_0(t_2) + 4 (1 - r^2) s_1^2 s_2^2 M_0(t_1) M_0(t_2)}}
\]
results in
\[
S(t_1, t_2) = \frac{1}{\sqrt{S_1^{-2}(t_1) S_2^{-2}(t_2) - r^2 \left( S_1^{-2}(t_1) - 1 \right) \left( S_2^{-2}(t_2) - 1 \right)}}. \qquad (5.17)
\]
This is a very interesting relation because the bivariate copula depends only
on the correlation parameter, and not on the variances of the frailties. This
is different compared to the correlated gamma and compound Poisson frailty
model. From the interpretational point of view, this is an advantage of the
model because only the correlation parameter influences the bivariate copula,
whereas the frailty variances as heterogeneity parameters influence only the
marginal distributions.
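The equivalence of the two representations can be confirmed numerically. The sketch below evaluates the survival function once in terms of the cumulative baseline hazards and once in the copula form (5.17), with the margins obtained by setting the other cumulative hazard to zero. Function names and parameter values are hypothetical.

```python
import math


def survival_hazard_form(m1, m2, s1sq, s2sq, r):
    """Bivariate survival in terms of m1 = M0(t1), m2 = M0(t2) (zero-mean case)."""
    d = (1.0 + 2.0 * s1sq * m1 + 2.0 * s2sq * m2
         + 4.0 * (1.0 - r * r) * s1sq * s2sq * m1 * m2)
    return 1.0 / math.sqrt(d)


def survival_copula_form(S1, S2, r):
    """Copula form (5.17) in terms of the marginal survival probabilities."""
    a = 1.0 / (S1 * S1)   # S1^{-2}
    b = 1.0 / (S2 * S2)   # S2^{-2}
    return 1.0 / math.sqrt(a * b - r * r * (a - 1.0) * (b - 1.0))


m1, m2, s1sq, s2sq, r = 0.3, 0.7, 1.2, 0.5, 0.6        # hypothetical values
S1 = survival_hazard_form(m1, 0.0, s1sq, s2sq, r)       # margin of the first lifetime
S2 = survival_hazard_form(0.0, m2, s1sq, s2sq, r)       # margin of the second lifetime
print(survival_hazard_form(m1, m2, s1sq, s2sq, r), survival_copula_form(S1, S2, r))
```

Note that the copula form takes only the two margins and $r$ as arguments, which is exactly the point made above: the variances enter through the margins alone.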
The model contains an interesting special case. Consider the case of a
shared quadratic hazard frailty model ($r = 1$, $s^2 = s_1^2 = s_2^2$). In that case, the
bivariate copula representation is of the form
This is the copula representation in the shared gamma frailty model with
$\sigma^2 = 2$ (see (4.3)). Both frailty models are equal if $s^2 = 1$ holds, because
the frailty $Z = W^2$ then has mean one and variance $2 s^4 = 2$. In general,
the marginal survival functions in (5.18) depend on $s^2$, a parameter that can
vary freely and influences the marginal distributions. Interestingly, in this
shared quadratic hazard frailty model, the copula representation of the lifetimes
is independent of the variance of the random effects $s^2$. This underlines
the fact that it is misleading to use the frailty variance as an association
measure, as is often done in shared frailty models.
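\begin{align*}
S(t_1, t_2) &= \mathbf{E} e^{-D(u_0)(M_0(t_1) + M_0(t_2)) - D_1(u) M_0(t_1) - D_2(u) M_0(t_2)} \\
&= e^{-u_0 \psi(M_0(t_1) + M_0(t_2))} e^{-u \psi(M_0(t_1))} e^{-u \psi(M_0(t_2))} \\
&= e^{-u_0 \psi \left( \psi^{-1} \left( \frac{-\ln S(t_1)}{u_0 + u} \right) + \psi^{-1} \left( \frac{-\ln S(t_2)}{u_0 + u} \right) \right)} S(t_1)^{\frac{u}{u_0 + u}} S(t_2)^{\frac{u}{u_0 + u}}.
\end{align*}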
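Using the variance formula for Lévy processes $\mathbf{V}(D(u)) = u \mathbf{V}(D(1) - D(0))$,
assuming finite second moments of $D(0)$ and $D(1)$, and because of the independence
of $D(u_0)$, $D_1(u)$, and $D_2(u)$, it holds for the correlation between the
two frailty variables
\begin{align*}
\rho &= \mathrm{corr}(Z_1, Z_2) \\
&= \frac{\mathrm{cov}\left( D(u_0) + D_1(u), D(u_0) + D_2(u) \right)}{\sqrt{\mathbf{V}\left( D(u_0) + D_1(u) \right) \mathbf{V}\left( D(u_0) + D_2(u) \right)}} \\
&= \frac{\mathbf{V}\left( D(u_0) \right)}{\mathbf{V}\left( D(u_0) + D_1(u) \right)} \\
&= \frac{u_0}{u_0 + u}. \qquad (5.19)
\end{align*}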
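Applying relation (5.19) we get the final form in the general correlated frailty
model
\[
S(t_1, t_2) = S(t_1)^{1 - \rho} S(t_2)^{1 - \rho} e^{-u_0 \psi \left( \psi^{-1} \left( \frac{-\ln S(t_1)}{u_0 + u} \right) + \psi^{-1} \left( \frac{-\ln S(t_2)}{u_0 + u} \right) \right)}. \qquad (5.20)
\]
The correlated gamma frailty model can be obtained from (5.20) as a special
case by substituting the characteristic exponent from (3.59), $\psi(u) = k \ln\left(1 + \frac{u}{\lambda}\right)$,
and its inverse $\psi^{-1}(u) = \lambda \left( e^{u/k} - 1 \right)$:
\begin{align*}
S(t_1, t_2) &= S(t_1)^{1 - \rho} S(t_2)^{1 - \rho} e^{-u_0 \psi \left( \psi^{-1} \left( \frac{-\ln S(t_1)}{u_0 + u} \right) + \psi^{-1} \left( \frac{-\ln S(t_2)}{u_0 + u} \right) \right)} \\
&= S(t_1)^{1 - \rho} S(t_2)^{1 - \rho} e^{-u_0 k \ln \left( 1 + \frac{1}{\lambda} \left( \psi^{-1} \left( \frac{-\ln S(t_1)}{u_0 + u} \right) + \psi^{-1} \left( \frac{-\ln S(t_2)}{u_0 + u} \right) \right) \right)} \\
&= S(t_1)^{1 - \rho} S(t_2)^{1 - \rho} \left( S(t_1)^{-\frac{1}{k(u_0 + u)}} + S(t_2)^{-\frac{1}{k(u_0 + u)}} - 1 \right)^{-u_0 k}.
\end{align*}
\[
S(t_1, t_2) = \left[ S(t_1)^{-\sigma^2} S(t_2)^{-\sigma^2} - \rho \left( S(t_1)^{-\sigma^2} - 1 \right) \left( S(t_2)^{-\sigma^2} - 1 \right) \right]^{-1/\sigma^2},
\]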
which is another correlated gamma frailty model, but not based on the additive
decomposition of the gamma-distributed frailties as the model in Section 5.2.
In the case of $\sigma^2 = 2$, the survival function of the correlated quadratic hazard
model in (5.17) is obtained. Furthermore, the shared gamma frailty model
from Section 4.3 is also a special case of the model introduced by Henderson
and Shimakura (2003) with $\rho = 1$.
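Both special cases can be checked numerically. The sketch below implements the survival function of the Henderson and Shimakura model in terms of the margins and compares it with the shared gamma (Clayton) form and with the quadratic hazard form (5.17), identifying $\rho$ with $r^2$ as the comparison of the two formulas suggests. Function names and numerical values are hypothetical.

```python
def henderson_shimakura(S1, S2, sigma2, rho):
    """Bivariate survival of the Henderson-Shimakura correlated gamma model."""
    a, b = S1 ** (-sigma2), S2 ** (-sigma2)
    return (a * b - rho * (a - 1.0) * (b - 1.0)) ** (-1.0 / sigma2)


def shared_gamma(S1, S2, sigma2):
    """Shared gamma frailty (Clayton) survival function."""
    return (S1 ** (-sigma2) + S2 ** (-sigma2) - 1.0) ** (-1.0 / sigma2)


def quadratic_hazard(S1, S2, r):
    """Correlated quadratic hazard model (5.17)."""
    a, b = S1 ** (-2.0), S2 ** (-2.0)
    return (a * b - r * r * (a - 1.0) * (b - 1.0)) ** (-0.5)


S1, S2 = 0.7, 0.4                                   # hypothetical margins
print(henderson_shimakura(S1, S2, 1.3, 1.0), shared_gamma(S1, S2, 1.3))
print(henderson_shimakura(S1, S2, 2.0, 0.36), quadratic_hazard(S1, S2, 0.6))
```

With $\rho = 0$ the same function returns the product of the margins, that is, independent lifetimes.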
and let $T_j$ denote the age at onset of the disease for the $j$th individual when
$Y_j = 1$ $(j = 1, 2)$. Then the bivariate survival function is of the form
This bivariate survival function is often given as a copula. Chatterjee and Shih (2001)
considered three different copulas: the shared gamma frailty model (Clayton’s
model), Frank’s copula, and Hougaard’s shared positive stable frailty model.
For more details about copulas see Chapter 6. Chatterjee and Shih (2001)
applied a two-step estimation procedure to breast cancer using the kinship
data from the Washington Ashkenazi Study, but ignored the dependency
among different pairs within the same family. Their model can be extended
by replacing the shared gamma frailty model with the symmetric correlated
gamma frailty model (Wienke et al. 2003a)
\[
S(t_1, t_2) = S(t_1)^{1 - \rho} S(t_2)^{1 - \rho} \left( S(t_1)^{-\sigma^2} + S(t_2)^{-\sigma^2} - 1 \right)^{-\frac{\rho}{\sigma^2}}.
\]
In the following analysis a parametric model with Gompertz baseline hazard
was used, that is,
\[
S(t) = S(0, t) = S(t, 0) = \left( 1 + \sigma^2 \frac{\lambda}{\phi} \left( e^{\phi t} - 1 \right) \right)^{-\frac{1}{\sigma^2}}.
\]
Example 5.9
The model was applied to the breast cancer incidence data of MZ and DZ
Swedish female twin pairs from the old cohort of the Swedish Twin Registry.
The results are given in Table 5.14.
We consider two different cases of cure models. In the first case, it is assumed
that the susceptibility statuses of the individuals in a pair are independent, that
is, $P(Y_1 = p_1, Y_2 = p_2) = P(Y_1 = p_1) P(Y_2 = p_2)$ with $p_1, p_2 \in \{0, 1\}$.
The size of the cure fraction is uniquely described by the univariate probability
$\varphi = P(Y_1 = 1) = P(Y_2 = 1)$, which results in $\varphi_{10} = \varphi_{01} = \varphi(1 - \varphi)$,
$\varphi_{11} = \varphi^2$, and $\varphi_{00} = (1 - \varphi)^2$. In the second case, which is an extension
of the first one, the restriction of independence between the susceptibility
status of the partners in a pair is relaxed and replaced by the weaker
constraints $\varphi_{10} = \varphi_{01}$, $\varphi_{11} + \varphi_{10} + \varphi_{01} + \varphi_{00} = 1$. When comparing the
likelihoods, it turns out that the cure model with an independent susceptibility
status of the partners shows a significantly better fit than the model
without a cure fraction. The more complicated cure model without assuming
independence between the susceptibility status of the twin partners shows no
significant improvement compared to the cure model assuming independence.
Interestingly, the estimate of the size of a susceptible fraction (due to breast
cancer) with φ = 0.223 (0.046) is close to the estimate φ = 0.22 (0.0093) in the
shared gamma frailty model found by Chatterjee and Shih (2001) in a study
population that is completely different from the one used here. Nevertheless,
the estimates of the susceptible fraction in both models in Table 5.14 are
in the range of the results obtained by Farewell et al. (1977) for different
combinations of four risk factors. If none of the risk factors is present, the
susceptible fraction is around 0.015; if all risk factors are present, the estimate
increases to 0.272.
A simulation study was performed to check the properties of the estimates
in the proposed gamma frailty model. All simulations involve generating
gamma-distributed frailties, bivariate lifetimes, censoring times, as well as
the inclusion of a cured fraction in the study population. A total of 5000
twin pairs are simulated in each data set. Samples are generated to mimic
the structure of the data analyzed in Table 5.14:
The data sets are simulated assuming independence between the susceptibility
status of the partners (second column in Table 5.14), but in the estimation
procedure, the more general model with dependent susceptibility status was
applied (third column in Table 5.14). A total of 1000 data sets were simulated.
The mean of the parameter estimates and their standard errors are presented
in Table 5.15, in comparison with the true values. There appears to be only
moderate bias in the parameter estimates, and the overall performance is very
good. More information about this study is given in Wienke et al. (2003a).
with $Z = (Z_{11}, Z_{12}, Z_{21}, Z_{22}, \ldots, Z_{n1}, Z_{n2})$ and $\lambda$, $\phi$ being the parameters of
the Gompertz baseline hazard function.
By definition of the model, the vector of frailties $(Z_{i1}, Z_{i2})$ $(i = 1, \ldots, n)$
in each pair is assumed to follow a bivariate log-normal distribution, with
variances $\sigma_1^2 = \sigma_2^2 = \sigma^2$ and correlation coefficient $\rho$. The parameters of the
baseline hazard, regression coefficients $\beta$, $\sigma^2$, and $\rho$ (the latter two being the
hyperparameters) are assumed to follow noninformative prior distributions. We
adopt uniform priors over the intervals [1e-7, 0.005], [0.05, 0.15], and [-1, 1]
for $\lambda$, $\phi$, and $\rho$, respectively. Furthermore, log-normal priors with mean 0.5
and variance 0.25 for $\sigma^2$ and multivariate normal priors for $\beta$ are used. More
details about the MCMC approach can be found in Section 5.4.
We estimated Model 1 following a maximization procedure, and Models 2
and 3 using a numerical integration procedure (Gauss–Hermite quadrature).
MCMC methods are employed in all three models. We generated data sets
with different frailty distributions. First, we used $\sigma^2 = 1$ and $\rho = 0.7$. Second,
we used parameters $\sigma^2 = 0.3$ and $\rho = 0.2$. In both cases, $\lambda = 0.003$, $\phi = 0.07$,
$\beta' = (\beta_1, \beta_2)$, $\beta_1 = 0.1$, and $\beta_2 = -0.2$. Two types of covariates were used.
One covariate was generated as a binary variable
$$X_{ij1} = \begin{cases} 1 & \text{if } i \le n/2 \\ 0 & \text{if } i > n/2, \end{cases}$$
and the other as a normally distributed variable $X_{ij2} \sim N(0, 1)$. Consequently,
the first covariate is pair-specific, whereas the second covariate is individual
specific. We used sample sizes of 500 and 5000 pairs and simulated 500 data
sets in each case (50 data sets only for Bayesian methods because of time
constraints). Only the case of complete event times (meaning no censoring)
was considered here. Results are shown in Tables 5.16 – 5.18.
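The data-generating step of such a simulation can be sketched as follows. This is only a minimal illustration under stated assumptions: the frailties are taken as $Z_{ij} = \exp(W_{ij})$ with $(W_{i1}, W_{i2})$ bivariate normal with mean zero, variance $\sigma^2$, and correlation $\rho$ (the exact parameterization of Models 1 to 3 is not restated here), the frailty and covariates act multiplicatively on a Gompertz baseline hazard $\lambda e^{\phi t}$, and lifetimes are drawn by inverting the cumulative hazard. The function name and the tuple layout are hypothetical.

```python
import math
import random


def simulate_pairs(n_pairs, lam=0.003, phi=0.07, sigma2=1.0, rho=0.7,
                   beta=(0.1, -0.2), seed=1):
    """Simulate uncensored lifetimes of pairs with correlated log-normal frailty.

    Returns a list of tuples (pair, member, x1, x2, time).
    """
    rng = random.Random(seed)
    sigma = math.sqrt(sigma2)
    data = []
    for i in range(1, n_pairs + 1):
        # Correlated normal log-frailties (assumed parameterization).
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        w = (sigma * z1,
             sigma * (rho * z1 + math.sqrt(1.0 - rho * rho) * z2))
        # Pair-specific binary covariate: 1 for the first half of the pairs.
        x1 = 1.0 if i <= n_pairs / 2 else 0.0
        for j in range(2):
            # Individual-specific standard normal covariate.
            x2 = rng.gauss(0.0, 1.0)
            risk = math.exp(w[j] + beta[0] * x1 + beta[1] * x2)
            e = rng.expovariate(1.0)
            # Solve risk * (lam / phi) * (exp(phi * t) - 1) = e for t.
            t = math.log(1.0 + phi * e / (lam * risk)) / phi
            data.append((i, j + 1, x1, x2, t))
    return data


sample = simulate_pairs(500)
```

Censoring is deliberately left out, in line with the complete-data setting considered in the text.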
Table 5.16: Model 1 with two covariates, 500 simulated data sets (Bayes
50 data sets)

method  sample size  λ          ϕ        σ²       ρ        β1       β2
true                 3.00e-3    0.070    0.300    0.200    0.100    -0.200
ML      500          3.01e-3    0.070    0.294    0.222    0.105    -0.198
                     (3.70e-4)  (0.004)  (0.087)  (0.194)  (0.082)  (0.042)
ML      5000         3.01e-3    0.070    0.299    0.197    0.099    -0.200
                     (1.25e-4)  (0.001)  (0.027)  (0.069)  (0.026)  (0.013)
Bayes   5000         3.04e-3    0.070    0.292    0.208    0.093    -0.196
                     (1.16e-4)  (0.001)  (0.024)  (0.059)  (0.027)  (0.013)
true                 3.00e-3    0.070    1.000    0.700    0.100    -0.200
ML      500          2.99e-3    0.070    1.001    0.699    0.107    -0.202
                     (4.03e-4)  (0.005)  (0.141)  (0.080)  (0.105)  (0.054)
ML      5000         3.01e-3    0.070    1.000    0.700    0.098    -0.199
                     (1.34e-4)  (0.002)  (0.044)  (0.023)  (0.034)  (0.017)

ML - maximum likelihood estimation
Bayes - MCMC estimation
The three models show the same pattern. As expected, the estimates for
the larger sample size are far more accurate. The most striking effect is the
strong negative correlation between the estimates of $\rho$ and $\sigma^2$, independent of
the model and the estimation procedure (see Table 5.19). Similar simulations
were performed in the more general model with two different frailty variances
(one for each of the two individuals). The results are similar to those with
one frailty variance and are therefore omitted here.
As Bayesian methods are very time consuming, only 50 data sets with 5000
pairs each were generated for this analysis method. We ran two parallel chains
from different starting points and treated the first 4000 iterations of each
chain as "burn-in".
The simulated values of the random-effect parameters have autocorrelations
close to unity in our case, so the Markov chain converges very slowly.
Altogether, between 10000 and 60000 iterations per chain were generated after
the "burn-in" phase for each data set. The values of the Gelman–Rubin
statistic are quite close to one (see Section 5.4), indicating convergence of
the chains.
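For orientation, the convergence check can be sketched in a few lines. The function below computes the basic Gelman–Rubin potential scale reduction factor from several chains of equal length (after burn-in); it is a textbook form given here as an assumption, and the variant described in Section 5.4 may include additional corrections.

```python
import math


def gelman_rubin(chains):
    """Basic potential scale reduction factor for m chains of length n."""
    m = len(chains)
    n = len(chains[0])
    if m < 2 or n < 2 or any(len(c) != n for c in chains):
        raise ValueError("need at least two chains of equal length >= 2")
    means = [sum(c) / n for c in chains]
    grand_mean = sum(means) / m
    # Between-chain variance B and mean within-chain variance W.
    between = n / (m - 1) * sum((mu - grand_mean) ** 2 for mu in means)
    within = sum(
        sum((x - mu) ** 2 for x in c) / (n - 1)
        for c, mu in zip(chains, means)
    ) / m
    # Pooled estimate of the posterior variance.
    pooled = (n - 1) / n * within + between / n
    return math.sqrt(pooled / within)


r_similar = gelman_rubin([[1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 4.0]])
r_apart = gelman_rubin([[0.0, 1.0, 2.0, 3.0], [10.0, 11.0, 12.0, 13.0]])
```

Values close to one indicate that the chains have mixed; chains stuck in different regions produce values well above one.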
Table 5.17: Model 2 with two covariates, 500 simulated data sets (Bayes
50 data sets)

method  sample size  λ          ϕ        σ²       ρ        β1       β2
true                 3.00e-3    0.070    0.300    0.200    0.100    -0.200
ML*     500          3.02e-3    0.071    0.372    0.237    0.097    -0.203
                     (8.23e-4)  (0.008)  (0.378)  (0.352)  (0.080)  (0.045)
ML*     5000         2.99e-3    0.070    0.308    0.204    0.099    -0.200
                     (2.04e-4)  (0.002)  (0.077)  (0.083)  (0.025)  (0.013)
Bayes   5000         3.04e-3    0.070    0.300    0.218    0.095    -0.199
                     (2.01e-4)  (0.002)  (0.067)  (0.089)  (0.022)  (0.012)
true                 3.00e-3    0.070    1.000    0.700    0.100    -0.200
ML*     500          2.81e-3    0.075    1.283    0.689    0.113    -0.211
                     (1.06e-3)  (0.014)  (0.731)  (0.186)  (0.118)  (0.059)
ML*     5000         3.07e-3    0.069    0.977    0.720    0.098    -0.199
                     (3.44e-4)  (0.004)  (0.173)  (0.072)  (0.034)  (0.017)
Bayes   5000         3.12e-3    0.069    0.981    0.726    0.091    -0.198
                     (3.93e-4)  (0.004)  (0.193)  (0.074)  (0.031)  (0.015)

ML* - maximum likelihood estimation with numerical integration
Bayes - MCMC estimation
Table 5.18: Model 3 with two covariates, 500 simulated data sets (Bayes
50 data sets)

method  sample size  λ          ϕ        σ²       ρ        β1       β2
true                 3.00e-3    0.070    0.300    0.200    0.100    -0.200
ML*     500          3.01e-3    0.071    0.361    0.243    0.095    -0.204
                     (4.22e-4)  (0.007)  (0.297)  (0.358)  (0.078)  (0.043)
ML*     5000         2.99e-3    0.070    0.308    0.204    0.099    -0.200
                     (1.38e-4)  (0.002)  (0.075)  (0.082)  (0.025)  (0.013)
Bayes   5000         3.03e-3    0.070    0.302    0.218    0.095    -0.199
                     (1.36e-4)  (0.002)  (0.070)  (0.092)  (0.022)  (0.012)
true                 3.00e-3    0.070    1.000    0.700    0.100    -0.200
ML*     500          3.00e-3    0.075    1.323    0.683    0.107    -0.212
                     (4.35e-4)  (0.015)  (0.998)  (0.160)  (0.117)  (0.064)
ML*     5000         3.00e-3    0.070    1.022    0.701    0.099    -0.201
                     (1.46e-4)  (0.004)  (0.179)  (0.067)  (0.034)  (0.018)
Bayes   5000         3.02e-3    0.070    1.000    0.713    0.102    -0.199
                     (1.30e-4)  (0.003)  (0.134)  (0.058)  (0.034)  (0.015)

ML* - maximum likelihood estimation with numerical integration
The results of the simulation study are very clear. The observed effect is
stable over different frailty distributions and different estimation strategies.
Moreover, different choices of parameters and sample sizes did not change the
correlation.
The present study is limited to parametric correlated frailty models. An
open question remains whether the observed negative correlation between
the parameter estimates is also present in semiparametric correlated frailty
models. This is a topic for future research.
A high correlation of parameter estimates could be a sign of identifiability
problems in the model. Correlated frailty models were investigated in order
to overcome the problems of the shared frailty models, which provide only
one parameter to model variance and correlation. One idea was to include
observed covariates in the models to improve identifiability.
This is why all models were run both with and without observed covariates.
The results in both cases are very similar. Consequently, we dropped results
for models without observed covariates. Two types of covariates were used,
one dichotomous and one continuous. However, no effect of the covariates on
the correlation between the parameter estimates was detected.
Regarding identifiability, note that heterogeneity (variance of frailty) and
correlation between frailties (implying dependence of lifetimes) are correlated
in a frailty model based on conditional independence. To see this, assume
that the variance of the frailty tends to zero. Obviously, this implies zero
Table 5.19: Models 1–3 with two covariates, 500 simulated
data sets (Bayes 50 data sets)
model method sample size corr(ρ, σ 2 ) parameters
$$T_1^* \sim \mu_1(t_1 | Z_1) = Z_1 \mu_{01}(t_1), \qquad T_2^* \sim \mu_1(t_2 | Z_3) = Z_3 \mu_{01}(t_2),$$
$$Y_1 \sim \mu_2(y_1 | Z_2) = Z_2 \mu_{02}(y_1), \qquad Y_2 \sim \mu_2(y_2 | Z_4) = Z_4 \mu_{02}(y_2), \tag{5.24}$$
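[Figure: dependence structure of the frailties. The second-cause frailties $Z_2$ and $Z_4$ are built from the components $V_1, V_4, V_5$ and $V_3, V_7, V_8$, respectively; the correlations $\rho$ and $\rho_2$ are marked between the frailties.]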
$Z_1$, $Z_3$ denote frailties with respect to the main cause of death (the cause
under study) and $Z_2$, $Z_4$ are the frailties with respect to the second cause of
death. The parameters $\rho_1$, $\rho_2$, and $\rho$ describe correlations between the frailties:
$\rho_1 = \mathrm{corr}(Z_1, Z_3)$, $\rho_2 = \mathrm{corr}(Z_2, Z_4)$, and $\rho = \mathrm{corr}(Z_1, Z_2) = \mathrm{corr}(Z_3, Z_4)$.
The four-dimensional survival function is obtained by integrating out the
conditional lifetimes with respect to the frailty distribution by using (5.24)
and applying the Laplace transform of gamma-distributed random variables
(see Appendix A.5):
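$$\begin{aligned}
S(t_1, y_1, t_2, y_2) &= E\, S_1(t_1)^{Z_1} S_2(y_1)^{Z_2} S_1(t_2)^{Z_3} S_2(y_2)^{Z_4} \\
&= \left(S_1(t_1)^{-\sigma_1^2} + S_1(t_2)^{-\sigma_1^2} - 1\right)^{-\rho_1/\sigma_1^2} \left(S_2(y_1)^{-\sigma_2^2} + S_2(y_2)^{-\sigma_2^2} - 1\right)^{-\rho_2/\sigma_2^2} \\
&\quad \times \left(S_1(t_1)^{-\sigma_1^2} + S_2(y_1)^{-\sigma_2^2} - 1\right)^{-\rho/(\sigma_1\sigma_2)} \left(S_1(t_2)^{-\sigma_1^2} + S_2(y_2)^{-\sigma_2^2} - 1\right)^{-\rho/(\sigma_1\sigma_2)} \\
&\quad \times S_1(t_1)^{1-\rho_1-\frac{\sigma_1}{\sigma_2}\rho}\, S_1(t_2)^{1-\rho_1-\frac{\sigma_1}{\sigma_2}\rho}\, S_2(y_1)^{1-\rho_2-\frac{\sigma_2}{\sigma_1}\rho}\, S_2(y_2)^{1-\rho_2-\frac{\sigma_2}{\sigma_1}\rho}
\end{aligned} \tag{5.25}$$
with the constraint $0 \le \rho \le \min\{\frac{\sigma_2}{\sigma_1}(1-\rho_1), \frac{\sigma_1}{\sigma_2}(1-\rho_2)\}$. Parameter $\rho_1$ denotes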
the correlation between $Z_1$ and $Z_3$. This parameter measures the correlation
between frailties of partners in a pair with respect to the cause of death
under investigation and is important for genetic analysis of susceptibility to
cause-specific mortality. Parameter $\rho_2$ models the correlation between frailties
regarding all other causes of death (combined into the second cause of death or,
more generally, informative censoring). Parameter $\rho$ describes the association
between the unobservable cause-specific lifetimes in each individual. This
parameter makes it possible to test the hypothesis of dependence between competing
risks. $S_1$ and $S_2$ denote the marginal survival functions regarding the first
and second cause of death. The likelihood function of the model can be found
in Appendix A.5.
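To make the structure of (5.25) concrete, the following sketch evaluates the four-dimensional survival function from given values of the marginal survival functions. The function and argument names are illustrative; the upper bound on $\rho$ is implemented as the condition that keeps all exponents of the marginal survival functions nonnegative.

```python
def survival_4d(s1_t1, s2_y1, s1_t2, s2_y2, var1, var2, rho1, rho2, rho):
    """Evaluate the four-dimensional survival function (5.25).

    s1_* and s2_* are values of the marginal survival functions S1 and S2,
    var1 and var2 are the frailty variances sigma_1^2 and sigma_2^2.
    """
    sd1, sd2 = var1 ** 0.5, var2 ** 0.5
    upper = min(sd2 / sd1 * (1.0 - rho1), sd1 / sd2 * (1.0 - rho2))
    if not 0.0 <= rho <= upper:
        raise ValueError("rho violates the constraint of the model")
    a1, a2 = s1_t1 ** -var1, s1_t2 ** -var1   # S1(.)^(-sigma_1^2)
    b1, b2 = s2_y1 ** -var2, s2_y2 ** -var2   # S2(.)^(-sigma_2^2)
    # Dependence between partners for the same cause of death.
    value = (a1 + a2 - 1.0) ** (-rho1 / var1)
    value *= (b1 + b2 - 1.0) ** (-rho2 / var2)
    # Dependence between competing risks within each individual.
    value *= (a1 + b1 - 1.0) ** (-rho / (sd1 * sd2))
    value *= (a2 + b2 - 1.0) ** (-rho / (sd1 * sd2))
    # Remaining powers of the marginal survival functions.
    value *= (s1_t1 * s1_t2) ** (1.0 - rho1 - sd1 / sd2 * rho)
    value *= (s2_y1 * s2_y2) ** (1.0 - rho2 - sd2 / sd1 * rho)
    return value


independent = survival_4d(0.9, 0.8, 0.7, 0.6, 0.5, 0.3, 0.0, 0.0, 0.0)
marginal = survival_4d(0.9, 1.0, 1.0, 1.0, 0.5, 0.3, 0.4, 0.3, 0.2)
```

Two sanity checks follow directly from the formula: with all correlations equal to zero the function reduces to the product of the four marginals, and setting three arguments to one recovers the remaining marginal survival function.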
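$$\times\, S_2(y_1)^{1-\rho_2} S_2(y_2)^{1-\rho_2} \left(S_2(y_1)^{-\sigma_2^2} + S_2(y_2)^{-\sigma_2^2} - 1\right)^{-\rho_2/\sigma_2^2}.$$
If $\rho_1 = 0$ and $\rho_2 = 0$ hold (unrelated individuals), the model simplifies to
$$\begin{aligned}
S(t_1, y_1, t_2, y_2) &= S_1(t_1)^{1-\frac{\sigma_1}{\sigma_2}\rho}\, S_2(y_1)^{1-\frac{\sigma_2}{\sigma_1}\rho} \left(S_1(t_1)^{-\sigma_1^2} + S_2(y_1)^{-\sigma_2^2} - 1\right)^{-\rho/(\sigma_1\sigma_2)} \\
&\quad \times S_1(t_2)^{1-\frac{\sigma_1}{\sigma_2}\rho}\, S_2(y_2)^{1-\frac{\sigma_2}{\sigma_1}\rho} \left(S_1(t_2)^{-\sigma_1^2} + S_2(y_2)^{-\sigma_2^2} - 1\right)^{-\rho/(\sigma_1\sigma_2)}.
\end{aligned}$$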
Using cause-specific mortality data of relatives (for example, twins), it is
possible to solve nonidentifiability problems in univariate censored lifetimes
as investigated by Tsiatis (1975). The model allows for dependencies between
competing risks and makes it possible to test for them.
The consistency and asymptotic normality of the estimators in this model
have not yet been proved, but simulation results (not shown here) indicate the
asymptotic validity of the proposed model.
The correlation coefficients between the frailties are always nonnegative,
which is clearly a limitation of the proposed model. This restriction poses no
problem when analyzing the lifetimes of relatives, where a positive association
between lifetimes seems reasonable. However, it is not clear that the same
holds for the competing risks in an individual. On the one hand, many major
diseases have risk factors in common, and consequently, the presence of any
one of these risk factors will increase the risk of death with respect to all
diseases. On the other hand, everyone dies eventually, so it is only logical
that if the risk of death from one cause is decreased, then the risk from
another cause is increased. Furthermore, the parameter $\rho$ is only identifiable
in a "real" multivariate case (cluster size larger than two). Having pairs
of unrelated individuals (i.e., $\rho_1 = \rho_2 = 0$), implying the univariate case,
makes the parameter $\rho$ nonidentifiable. The nature of dependencies among
competing risks deserves further study.
A similar model for current status data was established in Giard (2001) and
Giard et al. (2002). A more general approach compared to model (5.25)
was investigated by Bandeen-Roche and Liang (1996). The difference in the
models is in the observed data. In the model just shown only the minimum
of two competing lifetimes in each individual is observed, whereas Bandeen-
Roche and Liang (1996) assume that all lifetimes are observable. Another
four-dimensional correlated gamma frailty model is proposed by Jonker and
Boomsma (2010).
Huang and Wolfe (2002) based their model on log-normal frailty and (4.17),
which means a model somewhere between shared and correlated frailty models.
In a recent paper, Huang et al. (2004) suggested a test procedure to test the
hypothesis of dependence between survival and censoring times in their model.
Chapter 6
Copula Models
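$$S(t_1 | T_2 > t_2) = \frac{S(t_1, t_2)}{S_2(t_2)}$$
and that of $T_1$ given $T_2 = t_2$ is
$$S(t_1 | T_2 = t_2) = \frac{\partial S(t_1, t_2) / \partial t_2}{\partial S_2(t_2) / \partial t_2}.$$
The respective conditional hazard functions are important for the following
considerations. Using the relation $\mu(t) = -\frac{S'(t)}{S(t)}$ implies
$$\mu(t_1 | T_2 > t_2) = -\frac{\partial}{\partial t_1} \ln S(t_1, t_2) \tag{6.1}$$
and
$$\mu(t_1 | T_2 = t_2) = -\frac{\partial}{\partial t_1} \ln\left(-\frac{\partial}{\partial t_2} S(t_1, t_2)\right). \tag{6.2}$$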
These hazards describe the risk of failure at age t1 for the first individual,
given the information about the event status of the second individual in the
pair. The first hazard (6.1) uses the condition {T2 > t2 }, and the second one
(6.2) is conditional on {T2 = t2 }. The deviation of the ratio of these hazards
from one was used by Oakes (1989) as a measure of mutual dependence of
the respective marginal lifetimes. The survival function of the shared gamma
frailty model can now be obtained from the following relation between the
above hazards
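$$\mu(t_1 | T_2 = t_2) = (1 + \sigma^2)\, \mu(t_1 | T_2 > t_2). \tag{6.3}$$
Inserting (6.1) and (6.2) into (6.3) and integrating with respect to $t_1$, using $S(0, t_2) = S_2(t_2)$, yields
$$\ln\left(-\frac{\partial}{\partial t_2} S(t_1, t_2)\right) - \ln\left(-\frac{\partial}{\partial t_2} S_2(t_2)\right) = (1 + \sigma^2)\left(\ln S(t_1, t_2) - \ln S_2(t_2)\right),$$
$$\ln\left(-\frac{\partial}{\partial t_2} S(t_1, t_2)\right) - \ln\left(S(t_1, t_2)^{1+\sigma^2}\right) = \ln\left(-\frac{\partial}{\partial t_2} S_2(t_2)\right) - \ln\left(S_2(t_2)^{1+\sigma^2}\right),$$
$$\frac{\frac{\partial}{\partial t_2} S(t_1, t_2)}{S(t_1, t_2)^{1+\sigma^2}} = \frac{\frac{\partial}{\partial t_2} S_2(t_2)}{S_2(t_2)^{1+\sigma^2}}.$$
Integrating both sides with respect to the second argument gives
$$\int_0^{t_2} \frac{\frac{\partial}{\partial t} S(t_1, t)}{S(t_1, t)^{1+\sigma^2}}\, dt = \int_0^{t_2} \frac{\frac{\partial}{\partial t} S_2(t)}{S_2(t)^{1+\sigma^2}}\, dt,$$
$$S(t_1, t_2)^{-\sigma^2} - S_1(t_1)^{-\sigma^2} = S_2(t_2)^{-\sigma^2} - 1,$$
$$S(t_1, t_2) = \left(S_1(t_1)^{-\sigma^2} + S_2(t_2)^{-\sigma^2} - 1\right)^{-1/\sigma^2}.$$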
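The link between the final survival function and the proportionality of the two conditional hazards can be checked numerically. The sketch below evaluates (6.1) and (6.2) by central finite differences for the bivariate survival function just derived and returns their ratio, which should equal $1 + \sigma^2$ at every point. The exponential margins and their rates are arbitrary assumptions made only for this illustration.

```python
import math


def clayton_survival(t1, t2, var, rate1=0.05, rate2=0.08):
    """S(t1, t2) of the shared gamma frailty form with exponential margins."""
    s1 = math.exp(-rate1 * t1)
    s2 = math.exp(-rate2 * t2)
    return (s1 ** -var + s2 ** -var - 1.0) ** (-1.0 / var)


def hazard_ratio(t1, t2, var, h=1e-4):
    """Ratio mu(t1 | T2 = t2) / mu(t1 | T2 > t2) by finite differences."""

    def d_dt2(a, b):
        # Central difference approximation of dS/dt2 at (a, b).
        return (clayton_survival(a, b + h, var)
                - clayton_survival(a, b - h, var)) / (2.0 * h)

    # (6.1): minus the t1-derivative of ln S(t1, t2).
    haz_later = -(math.log(clayton_survival(t1 + h, t2, var))
                  - math.log(clayton_survival(t1 - h, t2, var))) / (2.0 * h)
    # (6.2): minus the t1-derivative of ln(-dS/dt2).
    haz_equal = -(math.log(-d_dt2(t1 + h, t2))
                  - math.log(-d_dt2(t1 - h, t2))) / (2.0 * h)
    return haz_equal / haz_later


ratio = hazard_ratio(10.0, 15.0, 0.5)
```

Repeating the call at other time points or with other margins leaves the ratio unchanged, which is exactly the proportionality assumption behind the derivation.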
Thus, there are two ways of deriving formula (4.3) based on two radically
different concepts: one uses the assumption (6.3), concerning proportionality
of conditional hazards; the other uses the concept of gamma-distributed shared
frailty. In the literature it has often been claimed that particular copulas can
be deduced from frailty models by choosing the appropriate frailty distribution.
This happens because the survival functions are similar in both approaches,
at least at first sight. It is necessary to note that there exists one important
difference between the survival functions resulting from both approaches. In
the shared gamma frailty model, transformation (3.17) is used to generate
the bivariate survival function. Consequently, the marginal survival functions
$S_1$ and $S_2$ include the frailty parameter $\sigma^2$, which is not the case in the
approach using proportional conditional hazards (6.3). The latter model is
a copula model. Consequently, it is necessary to make assumptions about
the data generation mechanism. This is important for the interpretation of
the data analysis; for more details see Goethals et al. (2008). The bivariate
representation (6.3) was extended to the multivariate setting by Guo and
Rodriguez (1992) and Guo (1993).
Fitting copula models is often performed by using a two-stage procedure
(Shih and Louis 1995a, Glidden 2000, Andersen 2005). In the first stage the
marginal survival functions are estimated without taking into account the
clustering of the data. This can be done in a parametric or nonparametric
way. A semiparametric approach is also possible by assuming a Cox model and
therefore the inclusion of covariates (Glidden 2000, Andersen 2005). It turns
out that these estimates are consistent (Spiekerman and Lin 1998) and can be
used in the second step to estimate the copula parameters. Alternatively, the
likelihood can also be maximized in a one-step procedure. In the next section
we consider the copula of the correlated gamma frailty model to illustrate the
difference between the copula and the frailty approach.
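The two-stage idea can be sketched for the simplest setting. The code below is only an illustration under several assumptions: data are uncensored, so the first stage uses the rescaled empirical survival function in place of a Kaplan–Meier estimate; the copula is the Clayton copula with parameter $\theta$ (playing the role of $\sigma^2$); and the second stage maximizes the pseudo-likelihood by a plain grid search. All function names are hypothetical.

```python
import math
import random


def simulate_clayton_pairs(n, theta, seed=0):
    """Pairs of lifetimes sharing a gamma frailty (mean 1, variance theta)."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        z = rng.gammavariate(1.0 / theta, theta)  # shape 1/theta, scale theta
        # Given z, both lifetimes are exponential with hazard z.
        pairs.append((rng.expovariate(1.0) / z, rng.expovariate(1.0) / z))
    return pairs


def empirical_survival(values):
    """Stage 1: nonparametric marginal survival, rescaled to avoid 0 and 1."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    surv = [0.0] * n
    for rank, i in enumerate(order, start=1):
        surv[i] = 1.0 - rank / (n + 1.0)
    return surv


def clayton_log_density(u, v, theta):
    """Log density of the Clayton copula at (u, v)."""
    return (math.log(1.0 + theta)
            - (theta + 1.0) * (math.log(u) + math.log(v))
            - (2.0 + 1.0 / theta) * math.log(u ** -theta + v ** -theta - 1.0))


def two_stage_estimate(pairs, grid):
    """Stage 2: maximize the pseudo-likelihood over a grid of theta values."""
    u = empirical_survival([p[0] for p in pairs])
    v = empirical_survival([p[1] for p in pairs])

    def loglik(theta):
        return sum(clayton_log_density(a, b, theta) for a, b in zip(u, v))

    return max(grid, key=loglik)


pairs = simulate_clayton_pairs(2000, 2.0, seed=42)
theta_hat = two_stage_estimate(pairs, [0.1 * k for k in range(1, 61)])
```

Because the margins are estimated without reference to the dependence parameter, the second stage involves only a one-dimensional optimization, which is what makes the procedure attractive in practice.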
with respect to their density $f(z_1, z_2)$. This integral can be solved analytically,
and the joint survival function of the correlated gamma frailty model is now
(see (5.6))
$$S(t_1, t_2) = \frac{S_1(t_1)^{1-\frac{\sigma_1}{\sigma_2}\rho}\, S_2(t_2)^{1-\frac{\sigma_2}{\sigma_1}\rho}}{\left(S_1(t_1)^{-\sigma_1^2} + S_2(t_2)^{-\sigma_2^2} - 1\right)^{\rho/(\sigma_1\sigma_2)}}. \tag{6.4}$$
The joint survival function in the copula model looks exactly the same. To
understand the difference between the correlated gamma frailty model and the
related copula, it is necessary to keep in mind how formula (6.4) was derived.
The formula is obtained by substituting the expressions $(1 + \sigma_j^2 M_{0j}(t_j))^{-1/\sigma_j^2}$
$(j = 1, 2)$ by the marginal survival functions $S_j(t_j)$ in (5.5) using (3.17).
Consequently, in the frailty model, the survival functions $S_j(t_j)$ depend on
the variance of the frailties $\sigma_j^2$. This is often overlooked when dealing with
representation (6.4). In the copula approach the derivation in (5.5) is no
longer of interest; formula (5.6) is just a bivariate survival function without
any frailty interpretation. It should be noted that the copula in (6.4) is not an
Archimedean copula, in contrast to the Clayton copula, which is obtained as a
special case with $\rho = 1$. Here, the marginal survival functions are completely
unrestricted and can follow any distribution. That is the reason why a two-
step procedure can be applied by estimating the marginal survival functions
in the first step. The assumption that the marginal survival functions depend
on the frailty variances is no longer used. As a consequence, in frailty models,
such a two-step estimation procedure is impossible. In the frailty model the
choice of the frailty distribution determines the functional form of the copula
as well as the functional form of the marginal distributions, which is different
compared to the copula model. It is important to understand the difference
between the two approaches because the results and their interpretation will
not be the same. We illustrate this problem in the following example by
reanalyzing the data of male Danish twins in Example 5.1 with respect to
cancer.
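Viewed purely as a copula, representation (6.4) is a function of two marginal survival probabilities. The sketch below implements it in that form and compares it with the Clayton copula; the function names are illustrative, and the admissible range of $\rho$ is taken to be the one that keeps both exponents in the numerator nonnegative.

```python
def correlated_gamma_copula(s1, s2, var1, var2, rho):
    """Copula form of (6.4), applied to marginal survival values s1 and s2."""
    sd1, sd2 = var1 ** 0.5, var2 ** 0.5
    if not 0.0 <= rho <= min(sd1 / sd2, sd2 / sd1):
        raise ValueError("rho outside the admissible range")
    numerator = (s1 ** (1.0 - sd1 / sd2 * rho)
                 * s2 ** (1.0 - sd2 / sd1 * rho))
    denominator = (s1 ** -var1 + s2 ** -var2 - 1.0) ** (rho / (sd1 * sd2))
    return numerator / denominator


def clayton_copula(s1, s2, var):
    """Clayton copula, the survival function of the shared gamma frailty model."""
    return (s1 ** -var + s2 ** -var - 1.0) ** (-1.0 / var)


# Any marginal survival values can be plugged in; no frailty model is needed.
value = correlated_gamma_copula(0.8, 0.6, 0.5, 0.5, 0.4)
```

With equal variance parameters and $\rho = 1$ the function coincides with the Clayton copula, and with $\rho = 0$ it reduces to the product of the margins.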
Example 6.1
Because of the symmetry in the twin data, the restrictions $S(t) = S_1(t) = S_2(t)$
and $\sigma^2 = \sigma_1^2 = \sigma_2^2$ are imposed. A parametric approach with a Gompertz
baseline hazard function is considered. In the case of the gamma frailty models
this results in
$$S(t) = \left(1 + \sigma^2 \frac{\lambda}{\phi}\left(e^{\phi t} - 1\right)\right)^{-1/\sigma^2}.$$
In the copula approach, no restrictions on the form of the marginal survival
functions exist. This can, for example, be incorporated into the model by the
specification
$$S(t) = \left(1 + s^2 \frac{\lambda}{\phi}\left(e^{\phi t} - 1\right)\right)^{-1/s^2},$$
where $s^2$ denotes a new parameter of the marginal distribution. In some
sense, $s^2$ is interpretable as the variance of a univariate gamma-distributed
frailty. The correlated gamma frailty model is characterized by the condition
$\sigma^2 = s^2$. Consequently, using this parameterization, the frailty model is a
The difference between the models in the first two columns is obvious. In the
copula model, $\sigma^2$ no longer has an interpretation as a frailty variance, and
$\rho$ is no longer interpretable as a correlation between frailties. Both $\rho$ and
$\sigma^2$ are parameters of the copula without any relation to random effects. In
contrast, the parameter $s^2$ has an interpretation as the variance of the univariate
frailty in the marginal distributions. Consequently, the foregoing approach can
be used to test whether the frailty model fits the data. If both parameters
$\sigma^2$ and $s^2$ are similar, the frailty model provides a good fit. If there is a
significant difference between the two parameters, this contradicts the frailty
model and speaks in favor of the less restrictive copula model. In the situation
considered here, the likelihood ratio test indicates no significant difference
between the models, preferring the simpler frailty model. This fact is also
supported by the large standard error of $\sigma^2$ in the copula model. Interestingly,
semiparametric analysis (last column) reveals similar results compared to the
parametric copula model. This speaks in favor of the hypothesis that the
parametric analysis is not sensitive to the choice of the hazard function. In
the semiparametric copula model, a one-step estimation procedure is used,
where the Kaplan–Meier estimator is plugged into the likelihood function
to substitute the unknown marginal survival functions. Unfortunately, this
approach is not possible in the frailty model, where the more demanding EM
algorithm is needed to analyze the semiparametric model. The reason for
this difference is the already mentioned dependence of the marginal survival
functions on the frailty variance in the frailty model, which is not present in
the copula approach.
$$= \left(S_1(t_1) S_2(t_2)\right)^{1-\rho} S(t_1, t_2)^{\rho} \tag{6.6}$$
THEOREM 6.1
(Yashin and Iachine 1999a) Let $S(t_1, t_2)$ be the joint survival function of $T_1$
and $T_2$ given by formula (6.5) with marginals $S_1(t_1)$ and $S_2(t_2)$. Furthermore,
let $A(t_1, t_2) = \int_0^{t_1} \int_0^{t_2} \varphi(u, v)\, du\, dv$, with $\varphi(u, v) \ge 0$ for all $u \ge 0$, $v \ge 0$. Then,
for any $0 \le \rho \le 1$,
Proof: It follows from (6.6) that $\tilde{S}(t_1, 0) = S_1(t_1)$, and $\tilde{S}(0, t_2) = S_2(t_2)$, and
that $\tilde{S}(t_1, \infty) = \tilde{S}(\infty, t_2) = \tilde{S}(\infty, \infty) = 0$. Hence, $\tilde{A}(t_1, t_2) = \rho A(t_1, t_2)$. To
complete the proof, it is enough to show that $\tilde{S}_{t_1 t_2}(t_1, t_2) = \frac{\partial^2 \tilde{S}(t_1, t_2)}{\partial t_1 \partial t_2} \ge 0$.
Using relation (6.6) we obtain
$$\tilde{S}_{t_1 t_2}(t_1, t_2) = \tilde{S}(t_1, t_2) \Big[ \big((\rho - 1)\mu_1(t_1) - \rho\mu_1(t_1, t_2)\big) \big((\rho - 1)\mu_2(t_2) - \rho\mu_2(t_1, t_2)\big) + \rho\varphi(t_1, t_2) \Big] \tag{6.8}$$
Consequently, the copula of the correlated gamma frailty model can be derived
from the copula of the shared gamma frailty model without any interpretation
related to frailty. The association function $\varphi(t_1, t_2)$ in this case is of the form
$$\varphi(t_1, t_2) = \frac{\sigma^2 \mu_1(t_1) \mu_2(t_2) S_1(t_1)^{-\sigma^2} S_2(t_2)^{-\sigma^2}}{\left(S_1(t_1)^{-\sigma^2} + S_2(t_2)^{-\sigma^2} - 1\right)^2}. \tag{6.9}$$
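Formula (6.9) can be verified numerically. The sketch below assumes that (6.5) has the form $S(t_1, t_2) = S_1(t_1) S_2(t_2) \exp(A(t_1, t_2))$, which is consistent with the relation $\tilde{A} = \rho A$ used in the proof of Theorem 6.1; under this assumption the association function is the mixed second derivative of $\ln S(t_1, t_2)$. Exponential margins with arbitrary rates are used, so the marginal hazards are constants.

```python
import math

RATE1, RATE2 = 0.05, 0.08  # assumed constant marginal hazards mu_1 and mu_2


def log_survival(t1, t2, var):
    """ln S(t1, t2) for the shared gamma frailty copula, exponential margins."""
    a = math.exp(var * RATE1 * t1)  # S1(t1)^(-sigma^2)
    b = math.exp(var * RATE2 * t2)  # S2(t2)^(-sigma^2)
    return -math.log(a + b - 1.0) / var


def phi_closed_form(t1, t2, var):
    """Association function according to (6.9)."""
    a = math.exp(var * RATE1 * t1)
    b = math.exp(var * RATE2 * t2)
    return var * RATE1 * RATE2 * a * b / (a + b - 1.0) ** 2


def phi_numeric(t1, t2, var, h=1e-3):
    """Mixed second derivative of ln S by central finite differences."""
    f = log_survival
    return (f(t1 + h, t2 + h, var) - f(t1 + h, t2 - h, var)
            - f(t1 - h, t2 + h, var) + f(t1 - h, t2 - h, var)) / (4.0 * h * h)


exact = phi_closed_form(10.0, 15.0, 0.5)
approx = phi_numeric(10.0, 15.0, 0.5)
```

The two values agree up to the discretization error of the finite difference scheme.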
$$\tilde{S}(t_1, t_2) = S_1(t_1)^{1-\rho} S_2(t_2)^{1-\rho} \exp\left\{-\rho \left[\left(-\ln S_1(t_1)\right)^{1/\gamma} + \left(-\ln S_2(t_2)\right)^{1/\gamma}\right]^{\gamma}\right\}.$$
It should be noted that a correlated positive stable frailty model similar to the
other correlated frailty models with explicit joint survival function does not
exist, because the moments of the positive stable distribution are all infinite.
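Consequently, no correlation between the frailties in a cluster exists, which
makes the interpretation of the parameter $\rho$ more difficult. The association
function becomes
$$\begin{aligned}
\varphi(t_1, t_2) &= \frac{1-\gamma}{\gamma}\, \mu_1(t_1) \mu_2(t_2) \left[\left(-\ln S_1(t_1)\right)^{1/\gamma} + \left(-\ln S_2(t_2)\right)^{1/\gamma}\right]^{\gamma-2} \\
&\quad \times \left(-\ln S_1(t_1)\right)^{1/\gamma-1} \left(-\ln S_2(t_2)\right)^{1/\gamma-1}.
\end{aligned}$$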
The same extension holds also in the more general case of the compound
Poisson model, including the gamma and the inverse Gaussian model as
special cases. Starting with the copula of the shared compound Poisson
frailty model (4.8), the copula of the correlated compound Poisson frailty
model (5.14) is obtained by applying relation (6.6):
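$$\begin{aligned}
\varphi(t_1, t_2) &= \sigma^2 \mu_1(t_1) \mu_2(t_2) \left[\left(1 - \frac{\gamma\sigma^2}{1-\gamma} \ln S_1(t_1)\right)^{1/\gamma} + \left(1 - \frac{\gamma\sigma^2}{1-\gamma} \ln S_2(t_2)\right)^{1/\gamma} - 1\right]^{\gamma-2} \\
&\quad \times \left(1 - \frac{\gamma\sigma^2}{1-\gamma} \ln S_1(t_1)\right)^{1/\gamma-1} \left(1 - \frac{\gamma\sigma^2}{1-\gamma} \ln S_2(t_2)\right)^{1/\gamma-1}.
\end{aligned}$$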
The following theorem extends the general copula model considered previously
to the case of negative ρ, resulting in negative association between survival
times in a cluster.
THEOREM 6.2
(Yashin and Iachine 1999a) Let the conditions of Theorem 6.1 hold. Then,
for any $\rho$ satisfying
$$\max_{t_1, t_2} \left(-\frac{\mu_1(t_1)\mu_2(t_2)}{\varphi(t_1, t_2)}\right) < \rho < 0, \tag{6.10}$$
the function $\tilde{S}(t_1, t_2)$ given by (6.6) determines a bivariate distribution of
negatively correlated survival times.
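$$\big((\rho - 1)\mu_1(t_1) - \rho\mu_1(t_1, t_2)\big) \big((\rho - 1)\mu_2(t_2) - \rho\mu_2(t_1, t_2)\big) \ge \mu_1(t_1)\mu_2(t_2) \ge 0$$
$$\tilde{S}_{t_1 t_2}(t_1, t_2) \ge \tilde{S}(t_1, t_2) \big(\mu_1(t_1)\mu_2(t_2) + \rho\varphi(t_1, t_2)\big)$$
$$\mathrm{cov}(\tilde{T}_1, \tilde{T}_2) = \int_0^\infty \int_0^\infty \tilde{S}(u, v)\, du\, dv - \int_0^\infty \int_0^\infty S_1(u) S_2(v)\, du\, dv.$$
So, if $A(u, v) > 0$, then the sign of $\mathrm{cov}(\tilde{T}_1, \tilde{T}_2)$ coincides with the sign of $\rho$.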
This completes the proof.
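$$\max_{t_1, t_2} \left(-\frac{\mu_1(t_1)\mu_2(t_2)}{\varphi(t_1, t_2)}\right) = \max_{t_1, t_2} \left(-\frac{\left(S_1(t_1)^{-\sigma^2} + S_2(t_2)^{-\sigma^2} - 1\right)^2}{\sigma^2 S_1(t_1)^{-\sigma^2} S_2(t_2)^{-\sigma^2}}\right) = -\frac{1}{\sigma^2}$$
by using (6.9). If $\sigma^2$ is smaller than one, the parameter $\rho$ can take on all
values between $-1$ and $1$. In the correlated positive stable copula, it turns out
that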
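$$\max_{t_1, t_2} \left(-\frac{\mu_1(t_1)\mu_2(t_2)}{\varphi(t_1, t_2)}\right) = \max_{t_1, t_2} \left(-\frac{\gamma \left[\left(-\ln S_1(t_1)\right)^{1/\gamma} + \left(-\ln S_2(t_2)\right)^{1/\gamma}\right]^{2-\gamma}}{(1-\gamma) \left(-\ln S_1(t_1)\right)^{1/\gamma-1} \left(-\ln S_2(t_2)\right)^{1/\gamma-1}}\right) = 0,$$
$$\max_{t_1, t_2} \left(-\frac{\mu_1(t_1)\mu_2(t_2)}{\varphi(t_1, t_2)}\right) = \max_{t_1, t_2} \left(-\frac{\left[\left(1 - \frac{\gamma\sigma^2}{1-\gamma} \ln S_1(t_1)\right)^{1/\gamma} + \left(1 - \frac{\gamma\sigma^2}{1-\gamma} \ln S_2(t_2)\right)^{1/\gamma} - 1\right]^{2-\gamma}}{\sigma^2 \left(1 - \frac{\gamma\sigma^2}{1-\gamma} \ln S_1(t_1)\right)^{1/\gamma-1} \left(1 - \frac{\gamma\sigma^2}{1-\gamma} \ln S_2(t_2)\right)^{1/\gamma-1}}\right) = -\frac{1}{\sigma^2},$$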
$$\theta(t_1, t_2) = \frac{\mu(t_1 | T_2 = t_2)}{\mu(t_1 | T_2 > t_2)}.$$
This expression has a nice interpretation. For twins, used as an example
throughout the last chapter, the ratio compares the hazard of the first partner
experiencing the event at time $t_1$, given that the second partner experiences
the event at time $t_2$, to the hazard of the first partner at time $t_1$, given that
the second partner experiences the event later than time $t_2$. It is often helpful
to write the cross-ratio function in terms of the survival function and its
derivatives (Oakes 1989):
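$$\begin{aligned}
S_{t_1}(t_1, t_2) S_{t_2}(t_1, t_2) &= S(t_1)^{-\rho} f(t_1) S(t_2)^{1-\rho} \left(S(t_1)^{-\sigma^2} + S(t_2)^{-\sigma^2} - 1\right)^{-\frac{\rho}{\sigma^2} - 1} \\
&\quad \times \left(-S(t_1)^{-\sigma^2} - (1-\rho) S(t_2)^{-\sigma^2} + (1-\rho)\right) \\
&\quad \times S(t_1)^{1-\rho} S(t_2)^{-\rho} f(t_2) \left(S(t_1)^{-\sigma^2} + S(t_2)^{-\sigma^2} - 1\right)^{-\frac{\rho}{\sigma^2} - 1} \\
&\quad \times \left(-(1-\rho) S(t_1)^{-\sigma^2} - S(t_2)^{-\sigma^2} + (1-\rho)\right) \\
&= S(t_1)^{-\rho} f(t_1) S(t_2)^{1-\rho} \left(S(t_1)^{-\sigma^2} + S(t_2)^{-\sigma^2} - 1\right)^{-\frac{\rho}{\sigma^2} - 1} \\
&\quad \times S(t_1)^{1-\rho} S(t_2)^{-\rho} f(t_2) \left(S(t_1)^{-\sigma^2} + S(t_2)^{-\sigma^2} - 1\right)^{-\frac{\rho}{\sigma^2} - 1} \\
&\quad \times \left((1-\rho)\left(S(t_1)^{-\sigma^2} + S(t_2)^{-\sigma^2} - 1\right) + \rho S(t_1)^{-\sigma^2}\right) \\
&\quad \times \left((1-\rho)\left(S(t_1)^{-\sigma^2} + S(t_2)^{-\sigma^2} - 1\right) + \rho S(t_2)^{-\sigma^2}\right).
\end{aligned}$$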
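Using the preceding expressions the cross-ratio function (6.11) of the correlated
gamma copula can be obtained as
$$\theta(t_1, t_2) = 1 + \frac{\rho\sigma^2 S(t_1)^{-\sigma^2} S(t_2)^{-\sigma^2}}{\left[(1-\rho)\left(S(t_2)^{-\sigma^2} - 1\right) + S(t_1)^{-\sigma^2}\right] \left[(1-\rho)\left(S(t_1)^{-\sigma^2} - 1\right) + S(t_2)^{-\sigma^2}\right]}$$
and
$$\lim_{t_1, t_2 \to \infty} \theta(t_1, t_2) = 1 + \frac{\rho\sigma^2}{(2-\rho)^2}.$$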
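$$\begin{aligned}
\theta(t_1, t_2) &= 1 + \frac{\rho\sigma^2 a(t_1)^{1/\gamma-1} \left(a(t_1)^{1/\gamma} + a(t_2)^{1/\gamma} - 1\right)^{\gamma}}{(1-\rho)\left(a(t_1)^{1/\gamma} + a(t_2)^{1/\gamma} - 1\right) + \rho a(t_1)^{1/\gamma-1} \left(a(t_1)^{1/\gamma} + a(t_2)^{1/\gamma} - 1\right)^{\gamma}} \\
&\quad \times \frac{a(t_2)^{1/\gamma-1}}{(1-\rho)\left(a(t_1)^{1/\gamma} + a(t_2)^{1/\gamma} - 1\right) + \rho a(t_2)^{1/\gamma-1} \left(a(t_1)^{1/\gamma} + a(t_2)^{1/\gamma} - 1\right)^{\gamma}}.
\end{aligned}$$
$$\theta(t_1, t_2) = 1 + \frac{\sigma^2}{\left(a(t_1)^{1/\gamma} + a(t_2)^{1/\gamma} - 1\right)^{\gamma}}$$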
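(see Duchateau and Janssen 2008). The gamma model is obtained as the limiting
case for $\gamma \to 0$. It turns out that $a(t_j)^{1/\gamma-1} \to S(t_j)^{-\sigma^2}$ and $a(t_j)^{1/\gamma} \to S(t_j)^{-\sigma^2}$
$(j = 1, 2)$. Furthermore, it holds that $\left(a(t_1)^{1/\gamma} + a(t_2)^{1/\gamma} - 1\right)^{\gamma} \to 1$. Consequently,
$$\begin{aligned}
\theta(t_1, t_2) &= 1 + \frac{\rho\sigma^2 a(t_1)^{1/\gamma-1} \left(a(t_1)^{1/\gamma} + a(t_2)^{1/\gamma} - 1\right)^{\gamma}}{(1-\rho)\left(a(t_1)^{1/\gamma} + a(t_2)^{1/\gamma} - 1\right) + \rho a(t_1)^{1/\gamma-1} \left(a(t_1)^{1/\gamma} + a(t_2)^{1/\gamma} - 1\right)^{\gamma}} \\
&\quad \times \frac{a(t_2)^{1/\gamma-1}}{(1-\rho)\left(a(t_1)^{1/\gamma} + a(t_2)^{1/\gamma} - 1\right) + \rho a(t_2)^{1/\gamma-1} \left(a(t_1)^{1/\gamma} + a(t_2)^{1/\gamma} - 1\right)^{\gamma}} \\
&\to 1 + \frac{\rho\sigma^2 S(t_1)^{-\sigma^2}}{(1-\rho)\left(S(t_1)^{-\sigma^2} + S(t_2)^{-\sigma^2} - 1\right) + \rho S(t_1)^{-\sigma^2}} \\
&\quad \times \frac{S(t_2)^{-\sigma^2}}{(1-\rho)\left(S(t_1)^{-\sigma^2} + S(t_2)^{-\sigma^2} - 1\right) + \rho S(t_2)^{-\sigma^2}} \\
&= 1 + \frac{\rho\sigma^2 S(t_1)^{-\sigma^2} S(t_2)^{-\sigma^2}}{\left[(1-\rho)\left(S(t_2)^{-\sigma^2} - 1\right) + S(t_1)^{-\sigma^2}\right] \left[(1-\rho)\left(S(t_1)^{-\sigma^2} - 1\right) + S(t_2)^{-\sigma^2}\right]}.
\end{aligned}$$
The cross-ratio function at the beginning of the follow-up for both individuals
is $\theta(0, 0) = 1 + \rho\sigma^2$. In the case $\rho = 0$, the two lifetimes are independent.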
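The cross-ratio formulas of this section are easy to evaluate numerically. The sketch below implements the correlated gamma and the correlated compound Poisson versions as functions of marginal survival probabilities. It assumes $a(t) = 1 - \frac{\gamma\sigma^2}{1-\gamma} \ln S(t)$, which is the expression appearing in the association function of the compound Poisson copula; the function names are illustrative.

```python
import math


def cross_ratio_gamma(s1, s2, var, rho):
    """Cross-ratio function of the correlated gamma copula."""
    a, b = s1 ** -var, s2 ** -var
    denominator = (((1.0 - rho) * (b - 1.0) + a)
                   * ((1.0 - rho) * (a - 1.0) + b))
    return 1.0 + rho * var * a * b / denominator


def cross_ratio_compound_poisson(s1, s2, var, rho, gamma):
    """Cross-ratio function of the correlated compound Poisson copula."""
    # Assumed definition of a(t) in terms of the marginal survival function.
    a1 = 1.0 - gamma * var / (1.0 - gamma) * math.log(s1)
    a2 = 1.0 - gamma * var / (1.0 - gamma) * math.log(s2)
    total = a1 ** (1.0 / gamma) + a2 ** (1.0 / gamma) - 1.0
    p1 = a1 ** (1.0 / gamma - 1.0)
    p2 = a2 ** (1.0 / gamma - 1.0)
    scaled = total ** gamma
    numerator = rho * var * p1 * p2 * scaled
    denominator = (((1.0 - rho) * total + rho * p1 * scaled)
                   * ((1.0 - rho) * total + rho * p2 * scaled))
    return 1.0 + numerator / denominator


theta_start = cross_ratio_gamma(1.0, 1.0, 0.5, 0.3)
theta_limit = cross_ratio_compound_poisson(0.7, 0.4, 0.5, 0.3, 1e-6)
```

At the start of follow-up the gamma version returns $1 + \rho\sigma^2$, for $\rho = 1$ it is constant and equal to $1 + \sigma^2$, and for a very small $\gamma$ the compound Poisson version is numerically indistinguishable from the gamma version.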
Chapter 7
Different Aspects of Frailty Modeling
It turns out that such models can be identified from bivariate (multivariate)
data when they are embedded into a frailty model by using the relation
$$\sigma(X) = \sigma e^{\gamma X},$$
with $W_i$ denoting the additional random treatment effects specific to the $i$th
center, assumed to follow a normal distribution with zero expectation and
some unknown variance. The proposed MCEM algorithm allows estimating
the random effects, which could be, for example, used to rank the centers with
respect to their treatment success.
Some research efforts have been undertaken to study extensions of the shared
frailty approach to more general models allowing, for example, for interaction
between treatment and center in a multicenter clinical trial (Yamaguchi and
Ohashi 1999, Vaida and Xu 2000, Legrand et al. 2005, 2006, Massonnet et
al. 2008) or interaction between treatment and study in a meta-analysis with
event time outcome (Rondeau et al. 2008). The hazard function in this model
is
$$\mu(t | X_{ij}, W_i) = \mu_0(t)\, e^{\beta' X_{ij} + W_{0i} + W_{1i} X_{ij1}}, \tag{7.3}$$
with $W_i' = (W_{0i}, W_{1i})$, where $W_{0i}$ denotes the shared random effect and
$W_{1i} X_{ij1}$ the random interaction term with covariate $X_{ij1}$, which could be, for
example, a binary treatment variable. Here, the covariate $X_{ij1}$ is usually a
component of the vector $X_{ij}$. The random effects $W_{0i}$ and $W_{1i}$ are assumed
to follow a bivariate Gaussian distribution with
$$\begin{pmatrix} W_{0i} \\ W_{1i} \end{pmatrix} \sim N\left(\begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} \sigma_0^2 & \rho\sigma_0\sigma_1 \\ \rho\sigma_0\sigma_1 & \sigma_1^2 \end{pmatrix}\right).$$
There are two sources of heterogeneity between clusters included in the model.
First, heterogeneity between clusters may arises from the treatment itself,
meaning the treatment is stronger in some clusters than in others. Such
treatment by center interaction may reflect unobserved differences in patient
characteristics and in implementation of the study protocol. This is accounted
for by the random interaction between treatment and cluster. Second, the
variation in outcomes between clusters may be attributed to differences in the
baseline hazards reflecting, for example, differences in medical practices or in
patient populations. This kind of heterogeneity can be accounted for by the
shared frailty term in the random effects model. A similar model based on
the AFT approach can be found in Komárek et al. (2007).
This model was considered with independent random effects by several
authors. It can be estimated by using an extension of the REML approach
suggested by McGilchrist (1993) to accommodate two random effects
(Yamaguchi and Ohashi 1999). Vaida and Xu (2000) used a Monte Carlo
EM algorithm with MCMC sampling in the E step. Legrand et al. (2005,
2006) proposed a Bayesian approach.
Ripatti and Palmgren (2000) used the penalized partial likelihood approach
combined with a Laplace approximation in a model with dependent random
effects. Massonnet et al. (2008) applied a transformation technique to estimate
the parameters in this model. Rondeau et al. (2008) suggested a penalized
marginal likelihood method, resulting in smoothed estimates of the hazard
function. Allowing for a possible correlation between the two random effects,
the heterogeneity of the baseline hazard between clusters may depend on the
heterogeneity of the treatment effect. This seems to be reasonable because the
treatment effect is usually expected to be larger in clusters with high baseline
hazard, and vice versa.
\[ \mu(t \mid X_{ij}, W_i) = \mu_0(t)\, e^{\beta' X_{ij} + Y_{ij}' W_i}, \tag{7.4} \]
where Xij and Yij denote the covariate vectors for the fixed and random
effects. Wi is the random effect from the ith cluster. The random effects are
assumed to follow a Gaussian distribution with expectation vector zero and
some covariance matrix. Vaida and Xu (2000) considered the case of diagonal
covariance matrices, resulting in independent random effects. In (7.4) Yij is
often a subset of Xij , apart from possibly a "1", which represents the cluster
effect on the baseline. For a better understanding, assume for a moment that
$Y_{ij}' = (1, X_{ij})$ and $W_i' = (W_{0i}, W_{1i})$. Each component of W1i represents
the interaction between the cluster and the respective covariate, while the
corresponding element of the parameter vector β represents the main effect
of the covariate. W0i is the random effect shared by all individuals in the ith
cluster. Vaida and Xu (2000) obtain maximum likelihood estimates of the
regression parameters, the variances of the random effects, and the baseline
hazard function via a modified EM algorithm. In the E step, MCMC methods
are used for the calculation of conditional expectations of functions of the
random effects. Ripatti and Palmgren (2000) used Laplace approximation
of the likelihood function in combination with a penalized partial likelihood
approach, where the marginal distribution of the random effects determines
the penalty term.
The flexibility of the above model is demonstrated best by discussing some
of its interesting submodels. The univariate model by Li et al. (2002) given
in (7.1) with parameter specification α = 1 is a special case of this model with
cluster size one, $Y_{ij}' = (1, X_i)$, $W_i' = (W_i, W_i)$, and Xij = Xi as a binary
covariate. The univariate log-normal frailty model with general covariates is
obtained with cluster size one, Yij = 1, and a one-dimensional random effect
Wi . Furthermore, the model considered by Xu (2004) and specified in (7.2)
is a special case with Yij = Xij , Wi as a one-dimensional random effect,
and Xij as a binary variable. The above model simplifies to the shared log-
normal frailty model with Yij = 1 and a one-dimensional random effect Wi .
Furthermore, the interaction model (7.3) considered in the last section is a
submodel via $Y_{ij}' = (1, X_{ij1})$ and $W_i' = (W_{0i}, W_{1i})$.
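The way submodels arise from (7.4) can be sketched in a few lines of Python. The helper `log_relative_hazard` and all numeric values are hypothetical; the sketch only evaluates the exponent β′Xij + Yij′Wi for two choices of Yij mentioned above.

```python
def log_relative_hazard(beta, x, y, w):
    """Exponent of model (7.4): beta'x for the fixed effects plus y'w for
    the random effects."""
    fixed = sum(b * v for b, v in zip(beta, x))
    random_part = sum(yi * wi for yi, wi in zip(y, w))
    return fixed + random_part


beta = [0.5, -0.2]   # illustrative regression parameters
x = [1.0, 3.0]       # x[0] is a binary treatment indicator

# Shared log-normal frailty model: Y_ij = 1 and a one-dimensional W_i.
shared = log_relative_hazard(beta, x, [1.0], [0.4])

# Interaction model (7.3): Y_ij' = (1, X_ij1) and W_i' = (W_0i, W_1i).
interaction = log_relative_hazard(beta, x, [1.0, x[0]], [0.4, -0.1])
```

Changing only the design vector `y` switches between submodels, which is the sense in which (7.4) contains them.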
In the following section, nested (hierarchical) frailty models are considered.
It is necessary to note that nested frailty models in general are not included
in model class (7.4).
\[ \mu(t \mid X_{ijk}, Z_i, Z_{ij}) = Z_i Z_{ij}\, \mu_0(t)\, e^{\beta' X_{ijk}}, \tag{7.5} \]
where Xijk denotes the covariate vector of the kth individual in the jth sub-
cluster belonging to the ith cluster. Frailty Zi denotes the cluster-specific
random effect, and Zij is the subcluster-specific random effect. Both Zi and
Zij are independent gamma-distributed random variables with mean one and
variances $\sigma_1^2$ and $\sigma_2^2$, respectively.
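A small simulation may help to picture the two frailty levels in (7.5). The following Python sketch draws mean-one gamma frailties with given variances and multiplies the cluster and subcluster terms. The function names, the variances, and the seed are assumptions made for illustration.

```python
import random


def gamma_frailty(variance, rng):
    """Gamma variate with mean one and the given variance
    (shape 1/variance, scale variance)."""
    return rng.gammavariate(1.0 / variance, variance)


def simulate_nested_frailties(n_clusters, n_subclusters, var1, var2, seed=0):
    """Return {(i, j): Z_i * Z_ij}, the multiplicative frailty acting on all
    individuals of subcluster j in cluster i."""
    rng = random.Random(seed)
    combined = {}
    for i in range(n_clusters):
        z_i = gamma_frailty(var1, rng)        # cluster-specific frailty
        for j in range(n_subclusters):
            z_ij = gamma_frailty(var2, rng)   # subcluster-specific frailty
            combined[(i, j)] = z_i * z_ij
    return combined


frailties = simulate_nested_frailties(2000, 5, var1=0.3, var2=0.2)
mean_frailty = sum(frailties.values()) / len(frailties)  # close to one
```

Because Zi and Zij are independent with mean one, the product also has mean one, so the baseline hazard keeps its interpretation.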
Nested frailty models account for the hierarchical clustering by including
cluster-level-specific random effects. However, the extension of the estimation
procedures of the shared frailty model to the case of two or more frailty
terms is difficult. Sastry (1997) first adapted the EM algorithm to the nested
frailty model. Manda (2001) used Bayesian methods in a nested gamma
frailty model with piecewise constant baseline hazard. Rondeau et al. (2006)
applied a semiparametric maximum penalized likelihood estimation procedure
to model (7.5), which provides as a by-product a smooth estimator of the
hazard function.
More complicated nested frailty models allowing for dependence between
the cluster and subcluster-specific frailties are considered by Ma et al. (2003)
using a Poisson modeling approach, and by Shih and Lu (2009). Yau (2001)
suggested a nested log-normal frailty model of the type
\[ \mu(t \mid X_{ijk}, W_i, W_{ij}) = \mu_0(t)\, e^{\beta' X_{ijk} + W_i + W_{ij}} \tag{7.6} \]
The model introduced by Prentice et al. (1981) measures the conditional risk
of experiencing an event. An individual is only at risk for an event from the
occurrence of the previous event until the occurrence of this event or censoring.
Therefore, an event-specific baseline hazard is assumed. In general, the model
by Prentice et al. (1981) approximates the real situation best.
In the models by Prentice et al. (1981) and Wei et al. (1989), within-subject
correlation is accounted for through stratification of the baseline hazard. In
the Andersen–Gill model, independent increments are assumed. All models
can be expanded to allow for between-subject correlation. As proposed by
McGilchrist (1991), a subject-specific frailty term can be used in the hazard.
For all observations belonging to one subject, the same frailty term is used.
The frailty variables of different subjects are independent realizations of a
common frailty distribution.
Duchateau et al. (2003) applied different parametric and nonparametric
models, both with and without frailty, to recurrent asthma events from an
asthma prevention trial in young children. One problem, however, is the
decision about the time scale to use. In the gap-time approach, only the time
since the occurrence of the previous event is considered. An alternative time scale
is total time, measuring the time from the beginning of follow-up, disregarding
other events having occurred meanwhile.
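The difference between the two time scales can be made concrete with a short Python sketch. It assumes one subject with event times recorded from the start of follow-up; the function name and the record layout `(time, status)` are hypothetical choices for illustration.

```python
def time_scales(event_times, end_of_follow_up):
    """Return (total_time, gap_time) records for one subject.

    Each record is (time, status): status 1 marks an observed event,
    status 0 the censored final interval."""
    total_time, gap_time = [], []
    previous = 0.0
    for t in sorted(event_times):
        total_time.append((t, 1))            # measured from start of follow-up
        gap_time.append((t - previous, 1))   # measured from the previous event
        previous = t
    if end_of_follow_up > previous:
        total_time.append((end_of_follow_up, 0))
        gap_time.append((end_of_follow_up - previous, 0))
    return total_time, gap_time


total_time, gap_time = time_scales([2.0, 5.0], 9.0)
```

For events at times 2 and 5 with follow-up ending at 9, the total-time records are 2, 5, 9 and the gap-time records are 2, 3, 4.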
Frailty models specially designed for recurrence data are considered in detail
by Aalen et al. (1995), McGilchrist and Yau (1996), Yau and McGilchrist
(1998), and Manda and Meyer (2005). An overview in the field is given in
Duchateau et al. (2003) and Lim et al. (2007).
The observation of recurrent event times can be terminated by loss
to follow-up, end of study, or a terminal event such as death. In the analysis
of recurrent event times discussed above, the assumption of noninformative
censoring of the recurrent event process by death is made, which can be
violated. For example, the recurrence of serious events, such as tumors or
opportunistic infections, is often associated with an increased risk of death.
To circumvent this problem, joint frailty models provide a useful alternative.
Here, the recurrent event hazard and the death hazard are modeled separately
but share some random effect, causing dependence between the recurrent event
times and the lifetime. The model can be specified by the hazard functions
\[ \mu_1(t \mid X_{i1}, Z_i) = Z_i\, \mu_{01}(t)\, e^{\beta_1' X_{i1}} \tag{7.7} \]
Jeong (2003) extends the foregoing results to visualize and quantify the loss of
efficiency of the log-rank test when a dependence structure between survival
and censoring times is being ignored. The assumed dependence structure is
based on a correlated gamma frailty model (Section 5.2). In the given situation
the loss of efficiency is minimal under the proportional hazards model, even
when the correlation between potential survival and censoring times is strong,
unless the dependent censoring causes a severe nonproportionality.
Broët et al. (1999) suggest a more practical approach for taking into account
unobserved covariates in a weighted log-rank test. The construction of the test
is based on a gamma distribution of the frailty. For the frailty parameter $\sigma^2$,
a range is assumed. Simulations investigate the power of the test for different
frailty distributions.
Rank tests for clustered event time data when dependent subunits are
randomized are considered by Jeong and Jung (2006).
Li et al. (2002) arrive at the conclusion that the log-rank test is nearly
fully efficient relative to the optimally weighted log-rank test if the unobserved
heterogeneity does not interact with the binary treatment variable. This is
not the case if frailty interacts with treatment. Such a situation could arise,
for example, if unobserved heterogeneity is caused by genetic factors, which
interact with treatment.
In a second approach, McGilchrist and Yau (1996), and Yau and McGilchrist
(1998), focus on recurrent event time data. Often interrecurrence times close
to each other on the time scale are highly associated, while times that are
further apart from each other on the time scale are less correlated. To model
such kinds of serial dependence, a dynamic frailty model can be constructed
by assuming that the frailties of subsequent time intervals follow an autore-
gressive process of order one, denoted by AR(1). Yau and McGilchrist (1998)
adopted AR(1) frailty models to analyze chronic granulomatous disease data.
A similar but simpler idea without using an AR(1) process was suggested by
Wintrebert et al. (2004). A Bayesian approach to the problem can be found
in the paper by Manda and Meyer (2005).
The third approach is based on Lévy processes (Gjessing et al. 2003,
Aalen et al. 2008). Like diffusion processes, hazard functions driven by Lévy
processes also yield some degree of tractability. It is possible to get explicit
formulas for the relationship between conditional and unconditional hazard,
and the frailty and hazard of survivors may be estimated. A basic difference
between this and the first approach is the jump nature of the Lévy processes.
However, it could well be imagined that the individual hazard may increase
in jumps, for example, with the onset of an acute disease. The model is an
extension of Lévy frailty models with fixed frailty in Aalen and Hjort (2002),
discussed in Section 3.10.
Example 7.1
Gamma processes: Let the value of the process at time point u be gamma
distributed with shape parameter ku and scale parameter λ. Then it holds
that
\[ L(s; u) = \Bigl(1 + \frac{s}{\lambda}\Bigr)^{-ku} = e^{-ku\,(\ln(\lambda+s) - \ln\lambda)}, \qquad k, \lambda > 0, \]
with $\psi(s) = k\,(\ln(\lambda + s) - \ln\lambda)$ as characteristic exponent (Section 3.6).
Example 7.2
Standard compound Poisson processes: A Poisson process of rate ρ is
running on time scale u, and to each jump there is a gamma random variable,
independent of the past, with shape parameter k and scale parameter λ. The
compound Poisson process is the sum of the gamma variables up to time u.
The Laplace transform of the random value of the process at time u is given
by
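\[ L(s; u) = \exp\Bigl\{-u\,\frac{k}{\gamma}\bigl((\lambda+s)^{\gamma} - \lambda^{\gamma}\bigr)\Bigr\}, \qquad \gamma < 0. \]
Hence, the characteristic exponent is given by $\psi(s) = \frac{k}{\gamma}\bigl((\lambda + s)^{\gamma} - \lambda^{\gamma}\bigr)$.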
Figure 7.1: Time-varying frailties of two relatives from family i in two time
intervals in the Paik model.
Example 7.3
Stable processes: The Laplace transform of a stable distributed random
variable takes the form $L(s; u) = \exp\{-u\,\frac{k}{\gamma}\,s^{\gamma}\}$. Hence, $\psi(s) = \frac{k}{\gamma}\,s^{\gamma}$.
Example 7.4
PVF processes: The power variance function distributions constitute a
general class of distributions. A class of Lévy processes may be defined from
the PVF distributions by $\psi(s) = \frac{k}{\gamma}\bigl((\lambda + s)^{\gamma} - \lambda^{\gamma}\bigr)$ (γ > 0). The special case
γ = 0 is defined by continuity and gives the gamma process. For γ < 0, the
model yields the standard compound Poisson process.
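The relations between the characteristic exponents of Examples 7.1 to 7.4 can be checked numerically. The Python sketch below implements the exponents as plain functions (the function names and parameter values are illustrative assumptions) and compares the PVF exponent for a very small γ with the gamma exponent, and the PVF exponent for λ = 0 with the stable exponent.

```python
import math


def psi_gamma(s, k, lam):
    """Characteristic exponent of the gamma process (Example 7.1)."""
    return k * (math.log(lam + s) - math.log(lam))


def psi_pvf(s, k, lam, gamma):
    """Characteristic exponent of the PVF family (Example 7.4), gamma != 0."""
    return (k / gamma) * ((lam + s) ** gamma - lam ** gamma)


def psi_stable(s, k, gamma):
    """Characteristic exponent of the stable process (Example 7.3)."""
    return (k / gamma) * s ** gamma


def laplace(psi_value, u):
    """Laplace transform L(s; u) = exp(-u * psi(s))."""
    return math.exp(-u * psi_value)


s, k, lam, u = 1.0, 2.0, 1.5, 3.0
# PVF with gamma close to zero approaches the gamma process.
gap_to_gamma = abs(psi_pvf(s, k, lam, 1e-6) - psi_gamma(s, k, lam))
# PVF with lambda = 0 coincides with the stable exponent.
gap_to_stable = abs(psi_pvf(s, k, 0.0, 0.5) - psi_stable(s, k, 0.5))
# The gamma Laplace transform agrees with the closed form (1 + s/lambda)^(-k u).
gap_closed_form = abs(laplace(psi_gamma(s, k, lam), u) - (1.0 + s / lam) ** (-k * u))
```

All three gaps should be numerically negligible, which illustrates the "defined by continuity" statement for γ = 0.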
Paik et al. (1994) suggest another extension of the correlated gamma frailty
model. They consider the case of clustered individuals but with varying
frailties over time. For this purpose, the time scale is divided into intervals.
The approach uses a decomposition of the frailty in each interval into a shared
and a unique part, for example, Zijl = Yi + Yijl with independent random
variables Yi ∼ Γ(k1 , λ1 ) and Yijl ∼ Γ(k2 , λ2 ). In the correlated gamma frailty
model, the assumption λ1 = λ2 is essential because it yields the gamma
distribution of Z1 and Z2 . Paik et al. (1994) allow the λ’s to differ which,
on the one hand, complicates the computation of the likelihood function, but
on the other hand, it generates a more flexible model. In both models the
expectation of the frailties is restricted to one. The main feature of the model
in Paik et al. (1994) is to allow the frailty (and, consequently, the dependence
function) to vary over time intervals. Restricting the model to two related
individuals with two recurrent observation times each gives the structure of
frailties shown in Figure 7.1.
Note that Paik et al. (1994) considered a noncompeting risk situation,
which means it is assumed that all lifetimes are observable (with independent
censoring). Different extensions of the Paik model are discussed in Wintrebert
et al. (2004) and Wintrebert (2007).
\[ S^*(t \mid X_1, X_2) = 1 - \phi(X_2) + \phi(X_2)\, L\bigl(M_0(t)\, e^{\beta' X_1}\bigr). \tag{7.9} \]
Peng and Zhang (2008b) establish mild conditions that imply identifiability of
the model. Here, L again denotes the Laplace transform of a frailty variable.
Two kinds of covariates are considered: X2 influences the probability to be
cured, whereas X1 influences the event time in susceptible individuals. The
identifiability of the frailty model is obtained as a special case but with a
much simpler proof compared to the one in Elbers and Ridder (1982). The
above frailty cure model with covariates in the cure fraction and the survival
function of the noncured population is identifiable if the cure fraction φ(X2 ) is
nonconstant and some technical assumptions, such as a finite mean of the frailty
distribution are fulfilled. It turns out that the case of identical covariates
X1 = X2 needs additional assumptions and is therefore treated separately by
Peng and Zhang (2008b).
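To see how the two covariate vectors enter (7.9), consider the following Python sketch. It assumes a gamma frailty with mean one (so that L(s) = (1 + σ²s)^(−1/σ²)) and a logistic form for φ(X2); both choices, as well as the names `population_survival` and `unit_rate`, are illustrative assumptions and are not prescribed by the text.

```python
import math


def gamma_frailty_laplace(s, sigma2):
    """Laplace transform of a gamma frailty with mean one and variance sigma2."""
    return (1.0 + sigma2 * s) ** (-1.0 / sigma2)


def logistic(eta):
    return 1.0 / (1.0 + math.exp(-eta))


def population_survival(t, x1, x2, beta, alpha, sigma2, cumulative_baseline):
    """Survival function (7.9): X2 acts on phi, X1 acts on the event time."""
    phi = logistic(sum(a * v for a, v in zip(alpha, x2)))  # assumed link for phi(X2)
    s = cumulative_baseline(t) * math.exp(sum(b * v for b, v in zip(beta, x1)))
    return 1.0 - phi + phi * gamma_frailty_laplace(s, sigma2)


def unit_rate(t):
    """Hypothetical cumulative baseline hazard M0(t) = t."""
    return t


start = population_survival(0.0, [1.0], [1.0], [0.3], [0.0], 1.0, unit_rate)
plateau = population_survival(1e12, [1.0], [1.0], [0.3], [0.0], 1.0, unit_rate)
```

The survival function starts at one and levels off at 1 − φ(X2) for large t, which is the plateau that characterizes cure models.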
7.10.1 R packages
R is a free software environment for statistical computing and graphics. It
runs on a wide variety of UNIX platforms, Windows, and MacOS and has
become more and more attractive to statisticians as well as to nonstatisticians.
The software can be accessed at http://cran.R-project.org.
FRAILTYPACK: This package by Rondeau and González (2005) was
mainly extended by Rondeau et al. (2010). It fits semiparametric gamma
frailty models. It also provides parameter estimation in more complex settings
such as the interaction model (7.3) with two dependent Gaussian random
effects. Also, the nested frailty model (7.5) with two gamma-distributed
frailties can be fitted. Furthermore, the joint modeling of recurrent events and
a terminal event by a shared gamma frailty (7.7-7.8) is implemented in the
package. It accommodates left-truncated and right-censored data. Stratified
analysis with two strata is possible. FRAILTYPACK uses the penalization
of the hazard function described by (3.34), applying the robust Marquardt
algorithm, which is a combination of a Newton–Raphson and a steepest-descent
algorithm. As a consequence of this estimation procedure, a smooth
estimate of the hazard function is provided.
COXME: The procedure COXME was created by Therneau and works
with the penalized partial likelihood algorithm. It provides estimation in
the general Cox proportional hazards model with Gaussian random effects
described by model (7.4). Furthermore, COXME is able to handle nested
frailty models (7.5). The package was originally a function in the kinship
package in recognition of the fact that it was primarily targeted toward genetic
problems. COXME was also distributed as part of the base survival package in
S-Plus. The current version from 2009 is broader in its capabilities. It turns
out to be useful to modify the convergence criterion eps from $10^{-6}$ to $10^{-9}$ and
the maximum iteration number for the partial likelihood procedure iter.max
from 20 to 40. This increases the precision of the estimation procedure to a
reasonable size but at the cost of computer time.
SURVIVAL: This package was also written by Therneau for survival data.
One function of the package is COXPH. It fits the proportional hazards
model and is able to deal with time-dependent covariates and strata, multiple
events per subject, and other extensions. These extensions include
univariate and shared frailty models. The syntax is similar to that of the
COXME procedure. The frailty distributions gamma, log-t, and log-normal
are supported (Therneau et al. 2003). Similar to the COXME procedure,
modification of the parameters eps and iter.max is recommended.
PHMM: This package stems from Donohue and Ronghui in 2009/2010.
It fits the proportional hazards model incorporating Gaussian random effects
given in (7.4). The estimation procedure is based on the EM algorithm using
Markov Chain Monte Carlo methods in the E step. The software provides no
convergence criterion; the number of iterations has to be fixed.
To make R and SAS procedures comparable, the option ties = "breslow" for
the handling of ties should be used in all R procedures. Furthermore, despite
the wide range of models covered, the R packages do not provide standard
errors of the random effects variance estimates, which is a major drawback.
An exception is the recent version of FRAILTYPACK.
Appendix A
Appendix
The aim of this appendix is to give some statistical key results that are
important for the field of frailty models. These results are more technical
and, for that reason, moved to the appendix to allow a fluent reading of
the main parts of the book. The next part deals with bivariate event-time
data. The bivariate case is used here to illustrate the main ideas of correlated
frailty models in this simple situation. It turns out that the likelihood of
the survival data can be represented in terms of the survival function and its
partial derivatives. A model specification in terms of the survival function
eases the computational burden of handling censoring and truncation.
\[
= \int_{t_2}^{\infty}\!\int_{t_1}^{\infty} f(t_1^*, t_2^*) \int_{t_2^*}^{\infty}\!\int_{t_1^*}^{\infty} g(c_1, c_2)\, dc_1\, dc_2\, dt_1^*\, dt_2^*
\]
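\[
= \int_{t_2}^{\infty}\!\int_{t_1}^{\infty} \Bigl[\int_{c_2}^{\infty} f(t_1^*, t_2^*)\, dt_2^*\Bigr] \Bigl[\int_{t_1^*}^{\infty} g(c_1, c_2)\, dc_1\Bigr] dt_1^*\, dc_2 .
\]
Because of $-S_{t_1}(t_1, t_2) = \int_{t_2}^{\infty} f(t_1, t_2^*)\, dt_2^*$ it holds that
\[
h(t_1, t_2, 1, 0) = H_{t_1,t_2}(t_1, t_2, 1, 0) = -S_{t_1}(t_1, t_2) \int_{t_1}^{\infty} g(c_1, t_2)\, dc_1 .
\]
\[
= \int_{t_2}^{\infty}\!\int_{t_1}^{\infty} \Bigl[\int_{c_1}^{\infty} f(t_1^*, t_2^*)\, dt_1^*\Bigr] \Bigl[\int_{t_2^*}^{\infty} g(c_1, c_2)\, dc_2\Bigr] dc_1\, dt_2^* .
\]
Hence,
\[
h(t_1, t_2, 0, 1) = H_{t_1,t_2}(t_1, t_2, 0, 1) = -S_{t_2}(t_1, t_2) \int_{t_2}^{\infty} g(t_1, c_2)\, dc_2 .
\]
\[
= \int_{t_2}^{\infty}\!\int_{t_1}^{\infty} \Bigl[\int_{c_2}^{\infty}\!\int_{c_1}^{\infty} f(t_1^*, t_2^*)\, dt_1^*\, dt_2^*\Bigr] g(c_1, c_2)\, dc_1\, dc_2 .
\]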
Consequently,
\[
S_{t_1 t_2}(t_1, t_2)^{\delta_1\delta_2}\; S_{t_1}(t_1, t_2)^{\delta_1(1-\delta_2)}\; S_{t_2}(t_1, t_2)^{(1-\delta_1)\delta_2}\; S(t_1, t_2)^{(1-\delta_1)(1-\delta_2)} . \tag{A.1}
\]
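\[
\prod_{i=1}^{n} S_{t_{i1} t_{i2}}(t_{i1}, t_{i2})^{\delta_{i1}\delta_{i2}}\; S_{t_{i1}}(t_{i1}, t_{i2})^{\delta_{i1}(1-\delta_{i2})}\; S_{t_{i2}}(t_{i1}, t_{i2})^{(1-\delta_{i1})\delta_{i2}}\; S(t_{i1}, t_{i2})^{(1-\delta_{i1})(1-\delta_{i2})} \tag{A.2}
\]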
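The structure of the bivariate likelihood can be sketched in Python as a function that picks the relevant term for each censoring pattern. Since the first-order partial derivatives of a survival function are negative, the sketch uses their negatives, in line with the expressions for h(t1, t2, 1, 0) and h(t1, t2, 0, 1); this sign handling, the function names, and the independent exponential example are assumptions made for illustration.

```python
import math


def log_contribution(S, S_t1, S_t2, S_t1t2, d1, d2):
    """Log-likelihood contribution of one pair, given the survival function
    and its partial derivatives evaluated at (t1, t2)."""
    if d1 == 1 and d2 == 1:
        value = S_t1t2      # both events observed
    elif d1 == 1:
        value = -S_t1       # first observed, second censored
    elif d2 == 1:
        value = -S_t2       # first censored, second observed
    else:
        value = S           # both censored
    return math.log(value)


def log_likelihood(pairs):
    """Sum of contributions; each pair is (S, S_t1, S_t2, S_t1t2, d1, d2)."""
    return sum(log_contribution(*p) for p in pairs)


# Hypothetical check: independent unit exponentials, S(t1, t2) = exp(-t1 - t2),
# so S_t1 = S_t2 = -S and S_t1t2 = S.
t1, t2 = 0.5, 1.2
S = math.exp(-t1 - t2)
contributions = [log_contribution(S, -S, -S, S, d1, d2)
                 for d1 in (0, 1) for d2 in (0, 1)]
```

In this toy case every censoring pattern yields the same contribution, −(t1 + t2), because the unit exponential has hazard one.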
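\[
S(t_1, t_2) = S_1(t_1)^{1-\frac{\sigma_1}{\sigma_2}\rho}\; S_2(t_2)^{1-\frac{\sigma_2}{\sigma_1}\rho}\; \bigl(S_1(t_1)^{-\sigma_1^2} + S_2(t_2)^{-\sigma_2^2} - 1\bigr)^{-\frac{\rho}{\sigma_1\sigma_2}} .
\]
Writing $A = S_1(t_1)^{-\sigma_1^2} + S_2(t_2)^{-\sigma_2^2} - 1$ for the term in parentheses, the first derivatives are
\[
\begin{aligned}
S_{t_1}(t_1, t_2) ={}& -\Bigl(1-\frac{\sigma_1}{\sigma_2}\rho\Bigr) S_1(t_1)^{-\frac{\sigma_1}{\sigma_2}\rho} f_1(t_1)\, S_2(t_2)^{1-\frac{\sigma_2}{\sigma_1}\rho} A^{-\frac{\rho}{\sigma_1\sigma_2}} \\
& - \frac{\sigma_1}{\sigma_2}\rho\, S_1(t_1)^{-\frac{\sigma_1}{\sigma_2}\rho-\sigma_1^2} f_1(t_1)\, S_2(t_2)^{1-\frac{\sigma_2}{\sigma_1}\rho} A^{-\frac{\rho}{\sigma_1\sigma_2}-1} \\
={}& S_1(t_1)^{-\frac{\sigma_1}{\sigma_2}\rho} f_1(t_1)\, S_2(t_2)^{1-\frac{\sigma_2}{\sigma_1}\rho} A^{-\frac{\rho}{\sigma_1\sigma_2}-1} \\
& \times \Bigl(-S_1(t_1)^{-\sigma_1^2} - \Bigl(1-\frac{\sigma_1}{\sigma_2}\rho\Bigr) S_2(t_2)^{-\sigma_2^2} + \Bigl(1-\frac{\sigma_1}{\sigma_2}\rho\Bigr)\Bigr),
\end{aligned}
\]
\[
\begin{aligned}
S_{t_2}(t_1, t_2) ={}& -S_1(t_1)^{1-\frac{\sigma_1}{\sigma_2}\rho} \Bigl(1-\frac{\sigma_2}{\sigma_1}\rho\Bigr) S_2(t_2)^{-\frac{\sigma_2}{\sigma_1}\rho} f_2(t_2)\, A^{-\frac{\rho}{\sigma_1\sigma_2}} \\
& - S_1(t_1)^{1-\frac{\sigma_1}{\sigma_2}\rho}\, \frac{\sigma_2}{\sigma_1}\rho\, S_2(t_2)^{-\frac{\sigma_2}{\sigma_1}\rho-\sigma_2^2} f_2(t_2)\, A^{-\frac{\rho}{\sigma_1\sigma_2}-1} \\
={}& S_1(t_1)^{1-\frac{\sigma_1}{\sigma_2}\rho} S_2(t_2)^{-\frac{\sigma_2}{\sigma_1}\rho} f_2(t_2)\, A^{-\frac{\rho}{\sigma_1\sigma_2}-1} \\
& \times \Bigl(-\Bigl(1-\frac{\sigma_2}{\sigma_1}\rho\Bigr) S_1(t_1)^{-\sigma_1^2} - S_2(t_2)^{-\sigma_2^2} + \Bigl(1-\frac{\sigma_2}{\sigma_1}\rho\Bigr)\Bigr),
\end{aligned}
\]
and finally, the second derivative
\[
\begin{aligned}
S_{t_1 t_2}(t_1, t_2) ={}& \Bigl(1-\frac{\sigma_1}{\sigma_2}\rho\Bigr)\Bigl(1-\frac{\sigma_2}{\sigma_1}\rho\Bigr) S_1(t_1)^{-\frac{\sigma_1}{\sigma_2}\rho} f_1(t_1)\, S_2(t_2)^{-\frac{\sigma_2}{\sigma_1}\rho} f_2(t_2)\, A^{-\frac{\rho}{\sigma_1\sigma_2}} \\
& + \Bigl(1-\frac{\sigma_1}{\sigma_2}\rho\Bigr)\frac{\sigma_2}{\sigma_1}\rho\, S_1(t_1)^{-\frac{\sigma_1}{\sigma_2}\rho} f_1(t_1)\, S_2(t_2)^{-\frac{\sigma_2}{\sigma_1}\rho-\sigma_2^2} f_2(t_2)\, A^{-\frac{\rho}{\sigma_1\sigma_2}-1} \\
& + \Bigl(1-\frac{\sigma_2}{\sigma_1}\rho\Bigr)\frac{\sigma_1}{\sigma_2}\rho\, S_1(t_1)^{-\frac{\sigma_1}{\sigma_2}\rho-\sigma_1^2} f_1(t_1)\, S_2(t_2)^{-\frac{\sigma_2}{\sigma_1}\rho} f_2(t_2)\, A^{-\frac{\rho}{\sigma_1\sigma_2}-1} \\
& + \rho(\rho+\sigma_1\sigma_2)\, S_1(t_1)^{-\frac{\sigma_1}{\sigma_2}\rho-\sigma_1^2} f_1(t_1)\, S_2(t_2)^{-\frac{\sigma_2}{\sigma_1}\rho-\sigma_2^2} f_2(t_2)\, A^{-\frac{\rho}{\sigma_1\sigma_2}-2} \\
={}& S_1(t_1)^{-\frac{\sigma_1}{\sigma_2}\rho} f_1(t_1)\, S_2(t_2)^{-\frac{\sigma_2}{\sigma_1}\rho} f_2(t_2)\, A^{-\frac{\rho}{\sigma_1\sigma_2}-2} \\
& \times \Bigl[\Bigl(1-\frac{\sigma_1}{\sigma_2}\rho\Bigr)\Bigl(1-\frac{\sigma_2}{\sigma_1}\rho\Bigr) A^2
 + \Bigl(1-\frac{\sigma_1}{\sigma_2}\rho\Bigr)\frac{\sigma_2}{\sigma_1}\rho\, S_2(t_2)^{-\sigma_2^2} A \\
& \qquad + \Bigl(1-\frac{\sigma_2}{\sigma_1}\rho\Bigr)\frac{\sigma_1}{\sigma_2}\rho\, S_1(t_1)^{-\sigma_1^2} A
 + \rho(\rho+\sigma_1\sigma_2)\, S_1(t_1)^{-\sigma_1^2} S_2(t_2)^{-\sigma_2^2}\Bigr].
\end{aligned}
\]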
Plugging these expressions into formula (A.2) gives the likelihood function of
the correlated gamma frailty model. It is easy to see that, even in the simple
bivariate case, the likelihood becomes formidable because of the necessary
derivatives. The problem increases with larger cluster size. Furthermore, the
maximum likelihood method is only applicable in the parametric case with full
specification of the baseline hazard function. Therefore, more sophisticated
strategies for parameter estimation are necessary, especially in the case of
cluster size larger than two.
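Derivative formulas of this size are easy to get wrong, so a numerical cross-check can be useful. The Python sketch below codes the bivariate survival function of the correlated gamma frailty model and the factorized form of its derivative with respect to t1, then compares the latter with a central finite difference. The exponential margins, the parameter values, and all function names are hypothetical.

```python
import math


def joint_survival(S1, S2, sigma1, sigma2, rho):
    """Bivariate survival function of the correlated gamma frailty model,
    expressed through the marginal survival values S1 and S2."""
    a = sigma1 / sigma2 * rho
    b = sigma2 / sigma1 * rho
    c = rho / (sigma1 * sigma2)
    core = S1 ** (-sigma1 ** 2) + S2 ** (-sigma2 ** 2) - 1.0
    return S1 ** (1.0 - a) * S2 ** (1.0 - b) * core ** (-c)


def joint_survival_dt1(S1, f1, S2, sigma1, sigma2, rho):
    """Partial derivative with respect to t1 (factorized form)."""
    a = sigma1 / sigma2 * rho
    b = sigma2 / sigma1 * rho
    c = rho / (sigma1 * sigma2)
    core = S1 ** (-sigma1 ** 2) + S2 ** (-sigma2 ** 2) - 1.0
    bracket = -S1 ** (-sigma1 ** 2) - (1.0 - a) * S2 ** (-sigma2 ** 2) + (1.0 - a)
    return S1 ** (-a) * f1 * S2 ** (1.0 - b) * core ** (-c - 1.0) * bracket


# Hypothetical exponential margins with rates 0.5 and 0.8.
def S1_of(t):
    return math.exp(-0.5 * t)


def f1_of(t):
    return 0.5 * math.exp(-0.5 * t)


def S2_of(t):
    return math.exp(-0.8 * t)


sigma1, sigma2, rho = 0.8, 1.2, 0.5
t1, t2, h = 1.0, 0.7, 1e-6
analytic = joint_survival_dt1(S1_of(t1), f1_of(t1), S2_of(t2), sigma1, sigma2, rho)
numeric = (joint_survival(S1_of(t1 + h), S2_of(t2), sigma1, sigma2, rho)
           - joint_survival(S1_of(t1 - h), S2_of(t2), sigma1, sigma2, rho)) / (2.0 * h)
```

For ρ = 0 the survival function reduces to the product S1(t1)S2(t2), which gives a second quick sanity check.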
The derivatives with respect to the first and second event time can be easily
calculated. For brevity, write
\[
u_j = 1 - \frac{\gamma\sigma^2}{1-\gamma}\ln S(t_j) \quad (j = 1, 2), \qquad
B = u_1^{1/\gamma} + u_2^{1/\gamma} - 1, \qquad
E = \exp\Bigl\{\frac{\rho(1-\gamma)}{\gamma\sigma^2}\bigl(1 - B^{\gamma}\bigr)\Bigr\}.
\]
Then
\[
\begin{aligned}
S_{t_1}(t_1, t_2) ={}& -(1-\rho)\, f(t_1) S(t_1)^{-\rho} S(t_2)^{1-\rho}\, E \\
& - \rho\, f(t_1) S(t_1)^{-\rho} S(t_2)^{1-\rho}\, u_1^{\frac{1}{\gamma}-1}\, E\, B^{\gamma-1}
\end{aligned}
\]
and
\[
\begin{aligned}
S_{t_2}(t_1, t_2) ={}& -(1-\rho)\, f(t_2) S(t_1)^{1-\rho} S(t_2)^{-\rho}\, E \\
& - \rho\, f(t_2) S(t_1)^{1-\rho} S(t_2)^{-\rho}\, u_2^{\frac{1}{\gamma}-1}\, E\, B^{\gamma-1} .
\end{aligned}
\]
\[
\begin{aligned}
S_{t_1 t_2}(t_1, t_2) ={}& (1-\rho)^2 f(t_1) S(t_1)^{-\rho} f(t_2) S(t_2)^{-\rho}\, E \\
& + \rho(1-\rho)\, f(t_1) S(t_1)^{-\rho} f(t_2) S(t_2)^{-\rho}\, u_1^{\frac{1}{\gamma}-1}\, E\, B^{\gamma-1} \\
& + \rho(1-\rho)\, f(t_1) S(t_1)^{-\rho} f(t_2) S(t_2)^{-\rho}\, u_2^{\frac{1}{\gamma}-1}\, E\, B^{\gamma-1} \\
& + \rho^2 f(t_1) S(t_1)^{-\rho} f(t_2) S(t_2)^{-\rho}\, E\, B^{2\gamma-2}\, u_1^{\frac{1}{\gamma}-1} u_2^{\frac{1}{\gamma}-1} \\
& + \rho\sigma^2 f(t_1) S(t_1)^{-\rho} f(t_2) S(t_2)^{-\rho}\, E\, B^{\gamma-2}\, u_1^{\frac{1}{\gamma}-1} u_2^{\frac{1}{\gamma}-1} .
\end{aligned}
\]
with
\[
\begin{aligned}
I(w_2) &= \frac{1}{\sqrt{2\pi}\, s_1\sqrt{1-r^2}} \int e^{-w_1^2 M_0(t_1)} \exp\Bigl\{-\frac{1}{1-r^2}\Bigl(\frac{(w_1-m_1)^2}{2s_1^2} - r\, w_1 \frac{w_2-m_2}{s_1 s_2}\Bigr)\Bigr\}\, dw_1 \\
&= \frac{1}{\sqrt{2\pi}\, s_1\sqrt{1-r^2}} \int \exp\Bigl\{-\frac{D_1 s_2 w_1^2 - 2 m_1 s_2 w_1 - 2r(w_2-m_2) s_1 w_1 + m_1^2 s_2}{2 s_1^2 (1-r^2) s_2}\Bigr\}\, dw_1 \\
&= \frac{1}{\sqrt{D_1}} \exp\Bigl\{\frac{\bar{w}_1^2 - m_1^2/D_1}{2 s_1^2(1-r^2)/D_1}\Bigr\}
 \times \frac{\sqrt{D_1}}{\sqrt{2\pi}\, s_1\sqrt{1-r^2}} \int \exp\Bigl\{-\frac{w_1^2 - 2\bar{w}_1 w_1 + \bar{w}_1^2}{2 s_1^2(1-r^2)/D_1}\Bigr\}\, dw_1 ,
\end{aligned}
\]
where the abbreviations
\[
D_1 = 1 + 2 s_1^2 (1-r^2) M_0(t_1), \qquad \bar{w}_1 = \frac{m_1 + r(w_2-m_2)\frac{s_1}{s_2}}{D_1}
\]
are used for recurring terms. Because of
\[
\frac{\sqrt{D_1}}{\sqrt{2\pi}\, s_1\sqrt{1-r^2}} \int \exp\Bigl\{-\frac{(w_1 - \bar{w}_1)^2}{2 s_1^2(1-r^2)/D_1}\Bigr\}\, dw_1 = 1,
\]
it holds that
\[
\begin{aligned}
I(w_2) &= \frac{1}{\sqrt{D_1}} \exp\Bigl\{\frac{-2 s_1^2(1-r^2) M_0(t_1) m_1^2 + 2 r m_1 (w_2-m_2)\frac{s_1}{s_2} + r^2 (w_2-m_2)^2 \frac{s_1^2}{s_2^2}}{2 s_1^2 (1-r^2) D_1}\Bigr\} \\
&= \frac{1}{\sqrt{D_1}}\; \exp\Bigl\{-\frac{M_0(t_1) m_1^2}{D_1}\Bigr\}\; \exp\Bigl\{\frac{2 r m_1 (w_2-m_2)\frac{s_1}{s_2} + r^2 (w_2-m_2)^2 \frac{s_1^2}{s_2^2}}{2 s_1^2 (1-r^2) D_1}\Bigr\} \\
&= I_1\, I_2\, I_3(w_2).
\end{aligned}
\]
Now it is necessary to integrate the conditional survival function with respect
to the second random effect. Considering the foregoing three terms in more
detail, it turns out that the first two terms I1 and I2 are independent of w2 .
The third term I3 (w2 ) contains w2 and has to be included in the integral with
respect to w2 . Consequently, it holds that
\[
\begin{aligned}
S(t_1, t_2) &= \frac{I_1 I_2}{\sqrt{2\pi}\, s_2} \int e^{-w_2^2 M_0(t_2)} \exp\Bigl\{-\frac{1}{1-r^2}\Bigl(\frac{(w_2-m_2)^2}{2 s_2^2} + r m_1 \frac{w_2-m_2}{s_1 s_2}\Bigr)\Bigr\}\, I_3(w_2)\, dw_2 \\
&= \frac{I_1 I_2}{\sqrt{2\pi}\, s_2} \int e^{-w_2^2 M_0(t_2)} \exp\Bigl\{-\frac{1}{1-r^2}\Bigl(\frac{(w_2-m_2)^2}{2 s_2^2} + r m_1 \frac{w_2-m_2}{s_1 s_2}\Bigr)\Bigr\} \\
&\qquad \times \exp\Bigl\{\frac{2 r m_1 (w_2-m_2)\frac{s_1}{s_2} + r^2 (w_2-m_2)^2 \frac{s_1^2}{s_2^2}}{2 s_1^2 (1-r^2) D_1}\Bigr\}\, dw_2 \\
&= \frac{I_1 I_2}{\sqrt{2\pi}\, s_2} \int e^{-w_2^2 M_0(t_2)} \exp\Bigl\{-\frac{(w_2-m_2)^2 s_1^2 D_1}{2 s_1^2 (1-r^2) s_2^2 D_1}\Bigr\} \\
&\qquad \times \exp\Bigl\{-\frac{2 r m_1 s_1 s_2 D_1 (w_2-m_2) - 2 r m_1 s_1 s_2 (w_2-m_2) - r^2 (w_2-m_2)^2 s_1^2}{2 s_1^2 (1-r^2) s_2^2 D_1}\Bigr\}\, dw_2 \\
&= \frac{I_1 I_2}{\sqrt{2\pi}\, s_2} \int \exp\Bigl\{-\frac{(1+2 s_1^2 M_0(t_1))(w_2-m_2)^2 + 2 s_2^2 D_1 M_0(t_2) w_2^2}{2 s_2^2 D_1}\Bigr\} \\
&\qquad \times \exp\Bigl\{-\frac{4 r m_1 s_1 s_2 M_0(t_1)(w_2-m_2)}{2 s_2^2 D_1}\Bigr\}\, dw_2 \\
&= \frac{I_1 I_2}{\sqrt{2\pi}\, s_2} \int \exp\Bigl\{-\frac{(1+2 s_1^2 M_0(t_1) + 2 s_2^2 D_1 M_0(t_2))\, w_2^2}{2 s_2^2 D_1}\Bigr\}
 \exp\Bigl\{\frac{2 K w_2}{2 s_2^2 D_1}\Bigr\}\, dw_2 \\
&\qquad \times \exp\Bigl\{-\frac{(1+2 s_1^2 M_0(t_1)) m_2^2 - 4 r m_1 m_2 s_1 s_2 M_0(t_1)}{2 s_2^2 D_1}\Bigr\},
\end{aligned}
\]
with $K = (1+2 s_1^2 M_0(t_1)) m_2 - 2 r m_1 s_1 s_2 M_0(t_1)$. Let
\[
I_4 = \exp\Bigl\{-\frac{(1+2 s_1^2 M_0(t_1)) m_2^2 - 4 r m_1 m_2 s_1 s_2 M_0(t_1)}{2 s_2^2 D_1}\Bigr\},
\]
then, with $D_{12} = 1 + 2 s_1^2 M_0(t_1) + 2 s_2^2 M_0(t_2) + 4 s_1^2 s_2^2 (1-r^2) M_0(t_1) M_0(t_2) = 1 + 2 s_1^2 M_0(t_1) + 2 s_2^2 D_1 M_0(t_2)$,
\[
\begin{aligned}
S(t_1, t_2) &= \frac{I_1 I_2 I_4}{\sqrt{2\pi}\, s_2} \int \exp\Bigl\{-\frac{D_{12}\, w_2^2}{2 s_2^2 D_1}\Bigr\} \exp\Bigl\{\frac{2 K w_2}{2 s_2^2 D_1}\Bigr\}\, dw_2 \\
&= \frac{I_1 I_2 I_4}{\sqrt{2\pi}\, s_2} \int \exp\Bigl\{-\frac{w_2^2 - 2 K w_2 / D_{12}}{2 s_2^2 D_1 / D_{12}}\Bigr\}\, dw_2 \\
&= I_1 I_2 I_4\, \frac{\sqrt{D_1}}{\sqrt{D_{12}}}\, \exp\Bigl\{\frac{(K/D_{12})^2}{2 s_2^2 D_1 / D_{12}}\Bigr\} .
\end{aligned}
\]
With
\[
I_5 = \frac{1}{\sqrt{1 + 2 s_1^2 M_0(t_1) + 2 s_2^2 M_0(t_2) + 4 s_1^2 s_2^2 (1-r^2) M_0(t_1) M_0(t_2)}} = \frac{1}{\sqrt{D_{12}}}
\]
and $I_1 \sqrt{D_1} = 1$, it follows that
\[
\begin{aligned}
S(t_1, t_2) &= I_2 I_5 \exp\Bigl\{\frac{K^2}{2 s_2^2 D_1 D_{12}}\Bigr\}
 \exp\Bigl\{-\frac{\bigl((1+2 s_1^2 M_0(t_1)) m_2^2 - 4 r m_1 m_2 s_1 s_2 M_0(t_1)\bigr) D_{12}}{2 s_2^2 D_1 D_{12}}\Bigr\} \\
&= I_2 I_5 \exp\Bigl\{\frac{2 r^2 m_1^2 s_1^2 M_0(t_1)^2}{D_1 D_{12}}\Bigr\}
 \exp\Bigl\{-\frac{m_2^2 M_0(t_2)(1+2 s_1^2 M_0(t_1)) - 4 r m_1 m_2 s_1 s_2 M_0(t_1) M_0(t_2)}{D_{12}}\Bigr\} \\
&= \frac{1}{\sqrt{D_{12}}} \exp\Bigl\{-\frac{m_2^2 M_0(t_2)(1+2 s_1^2 M_0(t_1)) - 4 r m_1 m_2 s_1 s_2 M_0(t_1) M_0(t_2) + m_1^2 M_0(t_1)(1+2 s_2^2 M_0(t_2))}{D_{12}}\Bigr\} .
\end{aligned}
\]
In the case of independence r = 0, this results in
−M0 (t1 )m1 2 −M0 (t2 )m2 2
1 1+2s2 M0 (t1 )
1 1+2s2 M0 (t2 )
p e 1 p e 2 .
1 + 2s21 M0 (t1 ) 1 + 2s22 M0 (t2 )
We now consider the case $m_1 = m_2 = 0$ and $s_1^2 = s_2^2 = s^2$, but $r \neq 0$.
For this special case, it is straightforward to derive the log-likelihood function,
which requires the first and second derivatives of the unconditional survival function
$S(t_1, t_2)$. Using the relation $S(t) = (1 + 2s^2 M_0(t))^{-1/2}$, the bivariate survival
function becomes, in the copula form,
\[
S(t_1, t_2) = \left[ S(t_1)^{-2} S(t_2)^{-2} - r^2 (S(t_1)^{-2} - 1)(S(t_2)^{-2} - 1) \right]^{-1/2}
\]
\[
S_{t_1}(t_1, t_2) = -\left[ S(t_1)^{-2} S(t_2)^{-2} - r^2 (S(t_1)^{-2} - 1)(S(t_2)^{-2} - 1) \right]^{-3/2}
f(t_1) S(t_1)^{-3} \left( (1 - r^2) S(t_2)^{-2} + r^2 \right)
\]
\[
S_{t_2}(t_1, t_2) = -\left[ S(t_1)^{-2} S(t_2)^{-2} - r^2 (S(t_1)^{-2} - 1)(S(t_2)^{-2} - 1) \right]^{-3/2}
f(t_2) S(t_2)^{-3} \left( (1 - r^2) S(t_1)^{-2} + r^2 \right).
\]
For the second derivative, it holds that
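\[
\begin{aligned}
S_{t_1 t_2}(t_1, t_2)
&= 3 \left[ S(t_1)^{-2} S(t_2)^{-2} - r^2 (S(t_1)^{-2} - 1)(S(t_2)^{-2} - 1) \right]^{-5/2} \\
&\quad \times f(t_1) S(t_1)^{-3} f(t_2) S(t_2)^{-3} \left( (1 - r^2) S(t_1)^{-2} + r^2 \right) \left( (1 - r^2) S(t_2)^{-2} + r^2 \right) \\
&\quad - 2 (1 - r^2) \left[ S(t_1)^{-2} S(t_2)^{-2} - r^2 (S(t_1)^{-2} - 1)(S(t_2)^{-2} - 1) \right]^{-3/2} f(t_1) S(t_1)^{-3} f(t_2) S(t_2)^{-3} \\
&= \left[ S(t_1)^{-2} S(t_2)^{-2} - r^2 (S(t_1)^{-2} - 1)(S(t_2)^{-2} - 1) \right]^{-5/2} f(t_1) S(t_1)^{-3} f(t_2) S(t_2)^{-3} \\
&\quad \times \left[ \left( (1 - r^2) S(t_1)^{-2} + r^2 \right) \left( (1 - r^2) S(t_2)^{-2} + r^2 \right) + 2 r^2 \right].
\end{aligned}
\]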
This expression can now be used in the bivariate likelihood function (A.2).
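The copula form and its mixed second derivative can be checked numerically. The sketch below is only an illustration: it assumes the simplest cumulative baseline hazard M0(t) = t (this choice, like all function names, is an assumption made for the example), so that S(t) = (1 + 2 s^2 t)^(-1/2) and f(t) = s^2 (1 + 2 s^2 t)^(-3/2). It compares the closed-form second derivative with a central finite difference of S(t1, t2).

```python
def marginal_survival(t, s2):
    """S(t) = (1 + 2 s^2 M0(t))^(-1/2) under the assumed baseline M0(t) = t."""
    return (1.0 + 2.0 * s2 * t) ** -0.5


def marginal_density(t, s2):
    """f(t) = -dS/dt for the marginal above."""
    return s2 * (1.0 + 2.0 * s2 * t) ** -1.5


def copula_core(t1, t2, s2, r):
    """S(t1)^-2 S(t2)^-2 - r^2 (S(t1)^-2 - 1)(S(t2)^-2 - 1)."""
    u = marginal_survival(t1, s2) ** -2
    v = marginal_survival(t2, s2) ** -2
    return u * v - r**2 * (u - 1.0) * (v - 1.0)


def joint_survival(t1, t2, s2, r):
    return copula_core(t1, t2, s2, r) ** -0.5


def joint_second_derivative(t1, t2, s2, r):
    """Closed-form mixed derivative of S(t1, t2) with respect to t1 and t2."""
    S1, S2 = marginal_survival(t1, s2), marginal_survival(t2, s2)
    f1, f2 = marginal_density(t1, s2), marginal_density(t2, s2)
    bracket = (((1.0 - r**2) * S1**-2 + r**2)
               * ((1.0 - r**2) * S2**-2 + r**2) + 2.0 * r**2)
    core = copula_core(t1, t2, s2, r)
    return core ** -2.5 * f1 * S1**-3 * f2 * S2**-3 * bracket


def finite_difference_mixed(t1, t2, s2, r, h=1e-4):
    """Central finite-difference approximation of the same mixed derivative."""
    S = joint_survival
    return (S(t1 + h, t2 + h, s2, r) - S(t1 + h, t2 - h, s2, r)
            - S(t1 - h, t2 + h, s2, r) + S(t1 - h, t2 - h, s2, r)) / (4.0 * h * h)
```

For r = 0 the closed form reduces to f(t1) f(t2), as expected under independence.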
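\[
\operatorname{cov}(Z_1, Z_3) = \operatorname{cov}\Big( \frac{\lambda_0}{\lambda_1} V_1 + V_2 + V_5,\; V_2 + V_6 + \frac{\lambda_0}{\lambda_1} V_8 \Big) = \mathbf{V}(V_2) = \frac{k_2}{\lambda_1^2}.
\]
Consequently, the correlation coefficient between $Z_1$ and $Z_3$ is
\[
\rho_1 = \operatorname{corr}(Z_1, Z_3) = \frac{\operatorname{cov}(Z_1, Z_3)}{\sqrt{\mathbf{V}(Z_1)\,\mathbf{V}(Z_3)}} = \frac{k_2}{\lambda_1} = k_2 \sigma_1^2. \tag{A.3}
\]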
Here, $\rho_1$ is the correlation between the frailties of twins with respect to the first
risk. By similar calculations it is easy to obtain $\rho_2 = \operatorname{corr}(Z_2, Z_4) = k_3 \sigma_2^2$,
which describes the correlation between frailties of twin partners regarding the
second competing risk. In the following we calculate the correlation coefficient
$\rho = \operatorname{corr}(Z_1, Z_2) = \operatorname{corr}(Z_3, Z_4)$, describing the correlation between the
frailty terms of the two competing risks. It holds that
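\[
\operatorname{cov}(Z_1, Z_2) = \operatorname{cov}\Big( \frac{\lambda_0}{\lambda_1} V_1 + V_2 + V_5,\; \frac{\lambda_0}{\lambda_2} V_1 + V_3 + V_4 \Big) = \frac{\lambda_0^2}{\lambda_1 \lambda_2} \mathbf{V}(V_1) = \frac{k_1}{\lambda_1 \lambda_2}.
\]
The correlation coefficient between $Z_1$ and $Z_2$ (and between $Z_3$ and $Z_4$) is
given by
\[
\rho = \operatorname{corr}(Z_1, Z_2) = \frac{\operatorname{cov}(Z_1, Z_2)}{\sqrt{\mathbf{V}(Z_1)\,\mathbf{V}(Z_2)}} = \frac{k_1}{\sqrt{\lambda_1 \lambda_2}} = k_1 \sigma_1 \sigma_2. \tag{A.4}
\]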
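Consequently, $k_1 + k_2 + k_5 = \frac{1}{\sigma_1^2}$, $k_1 + k_3 + k_4 = \frac{1}{\sigma_2^2}$, and relations (A.3) and
(A.4) result in
\[
k_5 = \frac{1}{\sigma_1^2} - k_2 - k_1 = \frac{1}{\sigma_1^2} - \frac{\rho_1}{\sigma_1^2} - \frac{\rho}{\sigma_1 \sigma_2}
\]
and
\[
k_4 = \frac{1}{\sigma_2^2} - k_3 - k_1 = \frac{1}{\sigma_2^2} - \frac{\rho_2}{\sigma_2^2} - \frac{\rho}{\sigma_1 \sigma_2}.
\]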
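The relations (A.3) and (A.4), together with the two sum constraints, determine the gamma shape parameters from the frailty standard deviations and the three correlations. The following Python sketch spells out that mapping. The function name and the returned dictionary are hypothetical, and the rates λ1 = 1/σ1², λ2 = 1/σ2² are inferred from k2/λ1 = k2 σ1² in (A.3) rather than stated explicitly in this passage. Because shape parameters cannot be negative, the sketch rejects correlation combinations that would make k4 or k5 negative.

```python
def gamma_shape_parameters(sigma1, sigma2, rho1, rho2, rho):
    """Map (sigma1, sigma2, rho1, rho2, rho) to the shape parameters k1..k5."""
    lam1 = 1.0 / sigma1**2  # inferred from (A.3): k2 / lam1 = k2 * sigma1^2
    lam2 = 1.0 / sigma2**2
    k1 = rho / (sigma1 * sigma2)   # from (A.4)
    k2 = rho1 / sigma1**2          # from (A.3)
    k3 = rho2 / sigma2**2          # analogue of (A.3) for the second risk
    k5 = lam1 - k2 - k1            # from k1 + k2 + k5 = 1 / sigma1^2
    k4 = lam2 - k3 - k1            # from k1 + k3 + k4 = 1 / sigma2^2
    shapes = {"k1": k1, "k2": k2, "k3": k3, "k4": k4, "k5": k5}
    negative = [name for name, value in shapes.items() if value < 0.0]
    if negative:
        raise ValueError(f"inadmissible correlations, negative shapes: {negative}")
    return shapes
```

The admissibility check is equivalent to requiring ρ ≤ min{(1 − ρ1) σ2/σ1, (1 − ρ2) σ1/σ2} for non-negative correlations.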
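\[
\begin{aligned}
S(t_1, y_1, t_2, y_2)
&= \mathbf{E}\, S_1(t_1)^{Z_1} S_2(y_1)^{Z_2} S_1(t_2)^{Z_3} S_2(y_2)^{Z_4} \\
&= \mathbf{E}\, e^{-V_1 \left( \frac{\lambda_0}{\lambda_1} M_{01}(t_1) + \frac{\lambda_0}{\lambda_2} M_{02}(y_1) \right)} e^{-V_2 (M_{01}(t_1) + M_{01}(t_2))} \\
&\quad \times e^{-V_3 (M_{02}(y_1) + M_{02}(y_2))} e^{-V_8 \left( \frac{\lambda_0}{\lambda_1} M_{01}(t_2) + \frac{\lambda_0}{\lambda_2} M_{02}(y_2) \right)} \\
&\quad \times e^{-V_4 M_{02}(y_1)} e^{-V_5 M_{01}(t_1)} e^{-V_6 M_{01}(t_2)} e^{-V_7 M_{02}(y_2)} \\
&= \Big( 1 + \frac{M_{01}(t_1)}{\lambda_1} + \frac{M_{02}(y_1)}{\lambda_2} \Big)^{-k_1} \Big( 1 + \frac{M_{01}(t_1)}{\lambda_1} + \frac{M_{01}(t_2)}{\lambda_1} \Big)^{-k_2} \\
&\quad \times \Big( 1 + \frac{M_{02}(y_1)}{\lambda_2} + \frac{M_{02}(y_2)}{\lambda_2} \Big)^{-k_3} \Big( 1 + \frac{M_{01}(t_2)}{\lambda_1} + \frac{M_{02}(y_2)}{\lambda_2} \Big)^{-k_1} \\
&\quad \times \Big( 1 + \frac{M_{02}(y_1)}{\lambda_2} \Big)^{-k_4} \Big( 1 + \frac{M_{01}(t_1)}{\lambda_1} \Big)^{-k_5} \Big( 1 + \frac{M_{01}(t_2)}{\lambda_1} \Big)^{-k_5} \Big( 1 + \frac{M_{02}(y_2)}{\lambda_2} \Big)^{-k_4} \\
&= \big( S_1(t_1)^{-\sigma_1^2} + S_1(t_2)^{-\sigma_1^2} - 1 \big)^{-\rho_1 / \sigma_1^2} \big( S_2(y_1)^{-\sigma_2^2} + S_2(y_2)^{-\sigma_2^2} - 1 \big)^{-\rho_2 / \sigma_2^2} \\
&\quad \times \big( S_1(t_1)^{-\sigma_1^2} + S_2(y_1)^{-\sigma_2^2} - 1 \big)^{-\rho / (\sigma_1 \sigma_2)} \big( S_1(t_2)^{-\sigma_1^2} + S_2(y_2)^{-\sigma_2^2} - 1 \big)^{-\rho / (\sigma_1 \sigma_2)} \\
&\quad \times S_2(y_1)^{1 - \rho_2 - \frac{\sigma_2}{\sigma_1} \rho}\, S_1(t_1)^{1 - \rho_1 - \frac{\sigma_1}{\sigma_2} \rho}\, S_1(t_2)^{1 - \rho_1 - \frac{\sigma_1}{\sigma_2} \rho}\, S_2(y_2)^{1 - \rho_2 - \frac{\sigma_2}{\sigma_1} \rho}.
\end{aligned}
\]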
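The final expression depends on the data only through the four marginal survival values, so it is easy to evaluate directly. The Python sketch below does this under the assumption that the caller supplies S1(t1), S2(y1), S1(t2), S2(y2) as numbers in (0, 1]. The function and argument names are illustrative and not taken from the text.

```python
def four_dim_survival(S1_t1, S2_y1, S1_t2, S2_y2, sigma1, sigma2, rho1, rho2, rho):
    """Evaluate S(t1, y1, t2, y2) from marginal survival values (a sketch)."""
    v1, v2 = sigma1**2, sigma2**2
    # Exponents of the four single-margin factors.
    a1 = 1.0 - rho1 - (sigma1 / sigma2) * rho
    a2 = 1.0 - rho2 - (sigma2 / sigma1) * rho
    # Pairwise terms: same risk across twins, then both risks within each twin.
    risk1_pair = S1_t1**-v1 + S1_t2**-v1 - 1.0
    risk2_pair = S2_y1**-v2 + S2_y2**-v2 - 1.0
    twin1_pair = S1_t1**-v1 + S2_y1**-v2 - 1.0
    twin2_pair = S1_t2**-v1 + S2_y2**-v2 - 1.0
    return (risk1_pair ** (-rho1 / v1)
            * risk2_pair ** (-rho2 / v2)
            * (twin1_pair * twin2_pair) ** (-rho / (sigma1 * sigma2))
            * S1_t1**a1 * S2_y1**a2 * S1_t2**a1 * S2_y2**a2)
```

Two sanity checks follow from the formula: with all three correlations equal to zero the result is the product of the four marginals, and setting three of the four marginal values to 1 returns the remaining marginal.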
To derive the likelihood function of the bivariate dependent competing risk
data, it is necessary to keep in mind that the above four-dimensional model
is not fully observable. Only the minimum of the three competing risk times
(tj , yj , cj ) is observable for twin j (j = 1, 2). Because of (5.23), the likelihood
contribution of the truncated data takes the form
\[
\begin{aligned}
\Big[\; & 1(\delta_1 = 1, \delta_2 = 1)\, S_{t_1 t_2}(t_1, t_1, t_2, t_2) \\
& + 1(\delta_1 = 1, \delta_2 = 0)\, S_{t_1}(t_1, t_1, c_2, c_2) \\
& + 1(\delta_1 = 0, \delta_2 = 1)\, S_{t_2}(c_1, c_1, t_2, t_2) \\
& + 1(\delta_1 = 0, \delta_2 = 0)\, S(c_1, c_1, c_2, c_2) \\
& + 1(\delta_1 = -1, \delta_2 = -1)\, S_{y_1 y_2}(y_1, y_1, y_2, y_2) \\
& + 1(\delta_1 = -1, \delta_2 = 0)\, S_{y_1}(y_1, y_1, c_2, c_2) \\
& + 1(\delta_1 = 0, \delta_2 = -1)\, S_{y_2}(c_1, c_1, y_2, y_2) \\
& + 1(\delta_1 = 1, \delta_2 = -1)\, S_{t_1 y_2}(t_1, t_1, y_2, y_2) \\
& + 1(\delta_1 = -1, \delta_2 = 1)\, S_{y_1 t_2}(y_1, y_1, t_2, t_2) \;\Big] \Big/ S(t^{+}, t^{+}, t^{+}, t^{+}),
\end{aligned}
\]
where $t^{+}$ denotes the truncation time. In the following the derivatives of the
survival function needed for the likelihood expression are calculated.
St1 (t1 , y1 , t2 , y2 )
ρ1 ρ2
2 2 − −1 2 2 −
= −ρ1 (S1 (t1 )−σ1 + S1 (y1 )−σ1 − 1) (S2 (t2 )−σ2 + S2 (y2 )−σ2 − 1)
2
σ1 2
σ2
2 2 ρ 2 2 ρ
× (S1 (t1 )−σ1 + S2 (t2 )−σ2 − 1)− σ1 σ2 (S1 (y1 )−σ1 + S2 (y2 )−σ2 − 1)− σ1 σ2
σ1 2 σ1 σ2 σ2
× S1 (t1 )1−ρ1 − σ2 ρ−σ1 µ1 (t1 )S1 (y1 )1−ρ1 − σ2 ρ S2 (t2 )1−ρ2 − σ1 ρ S2 (y2 )1−ρ2 − σ1 ρ
ρ ρ
σ1 2 2 − 12 2 2 − 22
− ρ(S1 (t1 )−σ1 + S1 (y1 )−σ1 − 1) σ1 (S2 (t2 )−σ2 + S2 (y2 )−σ2 − 1) σ2
σ2
2 2 − σ ρσ −1 2 2 −σ ρ
× (S1 (t1 )−σ1 + S2 (t2 )−σ2 − 1) 1 2 (S1 (y1 )−σ1 + S2 (y2 )−σ2 − 1) 1 σ2
σ σ σ σ
1−ρ1 − σ1 ρ−σ12 1−ρ1 − σ1 ρ 1−ρ2 − σ2 ρ 1−ρ2 − σ2 ρ
× S1 (t1 ) 2 µ1 (t1 )S1 (y1 ) 2 S2 (t2 ) 1 S2 (y2 ) 1
ρ
σ1 2 2 − 12
− (1 − ρ1 − ρ)σ12 (S1 (t1 )−σ1 + S1 (y1 )−σ1 − 1) σ1
σ2
ρ2
2 2 − 2 2 ρ
× (S2 (t2 )−σ2 + S2 (y2 )−σ2 − 1) (S1 (t1 )−σ1 + S2 (t2 )−σ2 − 1)− σ1 σ2
2
σ2
2 2 ρ
× (S1 (y1 )−σ1 + S2 (y2 )−σ2 − 1)− σ1 σ2
σ1 σ1 σ2 σ2
ρ 1−ρ − ρ 1−ρ − ρ 1−ρ − ρ
× S1 (t1 )1−ρ1 − σ2 µ1 (t1 )S1 (y1 ) 1 σ2 S2 (t2 ) 2 σ1 S2 (y2 ) 2 σ1
Sy1 (t1 , y1 , t2 , y2 )
ρ1 ρ2
2 2 − −1 2 2 −
= −ρ1 (S1 (t1 )−σ1 + S1 (y1 )−σ1 − 1) σ2
1 (S2 (t2 )−σ2 + S2 (y2 )−σ2 − 1) σ2
2
2 2 − σ ρσ 2 2 −σ ρ
× (S1 (t1 )−σ1 + S2 (t2 )−σ2 − 1) 1 2 (S1 (y1 )−σ1 + S2 (y2 )−σ2 − 1) 1 σ2
σ1 σ1 2 σ2 σ2
× S1 (t1 )1−ρ1 − σ2 ρ S1 (y1 )1−ρ1 − σ2 ρ−σ1 µ1 (y1 )S2 (t2 )1−ρ2 − σ1 ρ S2 (y2 )1−ρ2 − σ1 ρ
ρ ρ
σ1 2 2 − 12 2 2 − 22
− ρ(S1 (t1 )−σ1 + S1 (y1 )−σ1 − 1) σ1 (S2 (t2 )−σ2 + S2 (y2 )−σ2 − 1) σ2
σ2
2 2 ρ 2 2 ρ
× (S1 (t1 )−σ1 + S2 (t2 )−σ2 − 1)− σ1 σ2 (S1 (y1 )−σ1 + S2 (y2 )−σ2 − 1)− σ1 σ2 −1
σ σ σ σ
1−ρ1 − σ1 ρ 1−ρ1 − σ1 ρ−σ12 1−ρ2 − σ2 ρ 1−ρ2 − σ2 ρ
× S1 (t1 ) 2 S1 (y1 ) 2 µ1 (y1 )S2 (t2 ) 1 S2 (y2 ) 1
ρ
σ1 2 2 − 12
− (1 − ρ1 − ρ)σ12 (S1 (t1 )−σ1 + S1 (y1 )−σ1 − 1) σ
1
σ2
ρ2
2 2 − 2 2 − σ ρσ
× (S2 (t2 )−σ2 + S2 (y2 )−σ2 − 1) σ2
2 (S1 (t1 )−σ1 + S2 (t2 )−σ2 − 1) 1 2
−σ12 −σ22 − σ ρσ
× (S1 (y1 ) + S2 (y2 ) − 1) 1 2
σ σ σ σ
1−ρ1 − σ1 ρ 1−ρ1 − σ1 ρ 1−ρ2 − σ2 ρ 1−ρ2 − σ2 ρ
× S1 (t1 ) 2 S1 (y1 ) 2 µ1 (y1 )S2 (t2 ) 1 S2 (y2 ) 1
St2 (t1 , y1 , t2 , y2 )
ρ1 ρ2
2 2 − 2 2 − −1
= −ρ2 (S1 (t1 )−σ1 + S1 (y1 )−σ1 − 1) σ2
1 (S2 (t2 )−σ2 + S2 (y2 )−σ2 − 1) σ2
2
ρ
2 2 − σ ρσ 2 2 −σ
× (S1 (t1 )−σ1 + S2 (t2 )−σ2 − 1) 1 2 (S1 (y1 )−σ1 + S2 (y2 )−σ2 − 1) 1 σ2
σ1 σ1 σ2 2 σ2
× S1 (t1 )1−ρ1 − σ2 ρ S1 (y1 )1−ρ1 − σ2 ρ S2 (t2 )1−ρ2 − σ1 ρ−σ2 µ2 (t2 )S2 (y2 )1−ρ2 − σ1 ρ
ρ ρ
σ2 2 2 − 12 2 2 − 22
− ρ(S1 (t1 )−σ1 + S1 (y1 )−σ1 − 1) σ1 (S2 (t2 )−σ2 + S2 (y2 )−σ2 − 1) σ2
σ1
2 2 ρ 2 2 ρ
× (S1 (t1 )−σ1 + S2 (t2 )−σ2 − 1)− σ1 σ2 −1 (S1 (y1 )−σ1 + S2 (y2 )−σ2 − 1)− σ1 σ2
σ1 σ1 σ2 2 σ2
× S1 (t1 )1−ρ1 − σ2 ρ S1 (y1 )1−ρ1 − σ2 ρ S2 (t2 )1−ρ2 − σ1 ρ−σ2 µ2 (t2 )S2 (y2 )1−ρ2 − σ1 ρ
ρ
σ2 2 2 − 12
− (1 − ρ2 − ρ)σ22 (S1 (t1 )−σ1 + S1 (y1 )−σ1 − 1) σ1
σ1
ρ2
2 2 − 2 2 − σ ρσ
× (S2 (t2 )−σ2 + S2 (y2 )−σ2 − 1) σ2
2 (S1 (t1 )−σ1 + S2 (t2 )−σ2 − 1) 1 2
−σ12 −σ22 − σ ρσ
× (S1 (y1 ) + S2 (y2 ) − 1) 1 2
σ σ σ2 σ2
1−ρ1 − σ1 ρ 1−ρ1 − σ1 ρ
× S1 (t1 ) 2 S1 (y1 ) 2 S2 (t2 )1−ρ2 − σ1 ρ µ2 (t2 )S2 (y2 )1−ρ2 − σ1 ρ
Sy2 (t1 , y1 , t2 , y2 )
ρ1 ρ2
2 2 − 2 2 − −1
= −ρ2 (S1 (t1 )−σ1 + S1 (y1 )−σ1 − 1) (S2 (t2 )−σ2 + S2 (y2 )−σ2 − 1)
2
σ1 2
σ2
2 2 ρ 2 2 ρ
× (S1 (t1 )−σ1 + S2 (t2 )−σ2 − 1)− σ1 σ2 (S1 (y1 )−σ1 + S2 (y2 )−σ2 − 1)− σ1 σ2
σ1 σ1 σ2 σ2 2
1−ρ −
× S1 (t1 ) 1 σ2 ρ S1 (y1 )1−ρ1 − σ2 ρ S2 (t2 )1−ρ2 − σ1 ρ S2 (y2 )1−ρ2 − σ1 ρ−σ2 µ2 (y2 )
ρ ρ
σ1 2 2 − 12 2 2 − 22
− ρ(S1 (t1 )−σ1 + S1 (y1 )−σ1 − 1) σ1 (S2 (t2 )−σ2 + S2 (y2 )−σ2 − 1) σ2
σ2
ρ
2 2 − σ ρσ 2 2 −σ −1
× (S1 (t1 )−σ1 + S2 (t2 )−σ2 − 1) 1 2 (S1 (y1 )−σ1 + S2 (y2 )−σ2 − 1) 1 σ2
σ1 σ1 σ2 σ2 2
× S1 (t1 )1−ρ1 − σ2 ρ S1 (y1 )1−ρ1 − σ2 ρ S2 (t2 )1−ρ2 − σ1 ρ S2 (y2 )1−ρ2 − σ1 ρ−σ2 µ2 (y2 )
ρ
σ2 2 2 − 12
− (1 − ρ2 − ρ)σ22 (S1 (t1 )−σ1 + S1 (y1 )−σ1 − 1) σ1
σ1
ρ2
2 2 − 2 2 ρ
× (S2 (t2 )−σ2 + S2 (y2 )−σ2 − 1) (S1 (t1 )−σ1 + S2 (t2 )−σ2 − 1)− σ1 σ2
2
σ2
2 2 − σ ρσ
× (S1 (y1 )−σ1 + S2 (y2 )−σ2 − 1) 1 2
σ σ σ2 σ2
1−ρ1 − σ1 ρ 1−ρ1 − σ1 ρ
× S1 (t1 ) 2 S1 (y1 ) 2 S2 (t2 )1−ρ2 − σ1 ρ S2 (y2 )1−ρ2 − σ1 ρ µ2 (y2 )
σ2 σ2
σ1 2 2 2 2 2 2
+ ρρ1 S1 (t1 )−σ1 S1 (y1 )−σ1 (S1 (t1 )−σ1 + S1 (y1 )−σ1 − 1)(S1 (y1 )−σ1 + S2 (y2 )−σ2 − 1)
σ2
σ1 σ1 2 2 2 2 2
+ ρ(1 − ρ1 − ρ) S1 (t1 )−σ1 (S1 (t1 )−σ1 + S1 (y1 )−σ1 − 1)2 (S1 (y1 )−σ1 + S2 (y2 )−σ2 − 1)
σ2 σ2
2 2 2 2 2 2
+ ρ1 (ρ1 + σ12 )S1 (t1 )−σ1 S1 (y1 )−σ1 (S1 (t1 )−σ1 + S2 (t2 )−σ2 − 1)(S1 (y1 )−σ1 + S2 (y2 )−σ2 − 1)
σ1 2 2 2
+ ρ1 (1 − ρ1 − ρ)S1 (t1 )−σ1 (S1 (t1 )−σ1 + S1 (y1 )−σ1 − 1)
σ2
2 2 2 2
× (S1 (t1 )−σ1 + S2 (t2 )−σ2 − 1)(S1 (y1 )−σ1 + S2 (y2 )−σ2 − 1)
σ1 2 2 2
+ ρ1 (1 − ρ1 − ρ)S1 (y1 )−σ1 (S1 (t1 )−σ1 + S1 (y1 )−σ1 − 1)
σ2
2 2 2 2
× (S1 (t1 )−σ1 + S2 (t2 )−σ2 − 1)(S1 (y1 )−σ1 + S2 (y2 )−σ2 − 1)
σ1 2 2 2 2 2 2
i
+ (1 − ρ1 − ρ)2 (S1 (t1 )−σ1 + S1 (y1 )−σ1 − 1)2 (S1 (t1 )−σ1 + S2 (t2 )−σ2 − 1)(S1 (y1 )−σ1 + S2 (y2 )−σ2 − 1)
σ2
St1 ,y2 (t1 , y1 , t2 , y2 )
2 2 2 2 2 2
= µ1 (t1 )µ2 (y2 )(S1 (t1 )−σ1 + S1 (y1 )−σ1 − 1)−ρ1 /σ1 −1 (S2 (t2 )−σ2 + S2 (y2 )−σ2 − 1)−ρ2 /σ2 −1
2 2 2 2
× (S1 (t1 )−σ1 + S2 (t2 )−σ2 − 1)−ρ/σ1 σ2 −1 (S1 (y1 )−σ1 + S2 (y2 )−σ2 − 1)−ρ/σ1 σ2 −1
σ1 σ1 σ2 σ2
1−ρ − ρ 1−ρ − ρ 1−ρ − ρ 1−ρ − ρ
× S1 (t1 ) 1 σ2 S1 (y1 ) 1 σ2 S2 (t2 ) 2 σ1 S2 (y2 ) 2 σ1
h σ1 2 2 2 2 2 2
× ρρ2 S1 (t1 )−σ1 S2 (y2 )−σ2 (S1 (t1 )−σ1 + S1 (y1 )−σ1 − 1)(S1 (y1 )−σ1 + S2 (y2 )−σ2 − 1)
σ2
2 2 2 2 2 2
+ ρ1 ρ2 S1 (t1 )−σ1 S2 (y2 )−σ2 (S1 (t1 )−σ1 + S2 (t2 )−σ2 − 1)(S1 (y1 )−σ1 + S2 (y2 )−σ2 − 1)
2 2 2 2 2 2
+ ρ2 S2 (t2 )−σ2 S1 (y1 )−σ1 (S1 (t1 )−σ1 + S1 (y1 )−σ1 − 1)(S2 (t2 )−σ2 + S2 (y2 )−σ2 − 1)
σ2 σ1 2 2 2
+ ρ(1 − ρ2 − ρ) S1 (y1 )−σ1 (S1 (t1 )−σ1 + S1 (y1 )−σ1 − 1)
σ1 σ2
2 2 2 2
× (S2 (t2 )−σ2 + S2 (y2 )−σ2 − 1)(S1 (t1 )−σ1 + S2 (t2 )−σ2 − 1)
σ2 2 2 2 2 2 2
+ ρρ1 S2 (t2 )−σ2 S1 (y1 )−σ1 (S2 (t2 )−σ2 + S2 (y2 )−σ2 − 1)(S1 (y1 )−σ1 + S2 (y2 )−σ2 − 1)
σ1
σ1 σ2 2 2 2
+ ρ(1 − ρ1 − ρ) S2 (t2 )−σ2 (S1 (t1 )−σ1 + S1 (y1 )−σ1 − 1)
σ2 σ1
2 2 2 2
× (S2 (t2 )−σ2 + S2 (y2 )−σ2 − 1)(S1 (y1 )−σ1 + S2 (y2 )−σ2 − 1)
σ2 2 2 2 2 2 2 2
+ ρ1 (1 − ρ2 − ρ)S1 (y1 )−σ1 (S2 (t2 )−σ2 + S2 (y2 )−σ2 − 1)(S1 (t1 )−σ1 + S2 (t2 )−σ2 − 1)(S1 (y1 )−σ1 + S2 (y2 )−σ2 − 1)
σ1
σ1 σ2 2 2 2 2
+ (1 − ρ1 − ρ)(1 − ρ2 − ρ)(S1 (t1 )−σ1 + S1 (y1 )−σ1 − 1)(S2 (t2 )−σ2 + S2 (y2 )−σ2 − 1)
σ2 σ1
2 2 2 2
i
× (S1 (t1 )−σ1 + S2 (t2 )−σ2 − 1)(S1 (y1 )−σ1 + S2 (y2 )−σ2 − 1)
St2 ,y2 (t1 , y1 , t2 , y2 )
2 2 2 2 2 2
= µ2 (t2 )µ2 (y2 )(S1 (t1 )−σ1 + S1 (y1 )−σ1 − 1)−ρ1 /σ1 S2 (t2 )−σ2 + S2 (y2 )−σ2 − 1)−ρ2 /σ2 −2
2 2 2 2
× (S1 (t1 )−σ1 + S2 (t2 )−σ2 − 1)−ρ/σ1 σ2 −1 (S1 (y1 )−σ1 + S2 (y2 )−σ2 − 1)−ρ/σ1 σ2 −1
σ1 σ1 σ2 σ2
1−ρ − ρ 1−ρ − ρ 1−ρ − ρ 1−ρ − ρ
× S1 (t1 ) 1 σ2 S1 (y1 ) 1 σ2 S2 (t2 ) 2 σ1 S2 (y2 ) 2 σ1
h σ2 2 2 2 2
× ρ2 22 S2 (t2 )−σ2 S2 (y2 )−σ2 (S2 (t2 )−σ2 + S2 (y2 )−σ2 − 1)2
σ1
The additive decomposition of the frailty variance and of the correlation between
co-twins' frailties yields
\[
1 = a^2 + d^2 + i^2 + c^2 + e^2 \tag{A.6}
\]
\[
\rho = \rho_1 a^2 + \rho_2 d^2 + \rho_3 i^2 + \rho_4 c^2 + \rho_5 e^2, \tag{A.7}
\]
where the lowercase letters $a^2, d^2, i^2, c^2, e^2$ indicate the proportions of the total
variance $\sigma^2$ associated with the corresponding components of frailty, and $\rho_i$
($i = 1, \ldots, 5$) are the correlations between the respective components within a twin
pair. In this case, broad-sense heritability can be expressed as
\[
H^2 = a^2 + d^2 + i^2,
\]
where the term $a^2$ denotes narrow-sense heritability. Standard assumptions
of quantitative genetics models specify different values of $\rho_i$ ($i = 1, \ldots, 5$) for
monozygotic and dizygotic twins. In the case of monozygotic twins, $\rho_i = 1$
($i = 1, \ldots, 4$) and $\rho_5 = 0$, while for dizygotic twins $\rho_1 = 0.5$, $\rho_2 = 0.25$,
$\rho_3 = m$, $\rho_4 = 1$, $\rho_5 = 0$, where $0 \le m \le 0.25$ is an unknown parameter.
Not all parameters of the genetic decomposition of frailty can be estimated
simultaneously, even under the assumption of no epistasis ($i^2 = 0$). In this
case, it is only possible to conclude that the true heritability $H^2$ lies in the
interval (Iachine 2002)
\[
\frac{4}{3} (\rho_{MZ} - \rho_{DZ}) \le H^2 \le \min\{\rho_{MZ},\, 2(\rho_{MZ} - \rho_{DZ})\}.
\]
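The bounds are simple to compute from the two estimated correlations. The following Python sketch, with a hypothetical function name, returns the lower and upper end of the interval for given MZ and DZ frailty correlations under the no-epistasis assumption.

```python
def heritability_bounds(rho_mz, rho_dz):
    """Return (lower, upper) bounds for broad-sense heritability H^2."""
    diff = rho_mz - rho_dz
    lower = 4.0 / 3.0 * diff
    upper = min(rho_mz, 2.0 * diff)
    return lower, upper
```

For example, correlations of 0.5 (MZ) and 0.2 (DZ) give a difference of 0.3, so the lower bound is (4/3)(0.3) and the upper bound is the smaller of 0.5 and 0.6. If the lower bound exceeds the upper bound, the pair of correlations is not compatible with the decomposition assumed here.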
The width of the interval is at most ρMZ −ρDZ , which represents the potential
error in heritability estimation using only the correlation coefficients of MZ
and DZ twins. The model in fact reduces to three equations (two relations
(A.7) for monozygotic and dizygotic twins each, and one constraint (A.6)),
allowing estimation of no more than three parameters at the same time. One
possibility is to consider an ACE (additive genetic – common environment –
uncommon environment) model. In this case, equations (A.6) and (A.7) lead
to
\[
\begin{aligned}
1 &= a^2 + c^2 + e^2 \\
\rho_{MZ} &= a^2 + c^2 \\
\rho_{DZ} &= 0.5 a^2 + c^2.
\end{aligned} \tag{A.8}
\]
This system can be integrated into the correlated frailty model, giving rise
to a reparameterization of the original model. The only difference is that, if
we are interested in estimating the parameters of a genetic model, data for
monozygotic and dizygotic twins have to be analyzed simultaneously and a
likelihood function for the combined data has to be constructed. Similarly, other
genetic models can be obtained by combining no more than three components of frailty.
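System (A.8) is linear in the three variance components and can be solved in closed form: subtracting the third equation from the second gives a² = 2(ρMZ − ρDZ), and then c² = 2ρDZ − ρMZ and e² = 1 − ρMZ. The Python sketch below applies these formulas. The function name is hypothetical, and the check for negative components is an added convenience rather than part of the text.

```python
def ace_components(rho_mz, rho_dz):
    """Solve the ACE system (A.8) for (a^2, c^2, e^2)."""
    a2 = 2.0 * (rho_mz - rho_dz)   # additive genetic share
    c2 = 2.0 * rho_dz - rho_mz     # common environment share
    e2 = 1.0 - rho_mz              # non-shared environment share
    if min(a2, c2, e2) < 0.0:
        raise ValueError("correlations are not compatible with an ACE model")
    return a2, c2, e2
```

In an actual analysis these components would be estimated jointly within the likelihood for the combined MZ and DZ data, as described above; the closed form is only a convenient way to move between the two parameterizations.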
It is necessary to note that heritability estimation requires assumptions
that are often difficult to verify in practice. For example, the trait must be
represented as an additive combination of uncorrelated environmental and
genetic factors, and the variances of phenotypic traits associated with related
individuals must be the same. The classical twin method is based on the
assumption that MZ and DZ twins have the same correlation with respect
to environmental factors (equal environment assumption). This assumption
is necessary for the identifiability of heritability, that is, so as to be able to
interpret the difference in concordance between MZ and DZ twins as being
explained in full by their difference in genetic concordance. However, the
assumption is sometimes questionable: MZ twins are generally treated
more alike by their parents than DZ twins are.
This implies an overestimation of heritability, especially for behavioral traits.
This does not decrease the statistical attractiveness of this direction of
research. However, heritability estimates must be interpreted
with care, as pointed out by Feldman and Lewontin (1975).
After the age of six, death rates for Danish twins born between 1870
and 1900 are almost the same as those for the same cohorts of the Danish
population. The distributions of age at death for monozygotic twins are close
to those of dizygotic twins for both sexes (Christensen et al. 1995). Recent
papers dealing with twin cohorts born during the period 1870 – 1930 found
similar mortality patterns for Danish twins and the general Danish population
with respect to CHD (Wienke et al. 2001, Christensen et al. 2001). This
similarity suggests that it is possible to generalize genetic results from survival
analysis of twins to the total population with respect to mortality due to CHD.
References
Andersen, P.K., Klein, J.P., Zhang, M.-J. (1999) Testing for centre effects
in multi-centre survival studies: a Monte Carlo comparison of fixed and
random effects tests. Statistics in Medicine 18, 1489–1500.
Anderson, J.E., Louis, T.A. (1995) Survival analysis using a scale change
random effects model. Journal of the American Statistical Association
90, 669–679.
Anderson, J.E., Louis, T.A., Holm, N.V., Harvald, B. (1992) Time dependent
association measures for bivariate survival distributions. Journal of the
American Statistical Association 87, 641–650.
Banerjee, S., Wall, M.M., Carlin, B.P. (2003) Frailty modeling for spatially
correlated survival data, with application to infant mortality in Minnesota.
Biostatistics 4, 123–142.
Barker, P., Henderson, R. (2005) Small sample bias in the gamma frailty
model for univariate survival. Lifetime Data Analysis 11, 265–284.
Beard, R.E. (1959) Note on some mathematical mortality models. In: The
Lifespan of Animals. G.E.W. Wolstenholme, M. O'Connor (eds.), Ciba
Foundation Colloquium on Ageing, Little, Brown, Boston, 302–311.
Bellamy, S., Li, Y., Ryan, L., Lipsitz, S., Canner, M., Wright, R. (2004)
Analysis of clustered and interval censored data from a community-
based study in asthma. Statistics in Medicine 23, 3607–3621.
Berkson, J., Gage, R. (1952) Survival curve for cancer patients following
treatment. Journal of the American Statistical Association 47, 501–515.
Blumen, I., Kogan, M., McCarthy, P.J. (1955) The Industrial Mobility of
Labor as a Probability Process. Cornell University Press, New York.
Blossfeld, H.P., Hamerle, A. (1989) Unobserved heterogeneity in hazard rate
models – a test and an illustration from a study of career mobility.
Quality & Quantity 23, 129–141.
Broët, P., Moreau, T., Lellouch, J., Asselain, B. (1999) Unobserved covariates
in the two-sample comparison of survival times: a maxmin efficiency
robust test. Statistics in Medicine 18, 1791–1800.
Cai, J., Prentice, R.L. (1995) Estimating equations for hazard ratio parameters
based on correlated failure time data. Biometrika 82, 151–164.
Carvalho, M., Henderson, R., Shimakura, S., Sousa, I.P.S.C. (2003) Survival
of hemodialysis patients: modeling differences in risk of dialysis centers.
International Journal for Quality in Health Care 15, 189–196.
Cederlöf, R., Friberg, L., Jonsson, E., Kaij, L. (1961) Studies on similarity
diagnosis in twins with the aid of mailed questionnaires. Acta Genetica
11, 338–362.
Chang, I.-S., Wen, C.-C., Wu, Y.-J. (2007) A profile likelihood theory for
the correlated gamma-frailty model with current status family data.
Statistica Sinica 17, 1023–1046.
Christensen, K., Vaupel, J., Holm, N., Yashin, A. (1995) Mortality among
twins after age 6: fetal origins hypothesis versus twin method. British
Medical Journal 310, 432–436.
Christensen, K., Wienke, A., Skytthe, A., Holm, N., Vaupel, J., Yashin, A.
(2001) Cardiovascular mortality in twins and fetal origins hypothesis.
Twin Research 4, 344–349.
Chuang, S.K., Cai, T., Douglass, C.W., Wei, L.J., Dodson, T.B. (2005)
Frailty approach for the analysis of clustered failure time observations
in dental research. Journal of Dental Research 84, 54–58.
Claeskens, G., Nguti, R., Janssen, P. (2008) One-sided tests in shared frailty
models. Test 17, 69–82.
Clayton, D.G. (1978) A model for association in bivariate life tables and
its application in epidemiological studies of familial tendency in chronic
disease incidence. Biometrika 65, 141–151.
Clayton, D., Cuzick, J. (1985a) The semi-parametric Pareto model for regression
analysis of survival times. Proceedings of the 45th Session of
the International Statistical Institute 23, 1–18.
Commenges, D., Andersen, P.K. (1995) Score test of homogeneity for survival
data. Lifetime Data Analysis 1, 145–160.
Cook, R., Ng, E., Mukherjee, J., Vaughan, D. (1999) Two-state mixed renewal
processes for chronic disease. Statistics in Medicine 18, 175–188.
Cook, R.J., Lawless, J.F. (2007) The Statistical Analysis of Recurrent Events.
Springer, Berlin.
Cox, D.R. (1959) Analysis of exponentially distributed life-times with two
types of failure. Journal of the Royal Statistical Society (B) 21, 411–421.
Cox, D.R. (1972) Regression models and life-tables. Journal of the Royal
Statistical Society (B) 34, 187–220.
Cox, D.R. (1975) Partial likelihood. Biometrika 62, 269–276.
Cox, D.R., Oakes, D. (1984) Analysis of Survival Data. Chapman & Hall,
London.
Crowder, M. (1989) A multivariate distribution with Weibull connections.
Journal of the Royal Statistical Society (B) 51, 93–107.
Cui, S., Sun, Y. (2004) Checking for the gamma frailty distribution under
the marginal proportional hazards frailty model. Statistica Sinica 14,
249–267.
Davis, H.T., Feldstein, M. (1979) The generalized Pareto law as a model for
progressively censored survival data. Biometrika 66, 299–306.
de Faire, U., Friberg, L., Lundman, T. (1975) Concordance for mortality
with special reference to ischaemic heart and cerebrovascular disease: a
study on the Swedish Twin Registry. Preventive Medicine 4, 509–517.
Dempster, A.P., Laird, N.M., Rubin, D.B. (1977) Maximum likelihood from
incomplete data via the EM algorithm. Journal of the Royal Statistical
Society (B) 39, 1–38.
Diamond, I.D., McDonald, J.W., Shah, I.H. (1986) Proportional hazards
models for current status data - application to the study of differentials
in age at weaning in Pakistan. Demography 23, 607–620.
di Serio, C. (1997) The protective impact of a covariate on competing failures
with an example from bone marrow transplantation. Lifetime Data
Analysis 3, 99–122.
Dominicus, A., Skrondal, A., Gjessing, H., Pedersen, N., Palmgren, J. (2006)
Likelihood ratio tests in behavioral genetics: problems and solutions.
Behavior Genetics 36, 331–340.
dos Santos, D., Davies, R., Francis, B. (1995) Nonparametric hazard versus
nonparametric frailty distribution in modelling recurrence of breast
cancer. Journal of Statistical Planning and Inference 47, 111–127.
Drapeau, M.D., Gass, E.K., Simison, M.D., Mueller, L.D., Rose, M.R.
(2000) Testing the heterogeneity theory of late-life mortality plateaus
by using cohorts of Drosophila melanogaster. Experimental Gerontology
35, 71–84.
Drzewiecki, K.T., Ladefoged, C., Christensen, H.E. (1980a) Biopsy and prognosis
for cutaneous malignant melanomas in clinical stage I. Scandinavian
Journal of Plastic and Reconstructive Surgery and Hand Surgery
14, 141–144.
Drzewiecki, K.T., Christensen, H.E., Ladefoged, C., Poulsen, H. (1980b)
Clinical course of cutaneous malignant melanoma related to histopathological
criteria of primary tumour. Scandinavian Journal of Plastic and
Reconstructive Surgery and Hand Surgery 14, 229–234.
Duchateau, L., Janssen, P., Lindsey, P., Legrand, C., Nguti, R., Sylvester, R.
(2002) The shared frailty model and the power for heterogeneity tests in
multicenter trials. Computational Statistics & Data Analysis 40, 603–20.
Duchateau, L., Janssen, P., Kezic, I., Fortpied, C. (2003) Evolution of recurrent
asthma event rate over time in frailty models. Journal of the Royal
Statistical Society (B) 52, 355–363.
Duchateau, L., Janssen, P. (2004) Penalized partial likelihood for frailties
and smoothing splines in time to first insemination models for dairy
cows. Biometrics 60, 608–614.
Duchateau, L., Janssen, P. (2008) The Frailty Model. Springer, New York.
Dunson, D.B., Chen, Z. (2004) Selecting factors predictive of heterogeneity
in multivariate event time data. Biometrics 60, 352–358.
Economou, P., Caroni, C. (2005) Graphical tests for the assumption of
gamma and inverse Gaussian frailty distributions. Lifetime Data Analysis
11, 565–582.
Economou, P., Caroni, C. (2008) Graphical tests for the frailty distribution
in the shared frailty model. Communications in Statistics - Simulations
and Computation 37, 978–992.
Efron, B. (1967) The two sample problem with censored data. Proceedings of
the 5th Berkeley Symposium on Mathematical Statistics and Probability,
pp. 831–853. University of California Press.
Efron, B. (1977) The efficiency of Cox’s likelihood function for censored data.
Journal of the American Statistical Association 72, 557–565.
Elbers, C., Ridder, G. (1982) True and spurious duration dependence: the
identifiability of the proportional hazard model. Review of Economic
Studies XLIX, 403–409.
Ellermann, R., Sullo, P., Tien, J.M. (1992) An alternative approach to modeling
recidivism using quantile residual life functions. Operations Research
40, 485–504.
Farewell, V.T. (1977) A model for a binary variable with time-censored observations.
Biometrika 64, 43–46.
Farewell, V.T., Math, B., Math, M. (1977) The combined effect of breast
cancer risk factors. Cancer 40, 931–936.
Farewell, V.T. (1982) The use of mixture models for the analysis of survival
data with long-term survivors. Biometrics 38, 1041–1046.
Farrington, C., Kanaan, M., Gay, N. (2001) Estimation of the basic reproduction
number for infectious diseases from age-stratified serological survey
data. Applied Statistics 50, 251–292.
Feuer, E.J., Wun, L.-M., Boring, C.C., Flanders, W.D., Timmel, M.J., Tong,
T. (1993) The lifetime risk of developing breast cancer. Journal of the
National Cancer Institute 85, 892–897.
Fine, J., Glidden, D., Lee, K. (2003) A simple estimator for a shared frailty
regression model. Journal of the Royal Statistical Society (B) 65, 317–29.
Finkelstein, M. (2008) Failure Rate Modeling for Reliability and Risk. Springer,
New York.
Flinn, C., Heckman, J. (1982) New methods for analyzing structural models
of labor force dynamics. Journal of Econometrics 18, 115–168.
Gatz, M., Pedersen, N.L., Crowe, M., Fiske, A. (2000) Defining discordance
in twin studies of risk and protective factors for late life disorders. Twin
Research 3, 159–164.
Gelman, A., Rubin, D.B. (1992) Inference from iterative simulation using
multiple sequences. Statistical Science 7, 457–511.
Giard, N., Lichtenstein, P., Yashin, A. (2002) A multistate model for genetic
analysis of the ageing process. Statistics in Medicine 21, 2511–2526.
Gilks, W.R., Richardson, S., Spiegelhalter, D.G. (1996) Markov Chain Monte
Carlo in Practice. Chapman and Hall, London.
Gjessing, H.K., Aalen, O.O., Hjort, N.L. (2003) Frailty models based on
Lévy processes. Advances in Applied Probability 35, 532–550.
Glidden, D.V. (1999) Checking the adequacy of the gamma frailty model for
multivariate failure times. Biometrika 86, 381–393.
Goeman, J.J., Le Cessie, S., Baatenburg de Jong, R.J., van de Geer, S.A.
(2004) Predicting survival using disease history: a model combining
relative survival and frailty. Statistica Neerlandica 58, 21–34.
Goethals, K., Janssen, P., Duchateau, L. (2008) Frailty models and copulas:
similarities and differences. Journal of Applied Statistics 35, 1071–1079.
Gorfine, M., Zucker, D., Hsu, L. (2006) Prospective survival analysis with
a general semiparametric shared frailty model: a pseudo full likelihood
approach. Biometrika 93, 735–741.
Gray, R.J. (1995) Tests for variation over groups in survival data. Journal
of the American Statistical Association 90, 198–203.
Greenwood, M., Yule, G.U. (1920) An inquiry into the nature of frequency
distributions representative of multiple happenings with particular reference
to the occurrence of multiple attacks of disease or of repeated
accidents. Journal of the Royal Statistical Society 83, 255–279.
Gupta, R.C., Gupta, R.D. (2009) General frailty model and stochastic orderings.
Journal of Statistical Planning and Inference 139, 3277–3287.
Hastings, W.K. (1970) Monte Carlo sampling methods using Markov Chains
and their applications. Biometrika 57, 97–109.
Harvald, B., Hauge, M. (1970) Coronary occlusion in twins. Acta Geneticae
Medicae et Gemellologiae 19, 248–250.
Hauge, M. (1981) The Danish twin register. In: Prospective Longitudinal
Research. S. Mednich, A. Baert, B. Bachmann (eds.), Oxford Medical
Publisher, Oxford, pp. 217–222.
Haukka, J., Suvisaari, J., Lönnqvist, J. (2003) Increasing age does not decrease
risk of schizophrenia up to age 40. Schizophrenia Research 61,
105–110.
Heath, A., Madden, P. (1995) Genetic influences on smoking behavior. In:
Behavior: Genetic Approaches in Behavioral Medicine. J.R. Turner,
L.R. Cardon, J.K. Hewitt (eds.), Plenum Press, New York, 45–63.
Heckman, J.J., Honoré, B.E. (1989) The identifiability of the competing risks
model. Biometrika 76, 325–330.
Heckman, J.J., Singer, B. (1982a) The identification problem in econometric
models for duration data. In: Advances in Econometrics. W. Hildenbrandt
(ed.), Cambridge University Press, Cambridge, pp. 39–77.
Heckman, J.J., Singer, B. (1982b) Population heterogeneity in demographic
models. In: Multidimensional Mathematical Demography. K. Land, A.
Rogers (eds.), Academic Press, New York.
Heckman, J.J., Singer, B. (1984a) A method for minimizing the impact
of distributional assumptions in econometric models for duration data.
Econometrica 52, 271–320.
Heckman, J.J., Singer, B. (1984b) The identifiability of the proportional
hazard model. Review of Economic Studies 51, 231–241.
Heckman, J.J., Walker, J.R. (1990) Estimating fecundability from data on
waiting-times to 1st conception. Journal of the American Statistical
Association 85, 283–294.
Heckman, J.J. (1991) Identifying the hand of the past: distinguishing state
dependence from heterogeneity. American Economic Review 81, 71–79.
Heckman, J.J., Taber, C.R. (1994) Econometric mixture models and more
general models for unobservables in duration analysis. Statistical Methods
in Medical Research 3, 279–302.
Henderson, R., Oman, P. (1999) Effect of frailty on marginal regression estimates
in survival analysis. Journal of the Royal Statistical Society (B)
61, 367–379.
Hougaard, P., Harvald, B., Holm, N.V. (1992) Measuring the similarities
between the lifetimes of adult Danish twins born 1881 – 1930. Journal
of the American Statistical Association 87, 17–24.
Hougaard, P. (1995) Frailty models for survival data. Lifetime Data Analysis
1, 255–273.
Huang, X., Wolfe, R.A. (2002) A frailty model for informative censoring.
Biometrics 58, 510–520.
Huang, X., Wolfe, R.A., Hu, C. (2004) A test for informative censoring in
clustered survival data. Statistics in Medicine 23, 2089–2107.
Huber-Carol, C., Vonta, I. (2004) Frailty models for arbitrarily censored and
truncated data. Lifetime Data Analysis 10, 369–388.
Iachine, I.A. (2002) The Use of Twin and Family Survival Data in the
Population Studies of Aging: Statistical Methods Based on Multivariate
Survival Models. Ph.D. Thesis. Monograph 8, Department of Statistics
and Demography, University of Southern Denmark.
Iachine, I., Holm, N., Harris, J., Begun, A., Iachina, M., Laitinen, M.,
Kaprio, J., Yashin, A. (1998) How heritable is individual susceptibility
to death? The results of an analysis of survival data on Danish, Swedish
and Finnish twins. Twin Research 1, 196–205.
Keiding, N., Andersen, P., Klein, J. (1997) The role of frailty models and ac-
celerated failure time models in describing heterogeneity due to omitted
covariates. Statistics in Medicine 16, 215–224.
Keiding, N. (1998) Selection effects and nonproportional hazards in survival
models and models for repeated events. In: Proceedings of the 19th
International Biometric Conference, Cape Town, pp. 241–250.
Keiding, N., Andersen, P. (eds.) (2006) Survival and Event History Analysis.
Wiley, New York.
Khazaeli, A., Xiu, L., Curtsinger, J.W. (1995) Stress experiments as a means
of investigating age-specific mortality in Drosophila melanogaster.
Experimental Gerontology 30, 177–184.
Kheiri, S., Meshkani, M.R., Faghihzadeh, S. (2005) A correlated frailty model
for analysing risk factors in bilateral corneal graft rejection for Kerato-
conus: a Bayesian approach. Statistics in Medicine 24, 2681–2693.
Kheiri, S., Kimber, A., Meshkani, M.R. (2007) Bayesian analysis of an inverse
Gaussian correlated frailty model. Computational Statistics and Data
Analysis 51, 5317–5326.
Kimber, A.C. (1996) A Weibull-based score test for heterogeneity. Lifetime
Data Analysis 2, 63–71.
Kimber, A.C., Zhu, C.Q. (1999) Diagnostics for a Weibull frailty model.
In: Statistical Inference and Design of Experiments. U.J. Dixit, M.R.
Satam (eds.) Narosa Publishing House, New Delhi, pp. 36–46.
Klein, J.P., Moeschberger, M.L. (2003) Survival Analysis - Techniques for
Censored and Truncated Data. Springer, New York.
Klein, J.P. (1992) Semiparametric estimation of random effects using the
Cox model based on the EM algorithm. Biometrics 48, 795–806.
Klein, J.P., Moeschberger, M., Li, Y., Wang, S. (1992) Estimating random
effects in the Framingham Heart Study. In: Survival Analysis: State of
the Art. J. Klein, P. Goel (eds.), Kluwer, Dordrecht, pp. 99–120.
Klein, J.P., Pelz, C., Zhang, M.-J. (1999) Random effects for censored data
by a multivariate normal regression model. Biometrics 55, 497–506.
Komárek, A., Lesaffre, E., Legrand, C. (2007) Baseline and treatment effect
heterogeneity for survival times between centers using a random effects
accelerated failure time model with flexible error distribution. Statistics
in Medicine 26, 5457–5472.
Kondo, K. (1977) The log-normal distribution of the incubation time of
exogenous diseases. Japanese Journal of Human Genetics 21, 217–237.
Korsgaard, I.R., Andersen, A.H. (1998) The additive genetic gamma frailty
model. Scandinavian Journal of Statistics 25, 255–270.
Korsgaard, I.R., Madsen, P., Jensen, J. (1998) Bayesian inference in the
semiparametric log normal frailty model using Gibbs sampling.
Genetics, Selection, Evolution 30, 241–256.
Kortram, R.A., Lenstra, A.J., Ridder, G., van Rooij, A.C.M. (1995)
Constructive identification of the mixed proportional hazards model.
Statistica Neerlandica 49, 269–281.
Kosorok, M.R., Lee, B.L., Fine, J.P. (2004) Semiparametric inference for
proportional hazards frailty regression models. Annals of Statistics 32,
1448–1491.
Koziol, J.A., Green, S.B. (1976) A Cramer–von Mises statistic for randomly
censored data. Biometrika 63, 465–474.
Kuk, A.Y.C., Chen, C.-H. (1992) A mixture model combining logistic re-
gression with proportional hazards regression. Biometrika 79, 531–541.
Kuß, O., Blankenburg, T., Haerting, J. (2008) A relative survival model for
clustered responses. Biometrical Journal 50, 408–418.
Lam, K., Kuk, Y. (1997) A marginal approach to estimation in frailty models.
Journal of the American Statistical Association 92, 985–990.
Lam, K.F., Lee, Y.W., Leung, T.L. (2002) Modeling multivariate survival
data by a semiparametric random effects proportional odds model.
Biometrics 58, 316–323.
Lam, K., Lee, Y. (2004) Merits of modelling multivariate survival data using
random effects proportional odds model. Biometrical Journal 46, 331–342.
Lam, K., Fong, D., Tang, O. (2005) Estimating the proportion of cured
patients in a censored sample. Statistics in Medicine 24, 1865–1879.
Lambert, P., Collett, D., Kimber, A., Johnson, R. (2004) Parametric accel-
erated failure time models with random effects and an application to
kidney transplant survival. Statistics in Medicine 23, 3177–3192.
Lancaster, T. (1979) Econometric methods for the duration of unemploy-
ment. Econometrica 47, 939–956.
Lancaster, T., Nickell, S. (1980) The analysis of re-employment probabilities
for the unemployed. Journal of the Royal Statistical Society (A) 143,
141–165.
Lawless, J.F. (2002) Statistical Models and Methods for Lifetime Data. Wiley,
New York.
Lee, S., Lee, S. (2003) Testing heterogeneity for frailty distribution in shared
frailty model. Communications in Statistics, Theory and Methods 32,
2245–2253.
Lee, E.W., Wei, L.J., Amato, D.A. (1992) Cox-type regression analysis for
large numbers of small groups of correlated failure time observations.
In: Survival Analysis: State of the Art. J. Klein, P. Goel (eds.), Kluwer
Academic Publishers, Dordrecht, pp. 237–247.
Lee, S., Wolfe, R.A. (1998) A simple test for independent censoring under
the proportional hazards model. Biometrics 54, 1176–1182.
Legrand, C., Ducrocq, V., Janssen, P., Sylvester, R., Duchateau, L. (2005)
A Bayesian approach to jointly estimate centre and treatment by centre
heterogeneity in a proportional hazards model. Statistics in Medicine
24, 3789–3804.
Legrand, C., Duchateau, L., Sylvester, R., Janssen, P., van der Hage, J.,
van der Velde, C.J.H., Therasse, P. (2006) Heterogeneity in disease free
survival between centers: lessons learned from an EORTC breast cancer
trial. Clinical Trials 3, 10–18.
Legrand, C., Duchateau, L., Janssen, P., Ducrocq, V., Sylvester, R. (2008)
Validation of prognostic indices using the frailty model. Lifetime Data
Analysis 15, 59–78.
Li, H. (1999) The additive genetic gamma frailty model for linkage analysis
of age-of-onset variation. Annals of Human Genetics 63, 455–468.
Li, H. (2002) An additive genetic gamma frailty model for linkage analysis
of diseases with variable age of onset using nuclear families. Lifetime
Data Analysis 8, 315–334.
Li, C.-S., Taylor, J.M.G., Sy, J.P. (2001) Identifiability of cure models.
Statistics & Probability Letters 54, 389–395.
Li, Y., Betensky, R., Louis, D., Cairncross, J. (2002) The use of frailty hazard
models for unrecognized heterogeneity that interacts with treatment:
considerations of efficiency and power. Biometrics 58, 232–236.
Liang, K.-Y., Self, S., Bandeen-Roche, K., Zeger, S. (1995) Some recent
developments for regression analysis of multivariate failure time data.
Lifetime Data Analysis 1, 403–415.
Lichtenstein, P., de Faire, U., Floderus, B., Svartengren, M., Svedberg, P.,
Pedersen, N.L. (2002) The Swedish Twin Registry: a unique resource
for clinical, epidemiological and genetic studies. Journal of Internal
Medicine 252, 184–205.
Lillard, L.A., Brien, M.J., Waite, L.J. (1995) Premarital cohabitation and
subsequent marital dissolution: a matter of self-selection? Demography
32, 437–457.
Lillard, L.A., Panis, C. (2000) aML User’s Guide and Reference Manual.
Econ-Ware, Los Angeles, CA.
Lim, H.J., Liu, J., Melzer-Lange, M. (2007) Comparison of methods for ana-
lyzing recurrent events data: application to the emergency department
visits of firearm victims. Accident Analysis and Prevention 39, 290–299.
Lin, D.Y. (1994) Cox regression analysis of multivariate failure time data:
the marginal approach. Statistics in Medicine 13, 2233–2247.
Lin, D., Robins, J., Wei, L. (1996) Comparing two failure time distributions
in the presence of dependent censoring. Biometrika 83, 381–393.
Lindeboom, M., Van Den Berg, G.J. (1994) Heterogeneity in models for
bivariate survival: the importance of the mixing distribution. Journal
of the Royal Statistical Society (B) 56, 49–60.
Link, W.A. (1989) A model for informative censoring. Journal of the Ameri-
can Statistical Association 84, 749–752.
Liu, L., Wolfe, R.A., Huang, X. (2004) Shared frailty models for recurrent
events and a terminal event. Biometrics 60, 747–756.
Liu, L., Huang, X. (2008) The use of Gaussian quadrature for estimation in
frailty proportional hazard models. Statistics in Medicine 27, 2665–2683.
Locatelli, I. (2003) Frailty models for twins duration data: a Bayesian ap-
proach. Ph.D. thesis, University Luigi Bocconi, Milano.
Locatelli, I., Lichtenstein, P., Yashin, A.I. (2004) The heritability of breast
cancer: a Bayesian correlated frailty model applied to Swedish twins
data. Twin Research 7, 182–191.
Longini, I.M., Halloran, M.E. (1996) A frailty mixture model for estimating
vaccine efficacy. Applied Statistics 45, 165–173.
Ma, R., Krewski, D., Burnett, R.T. (2003) Random effects Cox models: a
Poisson modelling approach. Biometrika 90, 157–169.
Mallick, M., Ravishanker, N. (2006) PVF frailty models with a flexible base-
line hazard. International Journal of Statistics and Systems 1, 57–80.
Manda, S.O.M., Meyer, R. (2005) Bayesian inference for recurrent event data
using time-dependent frailty. Statistics in Medicine 24, 1263–1274.
Manton, K., Stallard, E., Vaupel, J. (1981) Methods for comparing mortality
experience of heterogeneous populations. Demography 18, 389–410.
Manton, K., Stallard, E., Vaupel, J. (1986) Alternative models for hetero-
geneity of mortality risks among the aged. Journal of the American
Statistical Association 81, 635–644.
Manton, K.G., Vaupel, J.W. (1995) Survival after the age of 80 in the United
States, Sweden, France, England, and Japan. New England Journal of
Medicine 333, 1232–1235.
Marenberg, M.E., Risch, N., Berkman, L.F., Floderus, B., de Faire, U. (1994)
Genetic susceptibility to death from coronary heart disease in a study
of twins. New England Journal of Medicine 330, 1041–1046.
Martinussen, T., Pipper, C.B. (2005) Estimation in the positive stable shared
frailty proportional hazards model. Lifetime Data Analysis 11, 99–115.
McGilchrist, C.A. (1993) REML estimation for survival models with frailty.
Biometrics 49, 221–225.
McGilchrist, C., Yau, K. (1996) Survival analysis with time dependent frailty
using a longitudinal model. Australian Journal of Statistics 38, 53–60.
McGue, M., Vaupel, J.W., Holm, N., Harvald, B. (1993) Longevity is moderately
heritable in a sample of Danish twins born 1870–1880. Journal
of Gerontology: Biological Sciences B 48, B237–B244.
Metropolis, N., Rosenbluth, A., Rosenbluth, M., Teller, A., Teller, E. (1953)
Equation of state calculations by fast computing machines. Journal of
Chemical Physics 21, 1087–1091.
Miller, R.G. (1981) Survival Analysis. Wiley & Sons, New York.
Moger, T.A., Aalen, O.O., Halvorsen, T.O., Storm, H.H., Tretli, S. (2004a)
Frailty modelling of testicular cancer incidence using Scandinavian data.
Biostatistics 5, 1–14.
Moger, T.A., Aalen, O.O., Heimdal, K., Gjessing, H.K. (2004b) Analysis of
testicular cancer data using a frailty model with familial dependence.
Statistics in Medicine 23, 617–632.
Moger, T.A., Aalen, O.O. (2005) A distribution for multivariate frailty based
on the compound Poisson distribution with random scale. Lifetime Data
Analysis 11, 41–59.
Morgan, M.V., Adams, G.G., Campain, A.C., Wright, F.A.C. (2005) Assess-
ing sealant retention using a Poisson frailty model. Community Dental
Health 22, 237–245.
Mueller, L.D., Drapeau, M.D., Adams, C.S., Hammerle, C.W., Doyal, K.M.,
Jazayeri, A.J., Ly, T., Beguwala, S.A., Mamidi, A.R., Rose, M.R. (2003)
Statistical tests of demographic heterogeneity theories. Experimental
Gerontology 38, 373–386.
Murphy, S.A. (1995) Asymptotic theory for the frailty model. Annals of
Statistics 23, 182–198.
Murphy, S.A., Rossini, A., van der Vaart, A.W. (1997) Maximum likelihood
estimation in the proportional odds model. Journal of the American
Statistical Association 92, 968–976.
Murthy, D.N.P., Xie, M., Jiang, R. (2003) Weibull Models. Wiley, New York.
Naylor, J., Smith, A. (1982) Applications of a method for the efficient com-
putation of posterior distribution. Applied Statistics 31, 214–225.
Neal, R. (1997) Markov chain Monte Carlo methods based on ‘slicing’ the
density function. Technical Report 9722, Department of Statistics,
University of Toronto, Canada.
Neale, M.C., Cardon, L.R. (1992) Methodology for Genetic Studies of Twins
and Families. Kluwer, Dordrecht.
Nelson, K.P., Lipsitz, S.R., Fitzmaurice, G.M., Ibrahim, J., Parzen, M.,
Strawderman, R. (2006) Use of the probability integral transformation
to fit nonlinear mixed-effects models with nonnormal random effects.
Journal of Computational and Graphical Statistics 15, 39–57.
Nielsen, G.G., Gill, R.D., Andersen, P.K., Sørensen, T.I.A. (1992) A counting
process approach to maximum likelihood estimation in frailty models.
Scandinavian Journal of Statistics 19, 25–43.
Noh, M., Ha, I.D., Lee, Y. (2006) Dispersion frailty models and HGLMs.
Statistics in Medicine 25, 1341–1354.
Oakes, D., Jeong, J.-H. (1998) Frailty models and rank tests. Lifetime Data
Analysis 4, 209–228.
Pan, W. (2001) Using frailties in the accelerated failure time model. Lifetime
Data Analysis 7, 55–64.
Petersen, L., Sørensen, T.I.A., Andersen, P.K. (2010) A shared frailty model
for case-cohort samples: Parent and offspring relations in an adoption
study. Statistics in Medicine 29, 924–931.
Peto, R., Lee, P.N., Paige, W.S. (1972) Statistical analysis of the bioassay
of continuous carcinogens. British Journal of Cancer 26, 258–261.
Pickles, A., Crouchley, R., Simonoff, E., Eaves, L., Meyer, J., Rutter, M.,
Hewitt, J., Silberg, J. (1994) Survival models for developmental genetic
data: age of onset of puberty and antisocial behavior in twins. Genetic
Epidemiology 11, 155–170.
Prentice, R.L., Williams, B.J., Peterson, A.V. (1981) On the regression anal-
ysis of multivariate failure time data. Biometrika 68, 373–379.
Price, D.L., Manatunga, A.K. (2001) Modelling survival data with a cured
fraction using frailty models. Statistics in Medicine 20, 1515–1527.
Qiou, Z., Ravishanker, N., Dey, D. (1999) Multivariate survival analysis with
positive stable frailties. Biometrics 55, 637–644.
Ries, L.A.G., Kosary, C.L., Hankey, B.F. (eds.) (1999) SEER Cancer Statis-
tics Review 1973-1999. National Cancer Institute, Bethesda, MD.
Rocha, C.S. (1996) Survival models for heterogeneity using the non-central
chi-squared distribution with zero degrees of freedom. In: Lifetime
Data: Models in Reliability and Survival Analysis. Jewell, N. et al.
(eds.) Kluwer Academic Publishers, Dordrecht, pp. 275–279.
Rondeau, V., Filleul, L., Joly, P. (2006) Nested frailty model using maximum
penalized likelihood estimation. Statistics in Medicine 25, 4036–4052.
Rondeau, V., Michiels, S., Liquet, B., Pignon, J.P. (2008) Investigating trial
and treatment heterogeneity in an individual patient data meta-analysis
of survival data by means of the penalized maximum likelihood ap-
proach. Statistics in Medicine 27, 1894–1910.
Rosenthal, T.C., Puck, S.M. (1999) Screening for genetic risk of breast can-
cer. American Family Physician 59, 99–104.
Sahu, S.K., Dey, D.K., Aslanidou, H., Sinha, D. (1997) A Weibull regression
model with gamma frailties for multivariate survival data. Lifetime Data
Analysis 3, 123–137.
Sahu, S.K., Dey, D.K. (2004) On a Bayesian multivariate survival model with
skewed frailty. In: Skew-Elliptical Distributions and Their Applications:
A Journey Beyond Normality. M. Genton (ed.), Chapman & Hall/CRC,
Boca Raton, FL, pp. 321–338.
Sankaran, P.G., Gleeja, V.L. (2008) Proportional reversed hazard and frailty
models. Metrika 68, 333–342.
Sartwell, P.E. (1966) The incubation period and the dynamics of infectious
disease. American Journal of Epidemiology 83, 204–216.
Sastry, N. (1997) A nested frailty model for survival data, with an applica-
tion to the study of child survival in northeast Brazil. Journal of the
American Statistical Association 92, 426–435.
Schnier, C., Hielm, S., Saloniemi, H.S. (2004) Comparison of the breeding
performance of cows in cold and warm loose-housing systems in Finland.
Preventive Veterinary Medicine 62, 135–151.
Vaida, F., Xu, R. (2000) Proportional hazards model with random effects.
Statistics in Medicine 19, 3309–3324.
van den Berg, G.J. (1992) Nonparametric tests for unobserved heterogeneity
in duration models. Working Paper, Free University of Amsterdam,
Amsterdam.
van den Berg, G.J. (2001) Duration models: specification, identification,
and multiple durations. In: Handbook of Econometrics. (Volume V)
J.J. Heckman, E. Leamer (eds.) North-Holland, Amsterdam.
van den Berg, G.J., Doblhammer-Reiter, G., Christensen, K. (2008) Being
born under adverse economic conditions leads to a higher cardiovascular
mortality rate later in life – evidence based on individuals born at dif-
ferent stages of the business cycle. Discussion paper 3635. Forschungs-
institut zur Zukunft der Arbeit.
Vaupel, J.W., Carey, J.R., Christensen, K., Johnson, T.E., Yashin, A.I.,
Holm, N.V., Iachine, I.A., Kannisto, V., Khazaeli, A.A., Liedo, P.,
Longo, V.D., Zeng, Y., Manton, K.G., Curtsinger, J.W. (1998) Biodemographic
trajectories of longevity. Science 280, 855–860.
Vaupel, J., Manton, K., Stallard, E. (1979) The impact of heterogeneity in
individual frailty on the dynamics of mortality. Demography 16, 439–454.
Vaupel, J.W., Yashin, A.I. (1985) Heterogeneity’s ruses: some surprising
effects of selection on population dynamics. The American Statistician
39, 176–185.
Verweij, P.J., van Houwelingen, H.C., Stijnen, T. (1998) A goodness-of-fit
test for Cox’s proportional hazards model based on martingale residuals.
Biometrics 54, 1517–1526.
Viswanathan, B., Manatunga, A. (2001) Diagnostic plots for assessing the
frailty distribution in multivariate survival data. Lifetime Data Analysis
7, 143–155.
von Bortkiewicz, L. (1898) Das Gesetz der kleinen Zahlen. Teubner, Leipzig.
Vu, H.T.V., Knuiman, M.W. (2002) A hybrid ML-EM algorithm for calcula-
tion of maximum likelihood estimates in semiparametric shared frailty
models. Computational Statistics and Data Analysis 40, 173–187.
Vu, H.T.V. (2003) Parametric and semiparametric conditional shared gamma
frailty models with events before study entry. Communications in Statis-
tics, Simulation and Computation 32, 1223–1248.
Vu, H.T.V. (2004) Estimation in semiparametric conditional shared frailty
models with events before study entry. Computational Statistics and
Data Analysis 45, 621–637.
Warwick, J., Tabar, L., Vitak, B., Duffy, S.W. (2004) Time-dependent effects
on survival in breast carcinoma. Cancer 100, 1331–1336.
Wienke, A., Holm, N., Skytthe, A., Yashin, A.I. (2001) The heritability of
mortality due to heart diseases: a correlated frailty model applied to
Danish twins. Twin Research 4, 266–274.
Wienke, A., Christensen, K., Skytthe, A., Yashin, A.I. (2002) Genetic anal-
ysis of cause of death in a mixture model with bivariate lifetime data.
Statistical Modelling 2, 89–102.
Wienke, A., Lichtenstein, P., Yashin, A.I. (2003a) A bivariate frailty model
with a cure fraction for modeling familial correlations in diseases.
Biometrics 59, 1178–1183.
Wienke, A., Holm, N., Christensen, K., Skytthe, A., Vaupel, J., Yashin,
A.I. (2003b) The heritability of cause-specific mortality: a correlated
gamma-frailty model applied to mortality due to respiratory diseases in
Danish twins born 1870–1930. Statistics in Medicine 22, 3873–3887.
Wienke, A., Herskind, A., Christensen, K., Skytthe, A., Yashin, A. (2005a)
The heritability of CHD mortality in Danish twins after controlling for
smoking and BMI. Twin Research and Human Genetics 8, 53–59.
Wienke, A., Arbeev, K., Locatelli, I., Yashin, A.I. (2005b) A comparison
of different correlated frailty models and estimation strategies.
Mathematical Biosciences 198, 1–13.
Wienke, A., Locatelli, I., Yashin, A.I. (2006a) The modelling of a cure fraction
in bivariate time-to-event data. Austrian Journal of Statistics 35, 67–76.
Wienke, A., Lichtenstein, P., Czene, K., Yashin, A.I. (2006b) The role of
correlated frailty models in studies of human health, ageing and longe-
vity. In: Applications to Cancer and AIDS Studies, Genome Sequence
Analysis, and Survival Analysis. N. Balakrishnan, J. Auget, M. Mesbah,
G. Molenberghs (eds.), Birkhäuser, Boston, pp. 151–166.
Wienke, A., Ripatti, S., Palmgren, J., Yashin, A.I. (2010) A bivariate survival
model with compound Poisson frailty. Statistics in Medicine 29, 275–283.
Wong, M.C.M., Lam, K.F., Lo, E.C.M. (2006) Multilevel modelling of clus-
tered grouped survival data using Cox regression model: an application
to ART dental restorations. Statistics in Medicine 25, 447–457.
Wu, D., Rea, S.L., Yashin, A.I., Johnson, T.E. (2006) Visualizing hidden het-
erogeneity in isogenic populations of C. elegans. Experimental Geron-
tology 41, 261–270.
Xu, L., Zhang, J. (2010) An EM-like algorithm for the semiparametric ac-
celerated failure time gamma frailty model. Computational Statistics
and Data Analysis 54, 1467–1474.
Xue, X., Brookmeyer, R. (1996) Bivariate frailty model for the analysis of
multivariate survival time. Lifetime Data Analysis 2, 277–290.
Xue, X., Ding, Y. (1999) Assessing heterogeneity and correlation of paired
failure times with the bivariate frailty model. Statistics in Medicine 18,
907–918.
Yamaguchi, T., Ohashi, Y. (1999) Investigating centre effects in a multi-
centre clinical trial of superficial bladder cancer. Statistics in Medicine
18, 1961–1971.
Yamaguchi, T., Ohashi, Y., Matsuyama, Y. (2002) Proportional hazard mod-
els with random effects to examine centre effects in multicentre cancer
clinical trials. Statistical Methods in Medical Research 11, 221–236.
Yashin, A.I., Begun, A., Iachine, I.A. (1999a) Genetic factors in susceptibility
to death: a comparative analysis of bivariate survival models. Journal
of Epidemiology and Biostatistics 4, 53–60.
Yashin, A.I., Herskind, A.-M., Begun, A.Z., Iachine, I.A. (1999b) Survival
models for genetic analysis with random effects depending on observed
covariates. Unpublished draft.
Yashin, A.I., Iachine, I.A. (1995a) Genetic analysis of durations: Correlated
frailty model applied to survival of Danish twins. Genetic Epidemiology
12, 529–538.
Yashin, A.I., Iachine, I.A. (1995b) Survival of related individuals: an
extension of some fundamental results of heterogeneity analysis.
Mathematical Population Studies 5, 321–339.
Yashin, A.I., Iachine, I.A. (1996) Random effect models of bivariate survival:
quadratic hazard as a new alternative. In: Transactions of Symposium
i Anvendt Statistik. G. Kristensen (ed.), Institute of Economy, Odense
University, pp. 87–101.
Yashin, A.I., Iachine, I.A. (1997) How frailty models can be used for evaluat-
ing longevity limits: taking advantage of an interdisciplinary approach.
Demography 34, 31–48.
Yashin, A.I., Iachine, I. (1999a) Dependent hazards in multivariate survival
problems. Journal of Multivariate Analysis 71, 241–261.
Yashin, A.I., Iachine, I. (1999b) What difference does the dependence between
durations make? Insights for population studies of aging. Lifetime Data
Analysis 5, 5–22.
Yashin, A.I., Manton, K.G. (1997) Effects of unobserved and partially ob-
served covariate processes on system failure: a review of models and
estimation strategies. Statistical Science 12, 20–34.
Yashin, A.I., Manton, K.G., Iachine, I.A. (1996) Genetic and environmental
factors in duration studies: multivariate frailty models and estimation
strategies. Journal of Epidemiology and Biostatistics 1, 115–120.
Yashin, A.I., Vaupel, J.W., Iachine, I.A. (1993) Correlated individual frailty:
an advantageous approach to survival analysis of bivariate data.
Working Paper: Population Studies of Aging 7, Odense University.
Yashin, A.I., Vaupel, J.W., Iachine, I.A. (1995) Correlated individual frailty:
an advantageous approach to survival analysis of bivariate data.
Mathematical Population Studies 5, 145–159.
Yau, K., McGilchrist, C. (1997) Use of generalised linear mixed models for
the analysis of clustered survival data. Biometrical Journal 39, 3–11.
Yau, K.K.W. (2001) Multilevel models for survival analysis with random
effects. Biometrics 57, 96–102.
Yin, G., Ibrahim, J.G. (2005) A class of Bayesian shared gamma frailty
models with multivariate failure time data. Biometrics 61, 208–216.
Yu, B., Peng, Y. (2008) Mixture cure models for multivariate survival data.
Computational Statistics and Data Analysis 52, 1524–1532.
Zahl, P. (1997) Frailty modelling for the excess hazard. Statistics in Medicine
16, 1573–1585.
Zdravkovic, S., Wienke, A., Pedersen, N.L., Marenberg, M.E., Yashin, A.I.,
de Faire, U. (2002) Heritability of death from coronary heart disease: a
36 years follow-up of 20,966 Swedish twins. Journal of Internal Medicine
252, 247–254.
Zdravkovic, S., Wienke, A., Pedersen, N.L., Marenberg, M.E., Yashin, A.I.,
de Faire, U. (2004) Genetic influences on CHD-death and the impact of
known risk factors: comparison of two frailty models. Behavior Genetics
34, 585–591.
Zhang, J., Peng, Y. (2007) An alternative estimation method for the ac-
celerated failure time frailty model. Computational Statistics and Data
Analysis 51, 4413–4423.
Zheng, M., Klein, J.P. (1995) Estimates of marginal survival for dependent
competing risks based on assumed copula. Biometrika 82, 127–138.
Zhong, X., Li, H. (2004) Score tests of genetic association in the presence of
linkage based on the additive genetic gamma frailty model. Biostatistics
5, 307–327.