
Prognosis Research and Risk of Bias

The document discusses the growing interest in prognosis research and highlights the unsatisfactory quality of published studies due to incomplete methodologies. It identifies seven major domains for risk of bias in prognosis research and emphasizes the importance of proper study design, participant selection, and transparent reporting to improve the reliability and applicability of prognostic models. The authors advocate for rigorous validation and assessment of the impact of prognostic tools on clinical practice to enhance their utility in patient management.


Intern Emerg Med (2016) 11:251–260

DOI 10.1007/s11739-016-1404-z

CE - COCHRANE’S CORNER

Prognosis research and risk of bias


Gennaro D’Amico1 • Giuseppe Malizia1 • Mario D’Amico2

Received: 25 December 2015 / Accepted: 28 January 2016 / Published online: 24 February 2016
© SIMI 2016

Abstract The interest in prognosis research has been steadily growing during the past few decades because of its impact on clinical decision making. However, since the methodology of prognosis research is still incompletely defined, the quality of published prognosis studies is largely unsatisfactory. Seven major domains for risk of bias in prognosis research have been identified, including study participation, attrition, selection of candidate predictors, outcome definition, confounding factors, analysis, and interpretation of results. The methodology for performing prognostic studies is currently aimed at avoiding such potential biases. Amongst methodologic requirements in prognosis research, the following should be considered most relevant: prior publication of the study protocol including the full statistical plan; inclusion of patients at a similar point along the course of the disease; rationale and biological plausibility of candidate predictors; complete information; control of overfitting and underfitting; adequate data handling and analysis; publication of the original data. Validation, and analysis of the impact that prediction models have on patient management, are key steps for translation of prognosis research into clinical practice. Finally, transparent reporting of prognostic studies is essential for assessing reliability, applicability and generalizability of study results, and recommendations are now available for this aim.

Keywords Prognosis research · Prognostic model · Validation study · Impact analysis · Prediction rule · Prognostic indicator

Correspondence: Gennaro D’Amico, [email protected]
1 Gastroenterology Unit, V Cervello Hospital, Ospedale V. Cervello, Via Trabucco 180, Palermo, Italy
2 Radiology Section, DIBIMED, University of Palermo, Via del Vespro 129, 90127 Palermo, Italy

Introduction

Prognosis refers to the probability of a particular outcome over time in an individual patient, and is essential for clinical reasoning. It is, in fact, prognostic information that allows an understanding of the importance of diagnosis and the relevance of treatment.

Outcome prediction is based on individual clinical or non-clinical characteristics, and on their combination with specific disease characteristics. Outcomes are often major clinical events such as death or disease complications, but they may also be measures of disease progression, or quality of life. The patients’ characteristics associated with the outcome of interest are termed prognostic indicators, or predictors, and may be classified as causal factors or predictive factors. A causal factor is a potential cause of the outcome of interest, whereas a predictive factor is associated with the outcome without being a cause of the outcome. Therefore, all causal factors are predictive, but not all predictive factors are causal. The combination of several predictors in a prediction model may improve prediction accuracy.

Prognosis research is aimed at assessing outcome probabilities and relevant predictors. The methodology of prognosis research is less settled than the methodology of diagnosis research or treatment research, and several reviews have shown that the quality of published prognosis research is not satisfactory [1, 2]. The major requirements of prognosis research are the inclusion, at a well recognizable point along the disease course, of patients


representative of the target population, clear and reproducible definitions of predictors and outcome, completeness of information and follow-up, appropriate analysis, validation of results in independent patient samples, and assessment of the impact of prognostic information on the individual patient management.

An impressive number of articles reporting prognosis studies is published every year in medical journals (Fig. 1). However, few prognostic scores are validated, and still fewer have a convincing impact on clinical practice [3]. As an example, 41 prognostic scores for mortality in cirrhosis have been published since 1980 (Fig. 2), although only the Child-Pugh [4] and the MELD [5] scores are currently used in clinical practice.

Fig. 1 Number of articles reporting on prognosis studies published between 2000 and 2011, retrieved in PubMed using the search terms Prognosis or Prognostic in the title

Fig. 2 Cumulative number of prognostic scores for prediction of the risk of death in cirrhosis, published after 1980 (x-axis: year of publication, 1980 to 2010; y-axis: cumulative number of studies)

The wide variability in outcome rate across published prognosis studies of the same clinical condition may be, at least in part, explained by random variation. However, imperfect methodology and inadequate reporting are likely among the major determinants of the variability of prognosis studies, and hence of their poor applicability [1]: in a systematic review of prognosis studies in cirrhosis [6] only 17 of 118 studies included an inception cohort, and only 20 reported on missing information. Validity of suggested prognostic indicators or prognostic scores and their incremental prognostic efficiency (with respect to the already known indicators) is rarely tested. Agreement on the definition of the predicted outcomes is frequently insufficient, and subjectivity may become a major source of variability. Finally, the clinical impact of prognosis research is rarely assessed, which makes its value questionable. There is therefore a great need for good quality and well reported prognosis research.

In this article, the major methodological requirements to avoid biases in performing and interpreting prognosis studies are summarized (Box 1).

Study design and bias

Major potential areas of bias for prognosis studies have been identified [1], and should be thoroughly accounted for when planning prognosis studies: (1) participation: when the study sample is not representative of the population of interest on key characteristics; (2) attrition: when loss to follow-up or missing information hampers a reliable assessment of outcome probability or of the relationship between key characteristics (or candidate prognostic indicators) and the outcome of interest; (3) prognostic factor measurement: if prognostic factors are not adequately measured; (4) outcome measurement: when the outcome of interest is not adequately measured; (5) confounding measurement and interpretation: when important potential confounders are not appropriately accounted for; (6) analysis: if the statistical analysis is not appropriate to the study design; (7) interpretation and reporting: if the results are interpreted wrongly or with bias, or the reporting is insufficient [7]. The methodology of prognosis research has


been developed to try to eliminate or minimize these potential biases.

The best design for prognosis research is a prospective cohort study because it allows optimal measurement of baseline characteristics, predictors and outcome [8]. Retrospective cohort studies minimize study duration at the expense of poorer data because of missing information [8]. Case–control studies are sometimes used, but they do not allow estimation of absolute risks because the source population is of unknown size, with the exception of case–control studies nested in a cohort of known size [9]. Data from untreated or control arms of randomized clinical trials (RCTs) can also allow prognosis assessment with appropriate analyses; both arms may be used when the treatment effect is not significant. However, generalizability of prognostic information derived from RCTs may be limited by strict inclusion criteria.

The study design should be thoroughly described in a detailed study protocol developed on the basis of a well defined aim of the study, and should include a thorough statistical analysis plan. Importantly, the study protocol should be published on an official website just as is required for clinical trials. This would make the overall study flow, and the process of avoiding biases, more transparent.

Box 1 Major methodological issues in prognosis research

Item: Comment

Study design: Prospective cohort studies allow optimal measurement of baseline characteristics, predictors and outcome. In retrospective studies information is frequently incomplete. Case–control studies do not allow for estimation of absolute risks if not nested in a cohort of known size
Study population: Should be representative of the population to whom the study results are intended to be applied in clinical practice. Selective inclusion results in biased conclusions. Inclusion of consecutive patients is the best prevention of selection bias
Time zero: All patients should be included at a similar point along the course of the disease
Complete information: Missing information is a frequent source of bias. A prospective study design is the best way to prevent missing information
Candidate predictors: Should include those already known and new candidates based on clinically or biologically plausible hypotheses. The number of included variables should be no more than 1 every 10 observed outcomes to avoid overfitting
Outcome: Should be clinically relevant, clearly defined and easily reproducible
Analysis: The analysis protocol should be a priori set. Logistic and proportional hazards models are most frequently used. Univariable analysis is not recommended as the sole variable reduction method. Clinical judgement and biological plausibility should also guide variable selection
Performance: Discrimination and calibration of the proposed prognostic model should be assessed in an independent patient sample
Impact analysis: The impact of new prognostic tools in clinical practice should be prospectively assessed in terms of cost-benefit
Interpretation: Interpretation should disclose any potential limiting factor of the study, address any overoptimistic evaluation of results, and account for any potential confounder. The incremental value of the new proposed prognostic tool over the previous ones should also be assessed
Reporting: Prognosis research should be reported according to the "transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD)" initiative recommendations

Participation (study population)

Study population should be representative of the target population

The population of interest should be clearly described according to clinical and demographic characteristics to allow applicability of the study results to the target population in clinical practice. A well known risk of bias in patient recruitment is selection bias, occurring when patients with certain characteristics have a higher probability of being included in the study. The most widely accepted method to minimize selection bias is to include consecutively observed patients. Consecutive patient inclusion should be documented, and follow-up of excluded patients should be available to validate the exclusion criteria.

The setting of patient recruitment should be well described to improve the applicability of results. In fact, tertiary care centers usually observe patients with more severe conditions and worse prognosis than those observed in a primary care setting. As an example, cirrhotic patients mostly observed in ambulatory clinics had a significantly better survival than patients mostly observed in hospital in two studies performed at our unit [10, 11] (Fig. 3). Similar results have been recently reported in a large population based study in the UK


[12]. When the results of predictive studies recruiting patients in a specific clinical setting are extended to other settings, this may generate a bias termed referral bias.

Participants should be enrolled at a similar point (inception point, the time zero for the observation period) along the course of the disease

The time point along the course of the disease from which the outcome prediction is made (time zero) should be clearly defined, and should allow the inclusion of a patient sample as homogeneous as possible. It should also be easily recognizable in clinical practice to allow wide applicability. As an example, if the study population includes patients with both compensated and decompensated disease, the study results will most likely apply to a mixed population, similar to the study sample. However, in clinical practice doctors will have to predict the outcome in individual patients who will present either with a compensated or with a decompensated disease, and the probabilities of the outcome of interest are expected to be different according to the disease stage.

Fig. 3 Survival probabilities of two cohorts of patients with cirrhosis [9, 10] recruited at the same Unit. First study: patients observed mostly in hospital; second study: patients mostly observed in an ambulatory setting (x-axis: months, 0 to 120; y-axis: survival probability; curves: first study, 1155 patients; second study, 494 patients)

Fig. 4 Different time zero for inclusion of patients in two cohort studies of cirrhosis performed at the same Unit and yielding significantly different survival time [9, 10]. In the first study, mostly hospitalized patients with compensated cirrhosis were included. In the second study patients with compensated cirrhosis were included at an earlier disease stage, observed as out-patients. The time zero for the observation period along the disease course was substantially shifted to the left in the second study, which showed a longer survival for patients with compensated cirrhosis (diagram labels, course of cirrhosis: preclinical compensated cirrhosis, out-patients; clinical compensated cirrhosis, in-patients; decompensated cirrhosis; OLT; death; time zero marked separately for the second and the first study)

The importance of time zero in prognosis research may be illustrated by the two studies of the prognosis of cirrhosis performed at our Unit, reported above [10, 11]. The first [11] included 1155 patients between 1974 and 1978, the second [10] included 494 patients between 1981 and 1984. Ten-year survival is significantly better in the second one (Fig. 3). The explanation for this variation in the observed survival is the different disease stage. The first study included patients admitted to the hospital for further assessment because of clear clinical evidence of cirrhosis. The second one included patients observed because of hypertransaminasemia who underwent a liver biopsy to detect those with chronic active hepatitis to be treated (steroids at that time). Patients with cirrhosis did not undergo any specific treatment and were included in the study: the second study therefore included patients with cirrhosis diagnosed in an earlier disease stage compared to those included in the first study. The effect of inclusion in the study at an earlier stage of the disease is clearly evident when comparing survival of compensated cirrhosis in the two studies (Fig. 4). Notably, the earlier stage of disease in the second study was not clinically evident because of the silent course of the disease in the compensated stage. By contrast, when comparing survival of patients with a first diagnosis of ascites in the two studies, no differences are detected (Fig. 5), suggesting that appearance of ascites is a well recognizable time zero or inception point, and


allows the inclusion of homogeneous patients in prognosis studies of cirrhosis.

Fig. 5 Survival of patients with or without ascites on admission in two cohort studies of cirrhosis performed at the same Unit [9, 10]. In the first study survival was shorter in patients without ascites because they were admitted at a more advanced stage in the disease course: this was not evident at the beginning of the two studies because the patients were asymptomatic. However, patients with a new diagnosis of ascites at inclusion had a similar survival in both studies, suggesting that the appearance of ascites may be considered a reliable inception point along the course of cirrhosis, allowing for the inclusion of homogeneous patients (x-axis: months, 0 to 120; y-axis: survival probability; curves: first and second study, with and without ascites at diagnosis)

In prognosis studies, the inclusion of patients at a well defined inception point in the disease course improves the homogeneity of observations and the applicability of the study results.

Attrition (complete information)

Information on baseline characteristics, candidate prognostic indicators, confounders, follow-up and outcome should be complete enough to allow for thorough evaluation of the true association between prognostic indicators and the outcome of interest.

Completeness of follow-up is essential, and the reasons for loss to follow-up as well as the characteristics of drop-outs should be known to allow exploration of any association between key characteristics and loss to follow-up. In fact, if loss to follow-up does not occur randomly and instead is systematically associated with patients’ characteristics, this might also imply a systematic association of loss to follow-up with patients’ outcome, thus generating biased study conclusions. This type of bias is known as attrition bias.

Missing values are a common issue in prognosis research, and may variously affect the study result because they are usually related to other information, including the outcome of interest. Excluding patients with missing information may lead to biased conclusions if missing information is related to important prognostic indicators or to the outcome. Several imputation models allow for substitution of the missing value with the most likely one. However, the best way to deal with missing values is to prevent their occurrence in prospective studies [13].

Candidate prognostic factors selection and measurement

Candidate predictors should be a priori set and not data derived: they should be clearly defined, and a description of methods to measure predictors should be provided, with sufficient details and in a transparent way so as to allow one to reproduce their use in clinical practice as closely as possible to their use in the study population. Predictors with unreliable measurement methods should not be assessed in a prognosis study. It is also important that the time when candidate predictors are measured be closely related to the time span over which the outcome is intended to be predicted. As an example, if ascites is a candidate predictor in a prognosis study of cirrhosis, its presence or absence should be verified at the study beginning. Moreover, if ascites is present, the time when it was first detected should be specified. It is in fact conceivable that patients with a new diagnosis of ascites may have a different life expectancy compared to patients with ascites developed months or years before. In the same way, if a given laboratory variable is assessed as a prognostic indicator, its value should be assessed at the beginning of the observation period or within a short and specified time period. Variables assessed after the study start should be more appropriately considered as outcome variables: if variables assessed during the observation period have to be assessed for their prognostic value, they should be assessed as time dependent variables. Theoretically, all variables


potentially associated with the outcome on the basis of previous knowledge and clinically or biologically plausible hypotheses could be included even if the association is not credited to be a causal one [14]. In retrospective studies, assessment and recording of candidate prognostic factors should be blinded to the outcome [1, 13] and vice versa.

As a rule of thumb, the number of candidate prognostic variables should be less than one for every ten outcome events observed in the study [15] to avoid model overfitting [16]. Appropriate methods to reduce the number of predictors are available [15]. However, underfitting should also be avoided. Underfitting occurs when an insufficient number of outcome events does not provide enough power to detect important associations, thus omitting potentially important variables [16].

Outcome

In prognosis studies, outcomes usually include death or some non fatal disease complication (for example occurrence of ascites or upper digestive bleeding in cirrhosis, myocardial infarction, cancer recurrence) or patient-centered outcomes, such as symptoms, functional status or quality of life. The outcome to be predicted has to be clinically relevant, clearly defined and easily reproducible, including the time when it is measured, to minimize inter-observer variation. The outcome and method of measurement should be adequately validated to avoid patient misclassification. When needed, technical details to detect the outcome should be specified, even when it may seem obvious: as an example, if the development of ascites in a prognosis study of cirrhosis is the outcome of interest, the method to detect ascites should be reported. In fact, ascites may be detected on physical examination, but small amounts of ascites may only be detected on ultrasound examination of the abdomen, and the sensitivity of ultrasound studies to detect ascites may be different according to specific maneuvers or technique.

Outcome assessment should be blinded to the candidate prognostic factors [1, 17]. This is particularly important when the outcome is not death, or whenever the outcome definition includes any subjective judgement. The reason is that the knowledge of predictors may influence the outcome adjudication. Methods to blind the outcome assessment should also be clearly reported.

Statistical analysis

Correct handling of candidate prognostic variables is important to achieve unbiased results. As an example, categorization of continuous variables is commonly done to simplify the analysis and the graphical presentation of results, and to facilitate their use in clinical practice. However, transforming a continuous predictor into a discrete one should be avoided because information is lost when it is categorized: this is usually based on arbitrary or, even worse, data driven cut-off points, with the result of losing information on the true relationship between the variable and the outcome (the fewer the chosen categories, the greater the amount of lost information) [18]. Moreover, the process of selecting the best cut-off points by data analysis will produce overoptimistic results with unsatisfactory performance in validation studies.

Keeping variables continuous allows a full exploration of the true relationship between the variable and the outcome, although linearity of the relationship should be checked, and appropriate transformations (e.g. fractional polynomial) may be used to allow for some non-linearity [17].

The statistical analysis should be appropriate for the study design, limiting the risk of reaching invalid results: the logistic (binary outcomes) and the proportional hazards (time to event outcome) models are most frequently used [14, 17]. Competing risks analysis should be used whenever a competing outcome may hamper the occurrence of the outcome of interest: as an example, when assessing the death risk in liver cirrhosis, liver transplant is a competing outcome because it reduces the occurrence of death. In this situation the usual analyses of time to event, based on the Kaplan–Meier method for assessing risks and on the proportional hazards Cox model for adjusted hazard ratios, should not be used because the assumptions for correct censoring (non relevance to the outcome and independence from the outcome) are not met [19]. In this condition, the competing risks analysis (or the Cumulative Incidence Function, CIF [20]) allows for an unbiased estimate of the risk, and the proportional hazards model for competing risks, the Fine and Gray model [21], allows for multivariable analyses and for deriving prognostic models.

Two major analysis strategies are used to arrive at the final prognostic model, although there is no consensus on which is preferable: the full model and the predictor selection strategy [17, 22, 23]. Presentation of data should be planned to allow for assessment of consistency with results.

In the full model approach, all the candidate predictors are included in the multivariable analysis: an advantage of this approach is that it reduces the risk of predictor selection bias (inclusion of spurious predictors in the final model) and overfitting (reduced number of statistical tests). This technique is, however, difficult to apply when the number of events is limited.

The predictor selection strategy is based on the sequential exclusion of candidate predictors that do not


contribute usefully to the multivariable model. This may be The baseline probability or risk of developing the
achieved by a backward procedure starting from the full outcome of interest is also provided by the multivari-
model, and removing at each step the least contributive able analysis in prognosis research. This parameter
variable, or by a forward procedure, by adding at each step estimates the risk for an individual with all predictor
a variable that improves the model significance. Forward values being zero. To calculate the predicted probability
selection has the disadvantage of not allowing a simulta- or the risk for an individual patient in a given time
neous assessment of all the variables, while it has the period, the baseline risk is combined with the observed
advantage of not including correlated variables that instead values of the predictors and the corresponding regres-
may remain in the model by the backward selection. The sion coefficient in a mathematical function specific to
full model approach may decrease the risk of overfitting, the analysis used (most frequently Cox or logistic
but it is often impractical to include all the variables. On models) [17].
the other hand, backward or forward selection imply
repeating many statistical tests and increase the risk of
overfitting. However, selected predictor variables with very Assessing performance of the prognostic model
small P values (say, \0.001) are much less prone to
selection bias and overfitting than weak predictors. The performance of a prognostic model may be assessed in
Whichever selection strategy is used, it is not recom- terms of calibration and discrimination. Calibration deals
mended to select variables solely on the base of univariable with the precision of a model, or how much the predicted
analysis, which may inform on the association of individual probability that the outcome event will occur is close to the
predictors with the outcome but not on the interplay of observed occurrence rate [24]. It is ideally assessed
different predictors: clinical judgement and biological graphically by plotting the observed frequencies of out-
plausibility should always be the underlying guide to the come against the mean predicted probability within sub-
predictor selection criteria [17, 22, 23]. groups of participants (say deciles). The Hosmer and
Lemeshow test may provide formal statistical assessment
of the correspondence between predicted and observed
Confounding probabilities [24–26].
Discrimination is the ability of the model to discriminate
Confounding may be generated when one variable is associated between individuals with and without the outcome [24, 25].
to another variable as well as to the outcome. In this situation, It measures the chance that the model will assign a higher
the prognostic value of the variable truly associated with the probability of outcome to the patients who actually will
outcome may be erroneously attributed to the other variable. Important potential confounders should be adequately accounted for. Their identification is primarily based on previous knowledge from medical literature and from clinical judgement. Definitions, modality of measurement and handling of missing information should follow the same recommendations for confounders as for candidate prognostic indicators. Confounders should be properly accounted for by appropriate statistical analyses, and should include any treatment potentially affecting the outcome of interest [17, 23].

Prognostic model

The prognostic model is derived from the significant variables in the final multivariable analysis. The analysis provides the regression coefficient for each predictor in the model. The regression coefficients quantify the contribution of each predictor to the outcome of interest: they provide a measure of the effect of a one-unit increase in the level of the predictor on the outcome risk when the other predictors in the model are kept constant.

develop the outcome than to those who will not, and can be estimated both for logistic models and for survival models [25]. This is typically assessed by the area under the receiver operating characteristic (ROC) curve or the equivalent concordance index (c-index). It can be defined as the proportion of patient pairs for which predictions and outcomes are concordant. Versions of the c-index allowing for censoring, and thus suitable for survival analysis, have also been developed [17, 25].

More recently, further performance measures have been introduced, mainly to assess the added (or incremental) value of a new predictor, or a new predictive model, with respect to the predictors or predictive model already in use. Among these, one frequently used is the net reclassification improvement index (NRI) [16, 25], which is based on the assessment of the number of correct reclassifications of patients at high or low risk with the new prognostic marker or prognostic score.

In general, however, the performance of a prognostic model is usually satisfactory in the patient sample where the model has been developed, but may be unsatisfactory in independent samples.

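The reclassification idea behind the NRI, described above, can be sketched for the simplest case of a single high/low risk threshold. The article gives no formula, so the two-category formulation used here (net proportion of patients with the outcome moved up, plus net proportion of patients without the outcome moved down) is an assumption, as are the function name, the 0.5 threshold and the data.

```python
def net_reclassification_improvement(old_risks, new_risks, outcomes, threshold):
    """Two-category NRI for one high/low risk threshold (illustrative sketch)."""
    up_event = down_event = n_event = 0
    up_nonevent = down_nonevent = n_nonevent = 0
    for old, new, event in zip(old_risks, new_risks, outcomes):
        old_high = old >= threshold
        new_high = new >= threshold
        moved_up = new_high and not old_high
        moved_down = old_high and not new_high
        if event:
            n_event += 1
            up_event += moved_up      # correct move for a patient with the outcome
            down_event += moved_down  # incorrect move
        else:
            n_nonevent += 1
            up_nonevent += moved_up      # incorrect move for a patient without it
            down_nonevent += moved_down  # correct move
    if n_event == 0 or n_nonevent == 0:
        raise ValueError("need patients both with and without the outcome")
    nri_events = (up_event - down_event) / n_event
    nri_nonevents = (down_nonevent - up_nonevent) / n_nonevent
    return nri_events + nri_nonevents


# Hypothetical predicted risks from the old and the new model.
old = [0.30, 0.60, 0.40, 0.70, 0.20, 0.10]
new = [0.55, 0.65, 0.45, 0.40, 0.25, 0.15]
had_outcome = [1, 1, 1, 0, 0, 0]
print(net_reclassification_improvement(old, new, had_outcome, threshold=0.5))
```

In this toy example one of three patients with the outcome is correctly moved to high risk and one of three without the outcome is correctly moved to low risk, so the NRI is 1/3 + 1/3. The result depends on the chosen threshold, which is why the threshold should be clinically motivated rather than picked after seeing the data.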

Internal validation

Bootstrapping techniques may help to reduce the optimism of model performance in the dataset in which the model has been derived. By randomly drawing, with replacement, a number of samples (usually 100–500) from the patients included in the derivation sample, bootstrapping mimics the variability that may be encountered when applying the derived model in different patient samples [17, 23]. This kind of analysis may produce two major results: first, the coefficients of the predictors in the model may be adjusted according to the different estimates in the random samples, so-called coefficient shrinkage [17, 23]; second, the variation of the c-statistic for discrimination in the random samples may provide a measure of the optimism of the model in the derivation sample, and the average c-statistic may provide a less optimistic assessment of the model performance in the derivation sample.

Clearly, the larger the derivation sample, the more reliable the study results and hence the smaller the optimism reduction achieved by bootstrapping. However, even when a satisfactory internal validation has been achieved, this can never be considered a substitute for external validation, which is instead the true validation of prognostic models.

External validation

To be useful in clinical practice, a prognostic model has to be reproducible or valid in patient samples independent of the derivation sample [25]. External validation aims to assess the performance of the new prognostic score in new individuals not included in the derivation study. Subjects included in validation studies should be at risk of developing the same outcome for which the prognostic model has been derived, but may be sampled in different clinical situations or settings: the more these situations differ from the derivation study, the more generalizable the prognostic score will prove to be. To assess the performance of the new prognostic score, its value is calculated for each patient included in the validation study. The predicted probability of the outcome is then compared with the observed outcome occurrence, and the performance of the model is reported in terms of discrimination, calibration and reclassification.

A hierarchy of validation studies has been proposed [24]. Main validation areas are temporal, geographical and domain validation [26, 27]. In temporal validation, performance of the new prognostic score is assessed in a later time period in the same institution(s) where the model was developed. In geographical validation studies, the model is applied to patients observed in centers not involved in the derivation studies or in different countries. In these studies the likely differences in case mix, predictors and outcome measurements also provide an assessment of the generalizability of the model. A wider proof of validity of a prognostic model is domain validation, in which the performance of the new model is assessed in very different types of patients. Examples of domain validation are the use of a model developed in secondary care in patients observed in a primary care setting, or a model developed in adults validated in children.

Updating a model

Using a model in a setting different from that where it was developed may result in poor performance. In this case, updating the model or recalibrating it for a different setting or clinical condition may be more appropriate than developing a new one, because the development of a new model may lose the scientific information conveyed in the previous model. Updating a model may be achieved by the simple adjustment of baseline risk, or by re-estimating the regression coefficients of all or some predictors, or by adding or removing some predictors to or from the previous model [27, 28]. Like any newly developed prediction model, an updated model needs to be validated.

Impact analysis

A prognostic model needs to provide measurable benefit to patients, physicians or the health care system to be used in clinical practice. If a beneficial effect has not been proven, the use of a predictive model is inappropriate, and may result only in a waste of time. Studies of the impact of a prediction model are based on the comparison of patient outcome with or without the use of the model [28–30]. A reliable study of the impact of a predictive model would require randomization of physicians or care units either to use the new prediction rule (intervention group) or to use the old prediction rule or clinical judgement (control group), in a cluster randomized trial [27, 31, 32] in which patient outcomes are compared. Randomization of patients may be an alternative, although it is not encouraged because the learning effect may reduce the difference between the compared strategies [33, 34]. By contrast, randomizing centers prevents contamination (sharing of experience and information between physicians) that may also attenuate the predictive model's effect. The stepped-wedge cluster design randomly assigns to each center the time period from which it is given the new prognostic tool. In this way, all the participating centers will be applying both the usual care and the new predictive model at times randomly ordered across centers. Also in this study design, the impact measure is derived from the comparison of patient outcomes. Although these randomized study designs allow for unbiased estimates, they are very demanding, time consuming and expensive [27]. When the impact analysis is aimed at assessing changes in physicians' behavior or in the decision-making of care providers, a cross-sectional design with health-care professionals' decisions as the primary outcome may be used. This type of design does not require follow-up, and is therefore less time consuming and less expensive. A simpler study design is the "before-after" impact analysis that compares the management decisions of healthcare providers for the same individuals before and after they have been provided with the new model predictions. This design does not require follow-up, is relatively easy and not very expensive [27]. However, complete control of bias may be impossible.

Interpretation and reporting

Interpretation of the study should account for any factor potentially limiting the study relevance [7, 35]. Internal validation should be adequately interpreted to address any overoptimistic evaluation of the study results. It is also important to thoroughly account for any potential confounder. Interpretation of results should put the newly proposed prognostic indicator or prognostic score in the context of previous knowledge, avoiding the suggestion of the use of a new predictive tool that does not improve the accuracy of outcome prediction, particularly if the predictors already in clinical practice are simpler and easier to use.

A complete and clear report of a study developing a prediction model is needed to assess the risk of bias and the potential usefulness of the model for clinical practice. However, several reviews of the medical literature have shown that the quality of reporting of prediction model studies is largely poor. For this reason a set of recommendations for the reporting of studies developing, validating or updating a prediction model has recently been developed by the transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD) initiative [7, 35]. The statement consists of a checklist of 22 items considered important for transparent reporting of a prediction model study. The checklist encompasses the full report from title and abstract to discussion, including introduction, methods and results. For each of the 22 items, a question is reported to which the study authors have to reply, indicating the page of the article where the item is dealt with. A suggestion is made to include the checklist with the article submission to medical journals.

Concluding remarks

Prognosis research is based on strict scientific methods [36] (Box 1). It provides information on predictors of clinically relevant outcomes. The prospective cohort is the most appropriate design for prognosis studies. Seven major methodological areas are identified, related to the inclusion of patients, completeness of information, measures of prognostic factors, outcome and confounders, and statistical analysis. The interested reader may, however, find information on other potential sources of bias in previous reviews on methodology and requirements for avoiding biases in prognosis studies [36–40]. Publication of the study protocol and statistical analysis plan before the study start should be encouraged, as well as the publication of the analyzed data in a public register at the end of the study. This would improve transparency of study conduct and results reporting, and would allow the knowledgeable reader to reproduce the results [41]. The Institute of Medicine of the National Academies of Sciences, Engineering and Medicine provides the opportunity of doing this at http://iom.nationalacademies.org/About-IOM/Study-Process.aspx. When a prognostic model or prediction rule is proposed, properly designed validation studies should inform on its generalizability, and impact analyses should investigate the benefit of its application in clinical practice.

Compliance with ethical standards

Conflict of interest The authors declare that they have no conflict of interest.

Statement of human and animal rights This article does not contain any studies with human participants or animals performed by any of the authors.

Informed consent None.

References

1. Hayden JA, Cote P, Bombardier C (2006) Evaluation of the quality of prognosis studies in systematic reviews. Ann Intern Med 144:427–437
2. Kyzas PA, Denaxa-Kyza D, Ioannidis JPA (2007) Quality of reporting of cancer prognostic marker studies: association with reported prognostic effect. J Natl Cancer Inst 99:236–243
3. Hingorani AD, van der Windt DA, Riley RD, Moons KGM, Steyerberg EW, Schroter S et al (2013) Prognosis research strategy (PROGRESS) 4: stratified medicine research. BMJ 346:e5793
4. Pugh RN, Murray-Lyon IM, Dawson JL, Pietroni MC, Williams R (1973) Transection of the oesophagus for bleeding oesophageal varices. Br J Surg 60:646–649
5. Kamath PS, Wiesner RH, Malinchoc M, Kremers W, Therneau TM, Kosberg CL et al (2001) A model to predict survival in patients with end-stage liver disease. Hepatology 33:464–470


6. D'Amico G, Garcia-Tsao G, Pagliaro L (2006) Natural history and prognostic factors in cirrhosis: a systematic review of 118 studies. J Hepatol 44:217–231
7. Moons KGM, Altman DG, Reitsma JB, Collins GS, Ioannidis JPA, Macaskill P et al (2015) Transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD): explanation and elaboration. Ann Intern Med 162:W1–W73
8. Grimes DA, Schulz KF (2002) Cohort studies: marching towards outcomes. Lancet 359:341–345
9. Di Pietro NA (2010) Methods in epidemiology: observational study designs. Pharmacotherapy 30:973–984
10. D'Amico G, Pasta L, Morabito A, D'Amico M, Caltagirone M, Malizia G et al (2014) Competing risks and prognostic stages in cirrhosis: a 25-year inception cohort study of 494 patients. Aliment Pharmacol Ther 39:1180–1193
11. D'Amico G, Morabito A, Pagliaro L, Marubini E, The liver study group of "V Cervello Hospital" (1986) Survival and prognostic indicators of compensated and decompensated cirrhosis. Dig Dis Sci 31:468–475
12. Ratib S, Fleming KM, Crooks JC, Aithal G, West J (2014) 1 and 5 year survival estimates for people with cirrhosis of the liver in England, 1998–2009: a large population study. J Hepatol 60:282–289
13. Donders AR, van der Heijden GJ, Stijnen T (2006) Review: a gentle introduction to imputation of missing data. J Clin Epidemiol 59:1087–1091
14. Royston P, Moons KG, Altman DG, Vergouwe Y (2009) Prognosis and prognostic research: developing a prognostic model. BMJ 338:b604
15. Harrell FE Jr (1985) Regression models for prognostic prediction: advantages, problems, and suggested solutions. Cancer Treat Rep 69:1071–1077
16. Concato J, Feinstein AR, Holford TR (1993) The risk of determining risk with multivariable models. Ann Intern Med 118:201–210
17. Moons KGM, Kengne AP, Woodward M, Royston P, Vergouwe Y, Altman DG, Grobbee DE (2012) Risk prediction models: I. Development, internal validation, and assessing the incremental value of a new (bio)marker. Heart 98:683–690
18. Royston P, Altman DG, Sauerbrei W (2006) Dichotomizing continuous predictors in multiple regression: a bad idea. Stat Med 25:127–141
19. Jepsen P, Vilstrup H, Andersen PK (2015) The clinical course of cirrhosis. The importance of multistate models and competing risks analysis. Hepatology 62:292–302
20. Pintilie M (2006) Competing risks. A practical perspective. Wiley, Chichester
21. Fine JP, Gray RJ (1999) A proportional hazards model for the subdistribution of competing risk. J Am Stat Ass 94:496–509
22. Moons KGM, Royston P, Vergouwe Y, Grobbee DE, Altman DG (2009) Prognosis and prognostic research: what, why and how? BMJ 338:b375
23. Harrell FE, Lee KL, Mark DB (1996) Tutorial in biostatistics. Multivariable prognostic models: issues in developing models, evaluating assumptions and adequacy, and measuring and reducing errors. Stat Med 15:361–387
24. Justice AC, Covinsky KE, Berlin JA (1999) Assessing the generalizability of prognostic information. Ann Intern Med 130:515–524
25. Steyerberg EW, Vickers AJ, Cook N, Gerds T, Gonen M, Obuchowski N et al (2010) Assessing the performance of prediction models. A framework for traditional and novel measures. Epidemiology 21:128–138
26. Altman DG, Vergouwe Y, Royston P, Moons KG (2009) Prognosis and prognostic research: validating a prognostic model. BMJ 338:b605
27. Moons KGM, Kengne AP, Grobbee DE, Royston P, Vergouwe Y, Altman DG, Woodward M (2012) Risk prediction models: II. External validation, model updating and impact assessment. Heart 98:691–698
28. Toll DB, Janssen KJM, Vergouwe Y, Moons KGM (2008) Validating, updating and impact of clinical prediction rules: a review. J Clin Epidemiol 61:1085–1094
29. Reilly BM, Evans AT (2006) Translating clinical research into clinical practice: impact of using prediction rules to make decisions. Ann Intern Med 144:201–209
30. Moons KGM, Altman DG, Vergouwe Y, Royston P (2009) Prognosis and prognostic research: application and impact of prognostic models in clinical practice. BMJ 338:b606
31. McGinn TG, Guyatt GH, Wyer PC et al (2000) User's guide to medical literature: XXII. How to use articles about clinical decision rules. Evidence based medicine working group. JAMA 284:79–84
32. Campbell MK, Elbourne DR, Altman DG (2004) CONSORT statement extension to cluster randomized trials. BMJ 328:702–708
33. Hall LM, Jung RT, Leese GP (2003) Controlled trial of effect of documented cardiovascular risk scores on prescribing. BMJ 326:251–252
34. Toll DB, Janssen KJM, Vergouwe Y, Moons KGM (2008) Validating, updating and impact of clinical prediction rules: a review. J Clin Epidemiol 61:1085–1094
35. Collins GS, Reitsma JB, Altman DG, Moons KGM (2015) Transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD): the TRIPOD statement. Ann Intern Med 162:55–63
36. Hemingway H, Riley RD, Altman DG (2009) Ten steps towards improving prognosis research. BMJ 339:b4184
37. Hemingway H, Croft P, Perel P, Hayden JA, Abrams K, Timmis A et al (2013) Prognosis research strategy (PROGRESS) 1: a framework for researching clinical outcomes. BMJ 346:e5595
38. Riley RD, Hayden JA, Steyerberg EW, Moons KG, Abrams K, Kyzas PA et al (2013) Prognosis research strategy (PROGRESS) 2: prognostic factor research. PLoS Med 10(2):e1001380
39. Steyerberg EW, Moons KG, van der Windt DA, Hayden JA, Perel P, Schroter S et al (2013) Prognosis research strategy (PROGRESS) 3: prognostic model research. PLoS Med 10(2):e1001381
40. PLOS Medicine Editors (2014) Observational studies: getting clear about transparency. PLoS Med 11(8):e1001711
41. The Nordic Trial Alliance Working Group 6 on transparency and registration. 2015 http://nta.nordforsk.org/projects/FINALNTAWPG30032015
