
Oral Exam Preparation For Week 4

The document outlines key concepts and preparation materials for an oral exam in BIS 630, focusing on survival analysis. It covers foundational statistical concepts, characteristics of survival time, types of censoring, key functions, and the Cox proportional hazards model, among other topics. Each section includes main questions, follow-up questions, and key answer points to aid in understanding and application of survival analysis techniques.

Uploaded by

winniexvvv
Copyright
© © All Rights Reserved

Oral Exam Preparation for Week 1 (BIS 630)

1. Concepts from Introductory Statistics

Main Question:
"What statistical concepts from introductory courses are foundational for survival analysis?"

Follow-up Questions:

 "How is regression used in survival analysis?"


 "Why are residuals important in model diagnostics?"

Answer Key Points:

 Concepts include: random variables, distributions (pdf, cdf), estimation methods (MLE), hypothesis
testing, regression models, and diagnostics.
 Survival analysis extends these ideas with time-to-event data, censoring, and specialized models
like the Cox model.

2. Definition and Characteristics of Survival Time

Main Question:
"What is survival time, and what are its typical characteristics?"

Follow-up Questions:

 "Can you give examples of survival time in different fields?"


 "Is survival time typically normally distributed?"

Answer Key Points:

 Survival time measures duration from an origin point to a specific event (e.g., death, relapse).
 Key characteristics: non-negative, typically right-skewed, often censored.
 Normal distribution is inappropriate; special models are needed.

3. Types of Censoring

Main Question:
"What are the different types of censoring in survival data?"
Answer Key Points:

 Types of censoring:
o Right censoring: Event has not yet occurred by the end of observation (most common).
o Left censoring: Event occurred before observation starts.
o Interval censoring: Event known to occur within a time interval.

 In right-censoring, we only know the event time exceeds a certain threshold.

4. Key Functions in Survival Analysis

Main Question:
"What are the key functions used to describe survival data?"

Follow-up Questions:

 "How are the survival function and hazard function related?"


 "What is the cumulative hazard function?"

Answer Key Points:
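
A sketch of the standard definitions, assuming a continuous survival time $T$ with density $f(t)$. The symbols are the usual textbook ones and are an assumption here, not taken from the course material:

```latex
S(t) = P(T > t), \qquad
h(t) = \lim_{\Delta t \to 0} \frac{P(t \le T < t + \Delta t \mid T \ge t)}{\Delta t} = \frac{f(t)}{S(t)},
\qquad
H(t) = \int_0^t h(u)\,du, \qquad
S(t) = \exp\{-H(t)\}.
```

The last identity answers the first follow-up question: the survival function is determined by the cumulative hazard, and the hazard is the negative derivative of $\log S(t)$.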

5. Independent Censoring Assumption

Main Question:
"What does the independent censoring assumption mean in survival analysis?"

Follow-up Questions:

 "Why is this assumption important for estimation?"


 "Can you give an example where independent censoring might be violated?"

Answer Key Points:


 Independent censoring means the mechanism causing censoring is unrelated to the risk of the
event.
 Violations (e.g., if patients leave a study due to worsening health) can bias survival estimates.
 Independent censoring ensures valid estimation using Kaplan-Meier, Cox, and other standard
methods.

6. Likelihood for Right-Censored Data

Main Question:
"How is the likelihood function constructed for right-censored survival data?"

Follow-up Questions:

 "How does censoring affect the likelihood contribution of an observation?"


 "What assumptions are needed?"

Answer Key Points:

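 Likelihood contribution of each individual:

o If the event is observed: contributes the density $f(t_i)$.
o If censored: contributes the survival probability $S(t_i)$.

 Full likelihood: $L = \prod_{i=1}^{n} f(t_i)^{\delta_i} S(t_i)^{1-\delta_i}$, where $\delta_i = 1$ if the event is observed and $\delta_i = 0$ if censored.

 Assumes independence between censoring and event times.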
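
As a worked illustration of the contributions above, here is a minimal sketch that assumes an exponential model with rate `rate` (the model choice, function names, and toy data are assumptions for illustration only). Events contribute $\log f(t)$ and censored observations contribute $\log S(t)$.

```python
import math


def exponential_log_likelihood(rate, times, events):
    """Log-likelihood for right-censored data under an exponential model.

    Observed events contribute log f(t) = log(rate) - rate * t.
    Censored observations contribute log S(t) = -rate * t.
    """
    return sum(d * math.log(rate) - rate * t for t, d in zip(times, events))


def exponential_mle(times, events):
    """Closed-form maximizer: number of events / total follow-up time."""
    return sum(events) / sum(times)


# Toy data (hypothetical): 1 = event observed, 0 = right-censored.
times = [2.0, 3.0, 5.0, 6.0, 4.0]
events = [1, 0, 1, 1, 0]
rate_hat = exponential_mle(times, events)
```

Censored subjects still add follow-up time to the denominator, which is how censoring enters the estimate without being counted as an event.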


Oral Exam Preparation for Week 2 (BIS 630)

📈 1. Kaplan-Meier Estimator for S(t)

Main Question:
"What is the Kaplan-Meier estimator, and how is it constructed?"

Follow-up Questions:

 "Why is it called a step function?"


 "How does censoring affect the Kaplan-Meier estimate?"

Answer Key Points:

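 KM estimator formula: $\hat{S}(t) = \prod_{i:\, t_i \le t} \left(1 - \frac{d_i}{n_i}\right)$, where $d_i$ is the number of events and $n_i$ the number at risk at event time $t_i$.

 Step function: the estimate changes only at observed event times.

 Censoring reduces the number at risk $n_i$ at later event times but does not add to the number of events $d_i$.
 At each event time, the conditional probabilities of surviving are multiplied together.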
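
The product of conditional survival probabilities can be sketched in a few lines. This is a minimal pure-Python illustration (not the R `survival` implementation); the function name and toy data are assumptions.

```python
def kaplan_meier(times, events):
    """Return [(event_time, S_hat)] at each distinct event time.

    times  : observed follow-up times
    events : 1 if the event was observed, 0 if right-censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Collect everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            # Conditional probability of surviving past t, given at risk at t.
            surv *= 1.0 - deaths / n_at_risk
            curve.append((t, surv))
        # Events and censored subjects both leave the risk set after t.
        n_at_risk -= removed
    return curve


# Toy data (hypothetical): one censored subject tied with an event at t = 2.
times = [1, 2, 2, 3, 4, 5]
events = [1, 1, 0, 1, 0, 1]
km = kaplan_meier(times, events)
```

The curve has no step at t = 4 because that observation is censored; it only shrinks the risk set for the event at t = 5.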

🔍 2. Cumulative Hazard Function Estimation

Main Question:
"How can the cumulative hazard function be estimated from survival data?"

Follow-up Questions:

 "What is the Nelson-Aalen estimator?"


 "How is it related to the Kaplan-Meier estimator?"
Answer Key Points:

 Conventionally, Kaplan-Meier is used for survival function, Nelson-Aalen for cumulative hazard.
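
A minimal sketch of the Nelson-Aalen estimator, $\hat{H}(t) = \sum_{t_i \le t} d_i / n_i$, and of its link to the survival function through $\exp\{-\hat{H}(t)\}$. The function name and toy data are assumptions for illustration.

```python
import math


def nelson_aalen(times, events):
    """Return [(event_time, H_hat)] at each distinct event time."""
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    cum_haz = 0.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            # Increment: events at t divided by number at risk at t.
            cum_haz += deaths / n_at_risk
            curve.append((t, cum_haz))
        n_at_risk -= removed
    return curve


times = [1, 2, 2, 3, 4, 5]
events = [1, 1, 0, 1, 0, 1]
na = nelson_aalen(times, events)
# Survival estimate implied by the cumulative hazard.
surv_from_na = [(t, math.exp(-h)) for t, h in na]
```

Because $1 - x \le e^{-x}$, the survival estimate $\exp\{-\hat{H}(t)\}$ is never below the Kaplan-Meier estimate; the two are close when the risk sets are large relative to the number of events.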

🧮 3. Estimating Quantiles ($t_p$) (figure)

Main Question:
"How do we estimate quantiles, such as the median survival time, in survival analysis?"

Follow-up Questions:

 "What if the survival curve is flat at exactly 1-p?"


 "How does the R survival package handle this?"
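
A minimal sketch of quantile estimation from a step survival curve: $\hat{t}_p$ is the first time at which $\hat{S}(t)$ drops to or below $1 - p$. When the curve is flat at exactly $1 - p$, the sketch returns the midpoint of the flat segment, which is the convention documented for quantiles of `survfit` objects in the R survival package. The function name, tolerance, and example curves are assumptions.

```python
def km_quantile(curve, p, tol=1e-12):
    """Estimate the p-th quantile from a step survival curve.

    curve : list of (event_time, S_hat) sorted by time
    p     : quantile level, e.g. 0.5 for the median
    Returns None if the curve never reaches 1 - p.
    """
    target = 1.0 - p
    for k, (t, s) in enumerate(curve):
        if s < target - tol:
            # First time the curve falls strictly below 1 - p.
            return t
        if abs(s - target) <= tol:
            # Curve is flat at exactly 1 - p: use the midpoint of the segment.
            if k + 1 < len(curve):
                return 0.5 * (t + curve[k + 1][0])
            # Simplification: no later event time is available in `curve`.
            return t
    return None


curve_a = [(1, 5 / 6), (2, 2 / 3), (3, 4 / 9), (5, 0.0)]
curve_b = [(2, 0.75), (4, 0.50), (7, 0.25)]  # flat at exactly 0.5 on [4, 7)
median_a = km_quantile(curve_a, 0.5)
median_b = km_quantile(curve_b, 0.5)
```

If heavy censoring keeps the curve above $1 - p$, the quantile is not estimable and the sketch returns `None`.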

⚔️4. Log-Rank Test for Comparing Two Survival Distributions

Main Question:
"What is the purpose of the log-rank test, and how is it performed?"

Follow-up Questions:

 "What are the null and alternative hypotheses?"


 "Why is the log-rank test powerful under proportional hazards?"
Properties:

Assumes independent censoring; nonparametric; most powerful under proportional hazards; the test statistic is asymptotically chi-square.
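
A minimal sketch of the two-sample log-rank statistic: at each event time, compare the observed events in group 1 with the number expected under $H_0$ (equal hazards), then standardize by the hypergeometric variance. The function name and toy data are assumptions; the p-value uses the chi-square distribution with 1 degree of freedom.

```python
import math


def logrank_test(times, events, groups):
    """Two-sample log-rank test. groups must be coded 0/1.

    Returns (chi_square_statistic, p_value).
    """
    event_times = sorted({t for t, d in zip(times, events) if d == 1})
    obs_minus_exp = 0.0
    variance = 0.0
    for t in event_times:
        n = sum(1 for u in times if u >= t)                      # at risk, both groups
        n1 = sum(1 for u, g in zip(times, groups) if u >= t and g == 1)
        d = sum(1 for u, e in zip(times, events) if u == t and e == 1)
        d1 = sum(1 for u, e, g in zip(times, events, groups)
                 if u == t and e == 1 and g == 1)
        obs_minus_exp += d1 - d * n1 / n                         # O - E in group 1
        if n > 1:
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    if variance == 0.0:
        raise ValueError("variance is zero; the test is undefined")
    chi2 = obs_minus_exp ** 2 / variance
    # Upper tail of chi-square(1): P(Z^2 > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Toy data (hypothetical): group 1 fails early, group 0 fails late.
times = [1, 2, 3, 4, 5, 6]
events = [1, 1, 1, 1, 1, 1]
groups = [1, 1, 1, 0, 0, 0]
chi2, p_value = logrank_test(times, events, groups)
```

The null hypothesis is $H_0: S_1(t) = S_0(t)$ for all $t$; the alternative is that the curves differ at some time.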

🎯 5. Practical Notes on Log-Rank Test

Main Question:
"What are the limitations of the log-rank test?"

Follow-up Questions:

 "What happens if survival curves cross?"


 "Is log-rank test still valid without proportional hazards?"

Answer Key Points:

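 If the survival curves cross, the log-rank test loses power because early and late differences cancel out.

 It tests for a difference over the whole follow-up period, not at a single time point.
 The test remains valid without proportional hazards, but it is most powerful when hazards are proportional; it does not check that assumption itself.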
Oral Exam Preparation for Week 3 (BIS 630)

Cox Proportional Hazards (PH) Model Basics

Main Question:
"What is the Cox proportional hazards model and what are its key components?"

Follow-up Questions:

 "What is the baseline hazard?"


 "What does the model assume?"

Answer Key Points:

 The Cox PH model: $h(t \mid Z) = h_0(t)\exp(\beta^\top Z)$.

 Semi-parametric: no assumptions about the shape of $h_0(t)$, but assumes proportional hazards.
 The baseline hazard $h_0(t)$ is the hazard when $Z = 0$.
 Hazard ratio between two individuals depends only on covariates, not on time.

Hazard Ratio Interpretation

Main Question:
"What is a hazard ratio and how do you interpret it in the Cox model?"

Follow-up Questions:

 "What if you have a continuous covariate?"


 "How would you interpret an estimated HR of 0.3?"
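
A short worked derivation, assuming a single covariate $Z$ with coefficient $\beta$ in the Cox model (the notation is an assumption):

```latex
\mathrm{HR} = \frac{h(t \mid Z = z + 1)}{h(t \mid Z = z)}
            = \frac{h_0(t)\exp\{\beta (z + 1)\}}{h_0(t)\exp(\beta z)}
            = \exp(\beta),
\qquad
\frac{h(t \mid Z = z + k)}{h(t \mid Z = z)} = \exp(k\beta).
```

For a continuous covariate, $\exp(\beta)$ is the hazard ratio per one-unit increase and $\exp(k\beta)$ per $k$-unit increase. An estimated HR of 0.3 means the hazard is 30% of the reference hazard, i.e. 70% lower, at every time point, holding other covariates fixed.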

Partial Likelihood (No Ties)

Main Question:
"How is the partial likelihood constructed in the Cox model?"

Follow-up Questions:

 "What is the role of the risk set?"


 "Why is it called 'partial'?"
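
A minimal sketch of the log partial likelihood with no ties, assuming a single covariate: $\ell(\beta) = \sum_{i:\,\delta_i = 1} \left[\beta Z_i - \log \sum_{j \in R(t_i)} \exp(\beta Z_j)\right]$, where the risk set $R(t_i)$ contains everyone still under observation at $t_i$. Function and variable names are assumptions.

```python
import math


def cox_log_partial_likelihood(beta, times, events, z):
    """Log partial likelihood for one covariate, assuming no tied event times."""
    total = 0.0
    for t_i, d_i, z_i in zip(times, events, z):
        if d_i != 1:
            continue  # censored subjects only appear in risk sets
        # Risk set: subjects with follow-up time >= t_i.
        risk_sum = sum(math.exp(beta * z_j)
                       for t_j, z_j in zip(times, z) if t_j >= t_i)
        total += beta * z_i - math.log(risk_sum)
    return total


# Toy data (hypothetical).
times = [1, 2, 3]
events = [1, 1, 0]
z = [1, 0, 1]
ll_null = cox_log_partial_likelihood(0.0, times, events, z)
ll_one = cox_log_partial_likelihood(1.0, times, events, z)
```

The baseline hazard cancels from each ratio, so only the ordering of event times matters. It is called "partial" because it uses only these conditional probabilities at event times rather than the full likelihood of the data.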

Handling Tied Event Times

Main Question:
"What methods are used to handle tied survival times in Cox models?"

Follow-up Questions:
 "What are the trade-offs between the methods?"
 "Which method is used by default in R?"

Answer Key Points:

 Three main methods:


1. Exact method: computationally expensive, accurate.
2. Breslow: simplest, but can bias estimates toward null if many ties.
3. Efron: better approximation, used as default in R (coxph).
 Tied times happen when events occur at the same observed time.
 Use ties = "efron" in coxph().
Oral Exam Preparation for Week 4 (BIS 630)

1. Hypothesis Testing in Cox Models

Main Question:
"What are the three main types of hypothesis tests used for Cox models, and how do they differ?"

Follow-up Questions:

 "When would you prefer using a score test instead of a Wald test?"
 "What is the advantage of the likelihood ratio test?"

Answer Key Points:

 Wald test: Based on the asymptotic normality of the maximum partial likelihood estimate; easy to compute after fitting the model.
 Score test: Only needs estimation under the null; used when fitting the full model is difficult.
 Likelihood ratio test: Compares the maximized partial likelihood under the null and the alternative; generally the most reliable of the three, especially in small samples.
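
For a single coefficient and $H_0: \beta = 0$, the three statistics can be written as follows. This is a sketch in assumed notation: $\ell$ is the log partial likelihood, $U = \partial \ell / \partial \beta$ the score, and $I = -\partial^2 \ell / \partial \beta^2$ the observed information.

```latex
X^2_{\text{Wald}} = \frac{\hat\beta^{\,2}}{\widehat{\mathrm{Var}}(\hat\beta)},
\qquad
X^2_{\text{score}} = \frac{U(0)^2}{I(0)},
\qquad
X^2_{\text{LR}} = 2\,\{\ell(\hat\beta) - \ell(0)\}.
```

Under $H_0$ each statistic is approximately $\chi^2_1$ in large samples. The score test for a binary group indicator corresponds to the log-rank test when there are no ties.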

2. Different Types of Residuals in Cox Models

Main Question:
"Can you describe different types of residuals used in Cox regression and their purposes?"
Follow-up Questions:

 "Which residual is best for detecting outliers?"


 "Which residual is used to check the functional form of covariates?"

Answer Key Points:

 Cox-Snell residuals: Assess overall model fit (should follow Exponential(1)).


 Martingale residuals: Check functional form of covariates; skewed, not good for outlier detection.
 Deviance residuals: Symmetric version of Martingale; used for identifying outliers.
 Schoenfeld residuals: Check time-dependence of covariates, especially PH assumption.
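
The first three residuals are linked by simple formulas, sketched below. Given a Cox-Snell residual $r_i = \hat{H}(t_i \mid Z_i)$ and event indicator $\delta_i$, the martingale residual is $M_i = \delta_i - r_i$ and the deviance residual is $\mathrm{sign}(M_i)\sqrt{-2[M_i + \delta_i \log(\delta_i - M_i)]}$. The function names are assumptions; the Cox-Snell values would come from a fitted model.

```python
import math


def martingale_residual(event, cox_snell):
    """Observed events minus expected events; ranges over (-inf, 1]."""
    return event - cox_snell


def deviance_residual(event, cox_snell):
    """Symmetrizing transformation of the martingale residual.

    Requires cox_snell > 0 when event == 1.
    """
    m = event - cox_snell
    # event * log(event - m) equals log(cox_snell) for events, 0 for censored.
    log_term = math.log(cox_snell) if event == 1 else 0.0
    inner = -2.0 * (m + log_term)
    return math.copysign(math.sqrt(max(inner, 0.0)), m)


# Hypothetical Cox-Snell residuals for three subjects.
early_event = deviance_residual(1, 0.2)   # event earlier than expected
censored = deviance_residual(0, 0.5)      # censored subject
as_expected = deviance_residual(1, 1.0)   # event when exactly 1 was expected
```

A large positive deviance residual flags a subject who failed much sooner than the model predicted; a large negative one flags a subject who survived much longer.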

3. Visual Tools for Model Fit Evaluation

Main Question:
"What are the key plots used to assess the fit of a Cox model?"

Follow-up Questions:

 "How would you interpret a Cox-Snell residual plot?"


 "Why are martingale residual plots useful?"

Answer Key Points:

 Cox-Snell plots: Should approximate a straight line with slope 1 through the origin.
 Martingale residual plots: Help determine the functional form of a covariate, i.e., whether it needs a transformation (nonlinearity).
 Deviance residuals: Index plots detect outliers.
 Warning: Visual tools are subjective; interpretation can vary, especially with small samples.

4. Examining the Proportional Hazards Assumption

Main Question:
"How can you evaluate the proportional hazards assumption in Cox models?"

Follow-up Questions:

 "What are the graphical methods available?"


 "What formal tests can be conducted?"

Answer Key Points:


 Graphical methods:
o Log(-log(Survival)) plots: Curves should be roughly parallel.
o Scaled Schoenfeld residuals vs time: Horizontal line suggests PH assumption holds.

 Formal test:
o Based on scaled Schoenfeld residuals (Grambsch and Therneau test).
o Test H0: PH assumption holds vs Ha: PH assumption violated.

 Alternative approach:
o Add time-dependent covariates to the model and check their significance.

5. Time-dependent Covariates Approach

Main Question:
"How can time-dependent covariates be used to check the PH assumption?"

Follow-up Questions:

 "What is the interpretation if the time-dependent covariate is significant?"


 "Why can't we treat time-dependent covariates the same way as baseline covariates?"

Answer Key Points:

 Introduce an interaction term between a covariate and time (e.g., $Z_1 \times t$).
 If the interaction term is significant, the PH assumption is violated for that covariate.
 Time-dependent covariates must be evaluated at every event time for each subject still at risk, which complicates the partial likelihood calculation.

Oral Exam Preparation for Week 5 (BIS 630)

1. Introduction to Time-Dependent Covariates

Main Question:

"What is a time-dependent covariate, and how does it differ from a time-independent covariate?"

Follow-up Questions:

 "Can you give examples of external and internal time-dependent covariates?"

 "Why is it risky to control for internal time-dependent covariates?"

Answer Key Points:

 Time-dependent covariate: A variable whose value can change over time during the follow-up period.

 External covariate: Changes independently of the subject (e.g., air temperature).


 Internal covariate: Depends on the subject's health status (e.g., blood pressure).

 Risk: Controlling for internal covariates affected by treatment may obscure the true treatment effect.

2. Modeling with Time-Dependent Covariates

Main Question:

"How is the Cox model modified to include time-dependent covariates?"

Follow-up Questions:

 "What is the interpretation of the hazard ratio in a time-dependent Cox model?"

 "Is the hazard ratio still constant over time?"

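Answer Key Points:

 Add an interaction term between the covariate and a function of time, e.g., $Z \times g(t)$ with $g(t) = t$ or $g(t) = \log t$.

 With coefficient $\beta_1$ on $Z$ and $\beta_2$ on the interaction, the hazard ratio for a one-unit increase in $Z$ becomes $\exp\{\beta_1 + \beta_2 g(t)\}$, so it is no longer constant over time.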

3. Stanford Heart Transplant Study

Main Question:

"Why do we need to use a time-dependent Cox model in the Stanford heart transplant study?"

Follow-up Questions:

 "What is immortal time bias?"

 "How do you define the transplant covariate in this study?"

Answer Key Points:

 Patients initially have no transplant, status changes when they receive transplant.

 Immortal time bias: If a subject must survive long enough to receive a treatment, treating transplant status as a fixed baseline covariate credits the waiting time to the treated group and can falsely exaggerate the treatment effect.

 Correct model: Define $Z(t)$ as 1 if the subject has been transplanted by time $t$, and 0 otherwise.


4. Data Formatting for Time-Dependent Covariates

Main Question:

"How should the data be formatted to fit a time-dependent Cox model?"

Follow-up Questions:

 "What is the (Start, Stop, Status) format (counting process format)?"

 "How is the event indicator coded across time intervals?"

Answer Key Points:

 Data needs to be split into multiple intervals where covariate values are constant.

 Each record includes Start time, Stop time, and Status (event indicator: 1=event, 0=no event).

 Only the last interval (if uncensored) has event = 1; others have event = 0.
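
The splitting rule above can be sketched for a single binary time-dependent covariate such as transplant status. This is a minimal illustration; the function name, tuple layout, and example subjects are assumptions, not the course's data set.

```python
def to_counting_process(subject_id, follow_up, event, transplant_time=None):
    """Split one subject's follow-up into (id, start, stop, status, transplant) rows.

    follow_up       : time of event or censoring
    event           : 1 if the event occurred at follow_up, 0 if censored
    transplant_time : time of transplant, or None if never transplanted
    """
    # Never transplanted during follow-up: one interval with covariate 0.
    if transplant_time is None or transplant_time >= follow_up:
        return [(subject_id, 0.0, follow_up, event, 0)]
    # Transplanted at time zero: one interval with covariate 1.
    if transplant_time <= 0:
        return [(subject_id, 0.0, follow_up, event, 1)]
    # Otherwise split at the transplant; only the last row can carry the event.
    return [
        (subject_id, 0.0, transplant_time, 0, 0),
        (subject_id, transplant_time, follow_up, event, 1),
    ]


rows_a = to_counting_process(1, 50.0, 1, transplant_time=20.0)
rows_b = to_counting_process(2, 30.0, 0)
```

Subject 1 contributes to the "no transplant" risk sets during (0, 20] and to the "transplant" risk sets during (20, 50], which is what removes the immortal time from the treated group.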

5. Estimation in Time-Dependent Cox Model

Main Question:

"How is the partial likelihood modified for time-dependent covariates?"

Follow-up Questions:

 "Why do we need to know covariate values at each death time?"

 "What challenges might arise when estimating with internal time-dependent covariates?"
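
A sketch of the modified partial likelihood, in assumed notation where $Z_j(t)$ is subject $j$'s covariate value at time $t$ and $R(t_i)$ is the risk set at event time $t_i$:

```latex
L(\beta) = \prod_{i:\,\delta_i = 1}
  \frac{\exp\{\beta^\top Z_i(t_i)\}}
       {\sum_{j \in R(t_i)} \exp\{\beta^\top Z_j(t_i)\}}
```

Every subject in the risk set enters the denominator with the covariate value current at $t_i$, which is why covariate values must be known at each death time for all subjects still at risk.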

6. Exercise: Cumulative Exposure Example

Main Question:

"In the smoking and lung cancer example, how would you model cumulative exposure?"

Follow-up Questions:
 "What would the hazard ratio represent?"

 "Why is cumulative exposure considered time-dependent?"

Oral Exam Preparation for Week 6 (BIS 630)

1️⃣ Motivation for Parametric Models

Main Question:

"Why might we consider using parametric survival models instead of Cox models?"

Follow-up Questions:

 "What are the advantages of parametric models?"

 "When is a parametric model particularly useful?"

Answer Key Points:

 PH assumption in Cox may not hold.

 Parametric models can be more efficient if correctly specified.

 Easier to handle complex censoring.

 Direct interpretation on survival time (especially AFT models).

2️⃣ Popular Parametric Distributions

Main Question:

"What are the common distributions used in survival analysis and their characteristics?"

Follow-up Questions:

 "How does the Weibull distribution relate to exponential?"

 "Which distributions allow crossing hazards?"

Answer Key Points:


 Common distributions:

o Exponential: constant hazard (memoryless).

o Weibull: increasing or decreasing hazard depending on shape parameter.

o Log-normal: hazard increases then decreases (non-monotonic).

o Log-logistic: similar to log-normal but different tail behavior.

 Weibull generalizes exponential (Weibull with shape = 1 is exponential).

 Log-normal and log-logistic allow crossing hazards.
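
A minimal sketch of the Weibull hazard, assuming the parameterization $S(t) = \exp(-\lambda t^{\gamma})$ so that $h(t) = \lambda \gamma t^{\gamma - 1}$. Other parameterizations exist; the names `rate` and `shape` are assumptions.

```python
def weibull_hazard(t, rate, shape):
    """Hazard h(t) = rate * shape * t**(shape - 1) for S(t) = exp(-rate * t**shape)."""
    return rate * shape * t ** (shape - 1.0)


# shape = 1: constant hazard (exponential special case).
h_const = [weibull_hazard(t, 0.5, 1.0) for t in (1.0, 10.0)]
# shape > 1: hazard increases with time.
h_up = [weibull_hazard(t, 0.5, 2.0) for t in (1.0, 2.0)]
# shape < 1: hazard decreases with time.
h_down = [weibull_hazard(t, 0.5, 0.5) for t in (1.0, 4.0)]
```

The Weibull hazard is always monotone, so it cannot capture the rise-then-fall shape of the log-normal hazard.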

3️⃣ Accelerated Failure Time (AFT) Model Basics

Main Question:

"What is the structure of the accelerated failure time model?"

Follow-up Questions:

 "How is the acceleration factor interpreted?"

 "How is the survival function shifted under AFT?"

AFT assumption: covariates act multiplicatively on survival time, stretching or compressing the time scale of the entire survival distribution by a constant factor.
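
A sketch of the model structure in assumed notation, where $\varepsilon$ is an error term with a specified distribution and $S_0$ is the survival function when $Z = 0$:

```latex
\log T = \beta_0 + \beta^\top Z + \sigma\varepsilon,
\qquad
S(t \mid Z) = S_0\!\left(t\, e^{-\beta^\top Z}\right).
```

Under this sign convention $\exp(\beta)$ is the time ratio (acceleration factor) for a one-unit increase in a covariate: every quantile of the survival time is multiplied by $\exp(\beta)$, so the survival curve is stretched or compressed along the time axis rather than shifted.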

4️⃣ Interpretation of Coefficients in AFT Models

Main Question:
"How do you interpret a positive or negative coefficient in an AFT model?"

Follow-up Questions:

 "What happens if time ratio: exp(β) > 1?"

 "What if exp(β) < 1?"

Answer Key Points:

 exp(β) > 1: Time is stretched → slower event → longer survival.

 exp(β) < 1: Time is compressed → faster event → shorter survival.

Example:

 exp(β) = 1.25 → survival time increases by 25%.

 exp(β) = 0.8 → survival time decreases by 20%.

5️⃣ Weibull Model: Both PH and AFT

Main Question:

"Why is the Weibull distribution special in survival analysis?"

Follow-up Questions:

 "What happens to the coefficients under PH vs AFT for Weibull?"

Answer Key Points:

 Weibull satisfies both Proportional Hazards (PH) and AFT properties.

 In AFT: log(T) is linear in covariates.

 In PH: log(h(t)) is linear in covariates.

 Transformation between models:

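o From AFT to PH coefficients: $\beta_{PH} = -\gamma\,\beta_{AFT} = -\beta_{AFT}/\sigma$, where $\gamma = 1/\sigma$ is the Weibull shape parameter and $\sigma$ is the AFT scale parameter.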
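
A minimal sketch of the conversion, assuming the AFT fit reports a coefficient `beta_aft` on the log-time scale and a scale parameter `scale` ($\sigma$). The function name and example numbers are assumptions.

```python
import math


def weibull_aft_to_ph(beta_aft, scale):
    """Convert a Weibull AFT coefficient to the PH scale.

    Returns (beta_ph, hazard_ratio), using beta_ph = -beta_aft / scale.
    """
    shape = 1.0 / scale          # Weibull shape gamma = 1 / sigma
    beta_ph = -shape * beta_aft
    return beta_ph, math.exp(beta_ph)


# Hypothetical fit: time ratio exp(0.5) > 1 (longer survival), sigma = 0.5.
beta_ph, hazard_ratio = weibull_aft_to_ph(0.5, 0.5)
```

The sign flips because a covariate that lengthens survival time (positive AFT coefficient) lowers the hazard (negative PH coefficient).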

6️⃣ Proportional Odds Model

Main Question:

"What is the proportional odds (PO) model in survival analysis?"

Follow-up Questions:

 "How is it different from proportional hazards?"


 "Which distribution satisfies PO?"
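
A sketch of the PO model in one common convention (the notation and the sign convention are assumptions), with $S_0$ the baseline survival function:

```latex
\frac{S(t \mid Z)}{1 - S(t \mid Z)} = \exp(\beta^\top Z)\,\frac{S_0(t)}{1 - S_0(t)}
```

Here the survival odds ratio between two covariate values is constant over time. With $\theta = \exp(\beta^\top Z)$, the implied hazard ratio is $h(t \mid Z)/h_0(t) = 1/\{1 + (\theta - 1) S_0(t)\}$, which moves toward 1 as $t$ grows, unlike the constant ratio under PH. The log-logistic distribution satisfies both the PO and the AFT assumptions.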

7️⃣ Model Checking: Graphical Methods

Main Question:

"How can we graphically assess the fit of a parametric survival model?"

Follow-up Questions:

 "What plots are used for exponential, Weibull, log-logistic, log-normal?"

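Percentile-percentile (Q-Q) plot: examines the AFT assumption by plotting the estimated percentiles of one group against those of another.

If the AFT assumption holds, the points should fall on a straight line through the origin whose slope is the acceleration factor.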
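
For the distribution-specific follow-up question, the usual diagnostic plots use a transformation of the Kaplan-Meier estimate $\hat{S}(t)$ that should be linear if the distribution fits. A sketch, with $\Phi^{-1}$ the standard normal quantile function:

```latex
\begin{array}{ll}
\text{Exponential:}  & -\log \hat{S}(t) \ \text{vs.}\ t \ \text{(straight line through the origin)} \\
\text{Weibull:}      & \log\{-\log \hat{S}(t)\} \ \text{vs.}\ \log t \ \text{(straight line)} \\
\text{Log-logistic:} & \log\{\hat{S}(t) / (1 - \hat{S}(t))\} \ \text{vs.}\ \log t \ \text{(straight line)} \\
\text{Log-normal:}   & \Phi^{-1}\{1 - \hat{S}(t)\} \ \text{vs.}\ \log t \ \text{(straight line)}
\end{array}
```

For the Weibull plot, the slope estimates the shape parameter; a slope of 1 is consistent with the exponential special case.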

Ways to handle non-proportional hazards:

1. Time-dependent covariates

2. Parametric models

3. Stratified models

Oral Exam Preparation for Week 10 (BIS 630)

Stratified Cox Models

Main Question:

"When and why would you use a stratified Cox model?"

Follow-up Questions:

 "How is the hazard function defined in a stratified Cox model?"

 "What are the limitations of using stratification?"
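
A sketch of the stratified hazard in assumed notation, with strata indexed by $s = 1, \dots, K$:

```latex
h_s(t \mid Z) = h_{0s}(t)\exp(\beta^\top Z), \qquad s = 1, \dots, K
```

Each stratum has its own unspecified baseline hazard $h_{0s}(t)$, so the stratifying variable need not satisfy PH, while the coefficients $\beta$ are shared across strata. The cost is that no hazard ratio is estimated for the stratifying variable itself.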


Can we develop a different summary to quantify the covariate's effect when hazards are not proportional?

One option is the restricted mean survival time (RMST).

Testing for Heterogeneous Effects

Main Question:

"How can you test if the covariate effect differs across strata in a stratified Cox model?"

Follow-up Questions:

 "What does the likelihood ratio test compare?"

 "What is the null hypothesis?"

Answer Key Points:

 Test whether β is constant across strata.

 Compare:

o Stratified model: shared β across strata

o Separate models: different β per stratum

 Null: No interaction between covariates and strata

Restricted Mean Survival Time (RMST)

Main Question:

"What is restricted mean survival time, and how is it estimated?"


Follow-up Questions:

 "How does RMST compare to median survival or hazard ratio?"

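 "What is the interpretation of RMST?"

Interpretation: the average time survived during the first $t_0$ units of follow-up, $\mathrm{RMST}(t_0) = E[\min(T, t_0)] = \int_0^{t_0} S(t)\,dt$, i.e., the area under the survival curve up to $t_0$.

In the inverse-probability-of-censoring-weighted estimator of RMST:

- Only observed (uncensored) restricted survival times are included.

- Each individual is weighted by the inverse of the probability of observing their survival time.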
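
A minimal sketch of RMST computed as the area under a step survival curve up to a truncation time `tau` (the $t_0$ of the text). This is the plug-in estimator based on a Kaplan-Meier curve rather than the weighted estimator; the function name and example curve are assumptions.

```python
def rmst(curve, tau):
    """Area under a right-continuous step survival curve from 0 to tau.

    curve : list of (event_time, S_hat) sorted by time; S(t) = 1 before the
            first event time
    tau   : restriction time
    """
    area = 0.0
    prev_t, prev_s = 0.0, 1.0
    for t, s in curve:
        if t >= tau:
            break
        area += prev_s * (t - prev_t)   # rectangle up to the next step
        prev_t, prev_s = t, s
    area += prev_s * (tau - prev_t)     # last rectangle, cut at tau
    return area


# Hypothetical Kaplan-Meier curve.
curve = [(1, 5 / 6), (2, 2 / 3), (3, 4 / 9), (5, 0.0)]
rmst_4 = rmst(curve, 4.0)
```

Unlike the median, RMST is always estimable up to `tau` as long as follow-up extends that far, and unlike the hazard ratio it does not rely on proportional hazards.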

Comparing Groups Using RMST

Main Question:

"How can RMST be used to compare treatment groups?"

Follow-up Questions:

 "What are the pros and cons of using RMST for treatment effect?"

 "What’s the difference between RMST difference and HR?"


Regression Using RMST: Pseudo-Values and IPCW

Main Question:

"How do we use regression to model RMST?"

Follow-up Questions:

 "What are pseudo-values?"

 "What is the IPCW approach?"

With the IPCW approach to RMST regression:

- Only subjects whose restricted survival time is observed contribute to the estimating equation.
- Each of them is weighted by the inverse of the probability of being observed (uncensored).

Oral Exam Preparation for Week 11 (BIS 630)

Definition and Impact of Left Truncation

Main Question:

"What is left truncation, and how is it different from censoring?"

Follow-up Questions:

 "What are some real-life examples of left truncation?"

 "What is the consequence of not adjusting for left truncation?"

Answer Key Points:

 Left truncation: Subjects are only observed if their survival time is longer than a truncation point (delayed entry).

 Censoring: The event is not observed during the study period, but the subject is included.

 Left truncation causes sampling bias by excluding early events, which biases survival estimates upward if ignored.

 Example: Atomic bomb survivor study, with no data for those who died before 1950 (5 years after exposure).

Handling Left Truncation in Estimation

Main Question:

"How can survival estimation methods be adjusted for left truncation?"

Follow-up Questions:

 "How is the risk set modified in Cox regression with left truncation?"

 "Can Kaplan-Meier be used under left truncation?"

 Risk sets are not nested, and may increase over time (unlike right-censoring only setting).
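
A minimal sketch of the Kaplan-Meier estimator with delayed entry: a subject is at risk at time $t$ only if entry $< t \le$ exit. The function name and toy data are assumptions. The result estimates survival conditional on surviving to the earliest entry time.

```python
def km_left_truncated(entry, exit_time, events):
    """Kaplan-Meier with left truncation.

    entry     : delayed-entry (truncation) times
    exit_time : event or censoring times, with entry < exit_time
    events    : 1 if event observed, 0 if censored
    Returns [(event_time, n_at_risk, S_hat)].
    """
    event_times = sorted({x for x, d in zip(exit_time, events) if d == 1})
    surv = 1.0
    curve = []
    for t in event_times:
        # Risk set: already entered and not yet exited.
        n = sum(1 for a, x in zip(entry, exit_time) if a < t <= x)
        d = sum(1 for x, e in zip(exit_time, events) if x == t and e == 1)
        surv *= 1.0 - d / n
        curve.append((t, n, surv))
    return curve


# Toy data (hypothetical): two subjects enter late, at t = 2 and t = 3.
entry = [0, 0, 2, 3]
exit_time = [1, 5, 4, 6]
events = [1, 1, 1, 0]
curve = km_left_truncated(entry, exit_time, events)
```

In this example the risk set grows from 2 to 3 between the first and second event times, which cannot happen with right censoring alone.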
5️⃣ Practical Considerations and Assumptions

Main Question:

"What assumptions are required for valid inference under left truncation?"

Follow-up Questions:

 "What happens if event time and entry time are dependent?"

 "How can baseline age be misused?"

Answer Key Points:

 Assumes independence between event time and truncation time, given covariates.

 Violation causes informative truncation, leading to bias.

 Baseline age should be used as the truncation time, not as a covariate, if follow-up starts after a fixed age.
Oral Exam Preparation for Week 12 (BIS 630)

🧮 1. Sample Size Calculation in Survival Analysis

Main Question:

"How do we calculate the required sample size for a survival analysis study?"

Follow-up Questions:

 "Why do we need to consider the number of events, not just total sample size?"

 "What assumptions are commonly made when calculating sample size for survival analysis?"

 "What’s the role of the hazard ratio in this process?"

Answer Key Points:

 Sample size is based on number of events (deaths) rather than total subjects.

 Steps:

1. Define effect size: hazard ratio or change in median survival.

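2. Calculate the required number of events from the effect size, the significance level, and the power.

3. Adjust for the probability of observing an event to get the total required sample size: total sample size = number of events / P(event).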

 Assumptions often include exponential distribution of survival times.

 Proportional hazards (PH) assumption is typically made.

 K-M estimates or parametric approximations (e.g., median survival time) may be used.
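
The steps above can be sketched with Schoenfeld's approximation for a two-sided log-rank test, $d = (z_{1-\alpha/2} + z_{\text{power}})^2 / \{\pi(1-\pi)(\log \mathrm{HR})^2\}$, where $\pi$ is the allocation proportion. Whether the course uses this exact formula is an assumption; function names and defaults are illustrative.

```python
import math
from statistics import NormalDist


def required_events(hazard_ratio, alpha=0.05, power=0.80, allocation=0.5):
    """Number of events needed (Schoenfeld approximation, two-sided test)."""
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    z_power = NormalDist().inv_cdf(power)
    d = (z_alpha + z_power) ** 2 / (
        allocation * (1.0 - allocation) * math.log(hazard_ratio) ** 2
    )
    return math.ceil(d)


def total_sample_size(n_events, prob_event):
    """Inflate the event count by the probability of observing an event."""
    return math.ceil(n_events / prob_event)


# Hypothetical design: detect HR = 0.5 with 80% power, half of subjects
# expected to have an event during follow-up.
n_events = required_events(0.5)
n_total = total_sample_size(n_events, 0.5)
```

The hazard ratio enters only through $(\log \mathrm{HR})^2$, so effects closer to 1 require many more events, and power depends on events rather than on the number of subjects enrolled.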

🔁 2. Dependent vs Independent Censoring

Main Question:

"What is dependent censoring, and how does it affect survival analysis?"

Follow-up Questions:
 "How can we detect dependent censoring?"

 "Why is it problematic?"

Answer Key Points:

 Independent censoring means censoring time is unrelated to event time.

 Dependent censoring violates this and can bias survival estimates.

 Detection: Plot risk scores (from survival model) vs. censoring scores (from censoring model).

o If correlated, dependence exists.

 Consequences:

o Positive correlation: overestimates survival

o Negative correlation: underestimates survival

3. Sensitivity Analysis for Dependent Censoring

Main Question:

"How can we check whether dependent censoring influences the result?"

Follow-up Questions:

 "What are the two extreme assumptions in sensitivity analysis?"

 "What would you conclude if the results stay consistent?"


Answer Key Points:

 Sensitivity analysis approaches:

1. Assume censored subjects have shorter survival → treat censored time as death time.

2. Assume censored subjects have longer survival → replace censored time with the longest observed time.

 If all results are similar, outcome is likely robust to dependent censoring.

⚖️4. Inverse Probability of Censoring Weighting (IPCW)

Main Question:

"What is IPCW and how is it used to address dependent censoring?"

Follow-up Questions:

 "How are the weights computed?"

 "How do we incorporate IPCW in a Cox model?"

Answer Key Points:

 IPCW adjusts for dependent censoring by re-weighting observed data.

 Steps:

1. Model the censoring process (censoring is the event).

2. Compute weights as the inverse of the estimated censoring survival probability $G(t)$.

3. Fit Cox model with these weights:

coxph(Surv(start, stop, event) ~ covariates, weights = ..., data = ...)

 May use robust SE with cluster(id).
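
Steps 1 and 2 can be sketched in a covariate-free simplification: estimate the censoring survival function $G$ by Kaplan-Meier with censoring treated as the event, then weight each observed event by $1/\hat{G}(t^-)$. In practice the censoring model would usually include covariates (for example a Cox model for censoring), which is the point of IPCW under dependent censoring. The function name, the no-ties assumption, and the toy data are all assumptions.

```python
def ipcw_weights(times, events):
    """Weights 1 / G(t-) for observed events, 0 for censored subjects.

    Assumes no tied times. G is the Kaplan-Meier estimate of the censoring
    distribution, i.e. censoring (events == 0) plays the role of the event.
    """
    weights = []
    for t_i, d_i in zip(times, events):
        if d_i == 0:
            weights.append(0.0)   # censored subjects get zero weight
            continue
        g = 1.0
        for u, d_u in zip(times, events):
            if d_u == 0 and u < t_i:
                # One censoring at time u among those still under observation.
                n_at_risk = sum(1 for v in times if v >= u)
                g *= 1.0 - 1.0 / n_at_risk
        weights.append(1.0 / g)
    return weights


# Toy data (hypothetical): censoring at t = 2 and t = 4.
times = [1, 2, 3, 4, 5]
events = [1, 0, 1, 0, 1]
weights = ipcw_weights(times, events)
```

Later events receive larger weights because they also stand in for subjects who were censored before that time; here the weights sum to the original sample size.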

📄 5. Data Structure for IPCW or Time-dependent Covariates

Main Question:

"How should data be formatted to apply IPCW or fit time-dependent models?"

Follow-up Questions:

 "What is the (start, stop, status) format?"

 "Why is it needed?"
Answer Key Points:

 Use counting process format:

o Each subject’s follow-up divided into intervals where covariates and weights are constant.

o Format: (start, stop, event) — similar to time-dependent Cox.

 IPCW requires time-varying weights, so we need this format to apply the method correctly.
