Start Stop Coding of Time Dependent Covariates

We discuss how to code time dependent covariates for use in the coxph function. This is part of Example 9.1 from the textbook.

library(survival)
library(KMsurv)
library(ggplot2)
library(ggfortify)

We follow example 9.1 from the textbook. First we put the data in start, stop form.

data(bmt)
bmt <- bmt[,c("group","t2","d3","tp","dp")]
disease <- c("ALL","AML Low","AML High")
bmt$group <- as.factor(disease[bmt$group])

bmt_sub1 <- bmt[bmt$dp==1,c("group","t2","d3","tp")]
bmt_sub1$tp_start <- 0
bmt_sub1$tp_stop <- bmt_sub1$tp
bmt_sub1$tp <- 0
bmt_sub1$d3 <- 0
bmt_sub1 <- bmt_sub1[bmt_sub1$tp_stop>0,]

bmt_sub2 <- bmt[bmt$dp==1,c("group","t2","d3","tp")]
bmt_sub2$tp_start <- bmt_sub2$tp
bmt_sub2$tp_stop <- bmt_sub2$t2
bmt_sub2$tp <- 1

bmt_sub3 <- bmt[bmt$dp==0,c("group","t2","d3","tp")]
bmt_sub3$tp_start <- 0
bmt_sub3$tp_stop <- bmt_sub3$t2
bmt_sub3$tp <- 0


bmt_sub <- rbind(bmt_sub1,bmt_sub2,bmt_sub3)

Then we can fit the Cox PH model as usual.

fit <- coxph(Surv(tp_start,tp_stop,d3)~group+tp,data=bmt_sub)
summary(fit)
## Call:
## coxph(formula = Surv(tp_start, tp_stop, d3) ~ group + tp, data = bmt_sub)
## 
##   n= 256, number of events= 83 
## 
##                  coef exp(coef) se(coef)      z Pr(>|z|)    
## groupAML High  0.3818    1.4650   0.2676  1.427 0.153626    
## groupAML Low  -0.4971    0.6083   0.2892 -1.719 0.085656 .  
## tp            -1.1202    0.3262   0.3292 -3.403 0.000666 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##               exp(coef) exp(-coef) lower .95 upper .95
## groupAML High    1.4650     0.6826    0.8670    2.4753
## groupAML Low     0.6083     1.6440    0.3451    1.0723
## tp               0.3262     3.0656    0.1711    0.6218
## 
## Concordance= 0.671  (se = 0.027 )
## Rsquare= 0.085   (max possible= 0.946 )
## Likelihood ratio test= 22.87  on 3 df,   p=4e-05
## Wald test            = 25.56  on 3 df,   p=1e-05
## Score (logrank) test = 27.94  on 3 df,   p=4e-06

Time Dependent Covariates for Catheter Example

We use time dependent covariates to improve the fit of Cox PH. This is part of Example 9.2 in the textbook. First we load the kidney data and fit a Cox PH model.

data(kidney,package="KMsurv")
head(kidney)
##   time delta type
## 1  1.5     1    1
## 2  3.5     1    1
## 3  4.5     1    1
## 4  4.5     1    1
## 5  5.5     1    1
## 6  8.5     1    1
kidney$type <- kidney$type - 1 ## match coding with book
fit <- coxph(Surv(time,delta)~type,data=kidney)
fit
## Call:
## coxph(formula = Surv(time, delta) ~ type, data = kidney)
## 
##         coef exp(coef) se(coef)      z     p
## type -0.6126    0.5420   0.3979 -1.539 0.124
## 
## Likelihood ratio test=2.41  on 1 df, p=0.1207
## n= 119, number of events= 26

In class on March 1 (“Cox PH Breslow Partial Likelihood, Local and Global Tests”) we showed the Cox PH model does not fit well to this data. We reached this conclusion by examining the difference in log cumulative hazard functions fit with the Nelson-Aalan estimator. The resulting plot below (Figure 8.1 in textbook) is not constant, suggesting a poor model fit.

This can also be seen by examining Kaplan Meier plots for the two groups.

fitkm <- survfit(Surv(time,delta)~type,data=kidney)
autoplot(fitkm,xlab="Survival Time",ylab="Survival")

The Cox PH model assumes that catheter placement increases or decreases hazard across all times at a constant ratio. However catheter placement may increase hazard initially, e.g. patient is more/less likely to get infection, but then have no effect (or opposite effect) on outcome. We can model this relationship using time-dependent covariates.

Coding for Catheter Example

head(kidney)
##   time delta type
## 1  1.5     1    0
## 2  3.5     1    0
## 3  4.5     1    0
## 4  4.5     1    0
## 5  5.5     1    0
## 6  8.5     1    0
tau <- 3.5

ksub1 <- kidney[kidney$time <= tau,]
ksub1 <- data.frame(start=0,stop=ksub1$time,del=ksub1$delta,z2=0,z3=ksub1$type)

ksub2 <- kidney[kidney$time > tau,]
ksub2p <- data.frame(start=rep(0,nrow(ksub2)),stop=tau,
                     del=0,z2=0,z3=ksub2$type)
ksub2 <- data.frame(start=tau,stop=ksub2$time,
                    del=ksub2$delta,z2=ksub2$type,z3=0)

dat <- rbind(ksub1,ksub2p,ksub2)

fit <-coxph(Surv(start,stop,del)~z2+z3,data=dat,method="breslow")
summary(fit)
## Call:
## coxph(formula = Surv(start, stop, del) ~ z2 + z3, data = dat, 
##     method = "breslow")
## 
##   n= 198, number of events= 26 
## 
##       coef exp(coef) se(coef)      z Pr(>|z|)   
## z2 -2.0889    0.1238   0.7597 -2.750  0.00597 **
## z3  1.0818    2.9498   0.7832  1.381  0.16722   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##    exp(coef) exp(-coef) lower .95 upper .95
## z2    0.1238      8.076   0.02793    0.5489
## z3    2.9498      0.339   0.63553   13.6918
## 
## Concordance= 0.651  (se = 0.041 )
## Rsquare= 0.068   (max possible= 0.652 )
## Likelihood ratio test= 13.9  on 2 df,   p=0.001
## Wald test            = 9.47  on 2 df,   p=0.009
## Score (logrank) test = 12.83  on 2 df,   p=0.002

The above results match Table 9.6 in text.