week4_3assumption_NOT

get data

library(survival)
library(car)
g <- read.csv("GMPH_1K_final_2.csv", header = TRUE, sep = ",")
g %>% dim()

## [1] 1000   31

# g %>% head()
g$gender <- as.factor(g$gender)
g$copd <- as.factor(g$copd)
g$ethnicgroup <- as.factor(g$ethnicgroup)
levels(g$ethnicgroup) <- c(levels(g$ethnicgroup), "8")
g$ethnicgroup[is.na(g$ethnicgroup)] <- "8"
table(g$ethnicgroup, exclude = NULL)

## 
##   1   2   3   9   8 
## 889  17  34  17  43

# set quintile 0 to NAs, coxph will ignore these patients
g <- g %>% mutate(quintile = recode(quintile, "c('0')=NA"))
table(g$quintile, exclude = NULL)

## 
##    1    2    3    4    5 <NA> 
##  138  205  211  220  216   10

g$quintile <- as.factor(g$quintile)

what does NON-PH mean?

example: gender
- if relation btw male vs female risk of death changes over time,
  - it could mean that males have higher risk of death early on,
  - then relation changes, and females have higher risk of death later on
- there is interaction btw gender and time
  - { can you have interaction btw variable and OUTCOME??? }
- all you need to do is to add an interaction term btw gender and time {…}
  - { need to find a dataset with actual interaction,
  - and plot out KM curves with and without interaction and see for myself… }
testing if interaction term is statistically significant is another way of testing PH assumption
- if interaction term not stat sig, then PH assumption is valid
- should get p-value and plots
for mathematically reasons, cannot just include follow-up time itself,
- need to transform it with tt() time-transform function

# model with just gender
cox <- coxph(Surv(fu_time, death) ~ gender, data = g)
summary(cox)

## Call:
## coxph(formula = Surv(fu_time, death) ~ gender, data = g)
## 
##   n= 1000, number of events= 492 
## 
##            coef exp(coef) se(coef)     z Pr(>|z|)
## gender2 0.02571   1.02604  0.09082 0.283    0.777
## 
##         exp(coef) exp(-coef) lower .95 upper .95
## gender2     1.026     0.9746    0.8587     1.226
## 
## Concordance= 0.497  (se = 0.013 )
## Likelihood ratio test= 0.08  on 1 df,   p=0.8
## Wald test            = 0.08  on 1 df,   p=0.8
## Score (logrank) test = 0.08  on 1 df,   p=0.8

# model with gender and gender * fu_time
cox_int <- coxph(Surv(fu_time, death) ~ gender + tt(gender), data = g)
summary(cox_int)

## Call:
## coxph(formula = Surv(fu_time, death) ~ gender + tt(gender), data = g)
## 
##   n= 1000, number of events= 492 
## 
##               coef exp(coef) se(coef)      z Pr(>|z|)
## gender2     0.6405    1.8974   0.9800  0.654    0.513
## tt(gender) -0.2003    0.8185   0.3182 -0.629    0.529
## 
##            exp(coef) exp(-coef) lower .95 upper .95
## gender2       1.8974      0.527    0.2779    12.953
## tt(gender)    0.8185      1.222    0.4387     1.527
## 
## Concordance= 0.497  (se = 0.013 )
## Likelihood ratio test= 0.49  on 2 df,   p=0.8
## Wald test            = 0.48  on 2 df,   p=0.8
## Score (logrank) test = 0.48  on 2 df,   p=0.8

p-value of 0.529 for tt(gender) is not statistically significant, good news,
PH assumption is valid, no interaction between variable and time
i.e. time does NOT affect gender’s impact on risk of death

scenario 2

if p-value is low, but hazards looks proportional for most the the follow-up period
can divide survival analysis into 2 time periods
- fit one model without interaction for when PH is valid
- fit another model with interaction for when PH is invalid

scenario 3

stratify by variable
e.g. separate models for male and females
disadvantage = can no longer assess effect of gender on mortality

week4_3assumption_NOT_met

Hippo

2020_05_25_13

get data

what does NON-PH mean?

scenario 2

scenario 3

EOF