PROC PHREG: Time-Dependent Covariates. notes n54703.016.5
In a survival analysis, a person's risk of death at time T in general
is a function of covariates at baseline, i.e. variables which are measured
at time t = 0. However, the person's risk status may change over time. A person
may not be taking a drug (say, Vioxx) at baseline, but may begin taking it later
on. Whether the person is taking the drug or not is *time dependent*. The dose
of the drug is also *time dependent*.
If you want to study the effect of a drug on risk of death, you will
obtain a more powerful analysis if you can incorporate the information regarding
whether the person was using the drug just before death into a time-dependent
covariate in PROC PHREG.
Because of the need to accommodate time-dependent covariates, PROC PHREG
has a feature that most other SAS procedures do not have. It is possible to
do some programming within the procedure itself.
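As a minimal sketch of what such in-procedure programming can look like for the
drug example above: the data set name (mydata) and the variables time, death,
drugstart and ondrug are hypothetical and are not part of the Lung Health Study
data used in these notes.

```sas
* Hypothetical data set and variable names, for illustration only ;
* time      = follow-up time in years ;
* death     = 1 if the person died and 0 if censored ;
* drugstart = time the person began the drug and missing if never ;
proc phreg data = mydata ;
     model time * death(0) = ondrug ;
     * Programming statements that define the time-dependent covariate ;
     ondrug = 0 ;
     if drugstart ne . and time ge drugstart then ondrug = 1 ;
run ;
```

The statements after the MODEL statement are evaluated by the procedure each
time it needs the covariate value at an event time, so ondrug can differ from
one event time to the next for the same person.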
This is illustrated in the analysis below. Here the objective is to
examine predictors of death in Lung Health Study participants. The participants
are classified into two strata, by gender. The first analysis is an ordinary
PROC LIFETEST, with the objective of producing a life-table graph (using PROC
GPLOT). This analysis is stratified by gender. The next analysis is carried
out using PROC PHREG, again with stratification by gender. The outcome variable
is time to death (or censoring) within 5 years of entering the study: folltim5.
Only fixed (non-time-dependent) covariates are entered into this analysis.
The third analysis uses PROC PHREG with one time-dependent covariate:
smoke. This is the person's smoking status as recorded at the most recent annual
visit at or before the follow-up time being evaluated. smoke = 0 means not
smoking, while smoke = 1 means that the person is smoking. Below is the SAS code
for these analyses, followed by the output.
The following is a link to the survival graph:
http://www.biostat.umn.edu/~john-c/5421/lhssurv.grf
=========================================================================================
options linesize = 80 ;
proc lifetest data = smoke outsurv = surcurve notable ;
     time folltim5 * anydth5(0) ;
     strata gender ;
     test age f10cigs s2fevpos ;
     title1 'Proc Lifetest: Lung Health Study Data on Survival, First 5 years' ;
     title2 'Event: Death any cause. Stratifying variable: gender' ;
     title3 'Risk factors: age, f10cigs, s2fevpos' ;
run ;

data surcurve ;
     set surcurve ;
     if _censor_ ne 0 then delete ;
     s = survival ;
run ;

symbol1 i = steplj v = none c = black l = 1 w = 2 ;
symbol2 i = steplj v = none c = grey l = 1 w = 2 ;

PROC GPLOT DATA = SURCURVE ;
     PLOT SURVIVAL * FOLLTIM5 = GENDER /
          HAXIS = AXIS1 VAXIS = AXIS2 ;
     AXIS1 VALUE = (F = SWISSB H = 3)
           ORDER = 0 TO 6 BY 1
           LABEL = (H = 3 'YEAR OF FOLLOWUP') ;
     AXIS2 VALUE = (F = SWISS H = 3)
           ORDER = .95 TO 1 BY .01
           LABEL = (A = 90 H = 3 'Proportion Surviving') ;
     TITLE1 H = 3.0 'Lung Health Study: Survival Through 5 Years' ;
     TITLE2 H = 3.0 'By Gender' ;
     format gender gender. ;
RUN ;

proc phreg data = smoke ;
     model folltim5 * anydth5(0) = age f10cigs s2fevpos / rl ;
     strata gender ;
     title1 'PROC PHREG: Lung Health Study Data on Survival, First 5 years' ;
     title2 'Event: Death any cause. Stratifying variable: gender' ;
     title3 'Risk factors: age, f10cigs, s2fevpos' ;
run ;

proc phreg data = smoke ;
     model folltim5 * anydth5(0) = age smoke s2fevpos / rl ;
     strata gender ;
     title1 'PROC PHREG: Lung Health Study Data on Survival, First 5 years' ;
     title2 'Event: Death any cause. Stratifying variable: gender' ;
     title3 'Risk factors: age, f10cigs, s2fevpos' ;
     title4 '***** Time-dependent Covariate: smoke *****' ;
     * The following section creates the time-dependent variable smoke ;
     smoke = 1 ;
     if folltim5 ge visa1yr and visa1yr ne . then do ;
        if vpcquit1 ne . then smoke = 1 - vpcquit1 ;
     end ;
     if folltim5 ge visa2yr and visa2yr ne . then do ;
        if vpcquit2 ne . then smoke = 1 - vpcquit2 ;
     end ;
     if folltim5 ge visa3yr and visa3yr ne . then do ;
        if vpcquit3 ne . then smoke = 1 - vpcquit3 ;
     end ;
     if folltim5 ge visa4yr and visa4yr ne . then do ;
        if vpcquit4 ne . then smoke = 1 - vpcquit4 ;
     end ;
     if folltim5 ge visa5yr and visa5yr ne . then do ;
        if vpcquit5 ne . then smoke = 1 - vpcquit5 ;
     end ;
run ;
endsas ;
=========================================================================================
Proc Lifetest: Lung Health Study Data on Survival, First 5 years 1
Event: Death any cause. Stratifying variable: gender
Risk factors: age, f10cigs, s2fevpos
20:03 Wednesday, April 13, 2005
The LIFETEST Procedure
Summary of the Number of Censored and Uncensored Values

GENDER      Total    Failed   Censored   %Censored
0            3702       101       3601     97.2717
1            2185        48       2137     97.8032
Total        5887       149       5738     97.4690
-------------------------------------------------------------------------------------
LUNG HEALTH STUDY : WBJEC5.SAS (JEC) 13APR05 20:03
Testing Homogeneity of Survival Curves over Strata
Time Variable FOLLTIM5

Rank Statistics
GENDER       Log-Rank     Wilcoxon
0              7.4228        43305
1             -7.4228       -43305

Covariance Matrix for the Log-Rank Statistics
GENDER              0            1
0             34.8067     -34.8067
1            -34.8067      34.8067

Covariance Matrix for the Wilcoxon Statistics
GENDER              0            1
0            1.1762E9     -1.176E9
1            -1.176E9     1.1762E9

Test of Equality over Strata
Test         Chi-Square    DF    Pr > Chi-Square
Log-Rank         1.5830     1             0.2083
Wilcoxon         1.5943     1             0.2067
-2Log(LR)        1.6024     1             0.2056
Rank Tests for the Association of FOLLTIM5 with Covariates
Pooled over Strata

Univariate Chi-Squares for the WILCOXON Test
               Test    Standard                      Pr >
Variable  Statistic   Deviation  Chi-Square    Chi-Square  Label
AGE          -539.1     82.4450     42.7592        0.0001  AGE AT ENTRY INTO LHS
F10CIGS    -66.3746       154.2      0.1854        0.6668  CIGS PER DAY AT SCREEN 1
S2FEVPOS    25.9902      5.4617     22.6450        0.0001  FEV1 POST-BD SCREEN 2

Covariance Matrix for the WILCOXON Statistics
Variable         AGE     F10CIGS    S2FEVPOS
AGE           6797.2      -828.5      -221.4
F10CIGS       -828.5     23765.9       -46.8
S2FEVPOS      -221.4       -46.8        29.8

Forward Stepwise Sequence of Chi-Squares for the WILCOXON Test
                                  Pr >   Chi-Square        Pr >
Variable  DF  Chi-Square    Chi-Square    Increment   Increment  Label
AGE        1     42.7592        0.0001      42.7592      0.0001  AGE AT ENTRY INTO LHS
S2FEVPOS   2     45.9005        0.0001       3.1412      0.0763  FEV1 POST-BD SCREEN 2
F10CIGS    3     46.3676        0.0001       0.4671      0.4943  CIGS PER DAY AT SCREEN 1

Univariate Chi-Squares for the LOG RANK Test
               Test    Standard                      Pr >
Variable  Statistic   Deviation  Chi-Square    Chi-Square  Label
AGE          -547.2     83.4320     43.0106        0.0001  AGE AT ENTRY INTO LHS
F10CIGS    -66.1891       156.2      0.1795        0.6718  CIGS PER DAY AT SCREEN 1
S2FEVPOS    26.4023      5.5231     22.8517        0.0001  FEV1 POST-BD SCREEN 2
Covariance Matrix for the LOG RANK Statistics
Variable         AGE     F10CIGS    S2FEVPOS
AGE           6960.9      -854.1      -225.9
F10CIGS       -854.1     24410.9       -47.9
S2FEVPOS      -225.9       -47.9        30.5

Forward Stepwise Sequence of Chi-Squares for the LOG RANK Test
                                  Pr >   Chi-Square        Pr >
Variable  DF  Chi-Square    Chi-Square    Increment   Increment  Label
AGE        1     43.0106        0.0001      43.0106      0.0001  AGE AT ENTRY INTO LHS
S2FEVPOS   2     46.2380        0.0001       3.2274      0.0724  FEV1 POST-BD SCREEN 2
F10CIGS    3     46.6972        0.0001       0.4592      0.4980  CIGS PER DAY AT SCREEN 1
-------------------------------------------------------------------------------------
PROC PHREG: Lung Health Study Data on Survival, First 5 years 5
Event: Death any cause. Stratifying variable: gender
Risk factors: age, f10cigs, s2fevpos
20:03 Wednesday, April 13, 2005
The PHREG Procedure
Data Set: WORK.SMOKE
Dependent Variable: FOLLTIM5
Censoring Variable: ANYDTH5 DEATH ANY CAUSE BY YEAR 5
Censoring Value(s): 0
Ties Handling: BRESLOW
Summary of the Number of Event and Censored Values
                                                  Percent
Stratum   GENDER    Total    Event   Censored    Censored
1         0          3701      101       3600       97.27
2         1          2184       48       2136       97.80
-------------------------------------------------------------------
Total                5885      149       5736       97.47

Testing Global Null Hypothesis: BETA=0
                Without          With
Criterion    Covariates    Covariates    Model Chi-Square
-2 LOG L       2394.048      2343.900    50.148 with 3 DF (p=0.0001)
Score                 .             .    46.697 with 3 DF (p=0.0001)
Wald                  .             .    44.211 with 3 DF (p=0.0001)

Analysis of Maximum Likelihood Estimates
                Parameter    Standard         Wald          Pr >
Variable   DF    Estimate       Error   Chi-Square    Chi-Square
AGE         1    0.077332     0.01548     24.94053        0.0001
F10CIGS     1    0.005427     0.00627      0.75003        0.3865
S2FEVPOS    1   -0.383411     0.21296      3.24151        0.0718
Analysis of Maximum Likelihood Estimates
           Conditional Risk Ratio and
             95% Confidence Limits
            Risk
Variable   Ratio    Lower    Upper   Label
AGE        1.080    1.048    1.114   AGE AT ENTRY INTO LHS
F10CIGS    1.005    0.993    1.018   CIGS PER DAY AT SCREEN 1
S2FEVPOS   0.682    0.449    1.035   FEV1 POST-BD SCREEN 2
-------------------------------------------------------------------------------------
PROC PHREG: Lung Health Study Data on Survival, First 5 years 7
Event: Death any cause. Stratifying variable: gender
Risk factors: age, f10cigs, s2fevpos
***** Time-dependent Covariate: smoke *****
20:03 Wednesday, April 13, 2005
The PHREG Procedure
Data Set: WORK.SMOKE
Dependent Variable: FOLLTIM5
Censoring Variable: ANYDTH5 DEATH ANY CAUSE BY YEAR 5
Censoring Value(s): 0
Ties Handling: BRESLOW
Summary of the Number of Event and Censored Values
                                                  Percent
Stratum   GENDER    Total    Event   Censored    Censored
1         0          3701       89       3612       97.60
2         1          2184       40       2144       98.17
-------------------------------------------------------------------
Total                5885      129       5756       97.81

Testing Global Null Hypothesis: BETA=0
                Without          With
Criterion    Covariates    Covariates    Model Chi-Square
-2 LOG L       2066.120      2020.451    45.668 with 3 DF (p=0.0001)
Score                 .             .    42.993 with 3 DF (p=0.0001)
Wald                  .             .    40.937 with 3 DF (p=0.0001)

Analysis of Maximum Likelihood Estimates
                Parameter    Standard         Wald          Pr >
Variable   DF    Estimate       Error   Chi-Square    Chi-Square
AGE         1    0.073518     0.01641     20.07685        0.0001
SMOKE       1    0.386182     0.21113      3.34557        0.0674
S2FEVPOS    1   -0.425088     0.22688      3.51060        0.0610
Analysis of Maximum Likelihood Estimates
           Conditional Risk Ratio and
             95% Confidence Limits
            Risk
Variable   Ratio    Lower    Upper   Label
AGE        1.076    1.042    1.111   AGE AT ENTRY INTO LHS
SMOKE      1.471    0.973    2.226
S2FEVPOS   0.654    0.419    1.020   FEV1 POST-BD SCREEN 2
=========================================================================================
Note the section of SAS code in the second PROC PHREG procedure where
the time-dependent covariate 'smoke' is defined:
-------------------------------------------------------------------------------------
     * The following section creates the time-dependent variable smoke ;
     smoke = 1 ;
     if folltim5 ge visa1yr and visa1yr ne . then do ;
        if vpcquit1 ne . then smoke = 1 - vpcquit1 ;
     end ;
     if folltim5 ge visa2yr and visa2yr ne . then do ;
        if vpcquit2 ne . then smoke = 1 - vpcquit2 ;
     end ;
     if folltim5 ge visa3yr and visa3yr ne . then do ;
        if vpcquit3 ne . then smoke = 1 - vpcquit3 ;
     end ;
     if folltim5 ge visa4yr and visa4yr ne . then do ;
        if vpcquit4 ne . then smoke = 1 - vpcquit4 ;
     end ;
     if folltim5 ge visa5yr and visa5yr ne . then do ;
        if vpcquit5 ne . then smoke = 1 - vpcquit5 ;
     end ;
run ;
-------------------------------------------------------------------------------------
What the program does is the following. PROC PHREG executes these programming
statements at each event time, both for the person who died and for every other
person who was still at risk at that time. Within the statements, folltim5 is the
event time being processed, not the person's own final follow-up time. The value
of 'smoke' is therefore the value that was measured at the most recent annual
visit on or before that event time. The times of these annual visits are
represented by the variables visa1yr (time in years of annual visit 1), visa2yr,
visa3yr, etc.
There are two subtleties to this that should be noted. First, the variable
'smoke' is given the value 1 initially, because everyone in the Lung Health Study
was a smoker at the time of the first screening visit. The value of 'smoke' is
then reset to a new value at each annual visit that has occurred by the event
time. Second, if a given annual visit time is missing (e.g., visa3yr = .), or if
the quit status recorded at that visit (e.g., vpcquit3) is missing, then the
value of 'smoke' is not changed from the value it had at the previous annual
visit. This is an instance of *imputing* a value for a variable whose actual
value is missing. In this case we are imputing the value that the variable
'smoke' had at the time of the previous nonmissing annual visit.
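The five IF blocks can also be written more compactly. The following is a
sketch only, assuming that the ARRAY and iterative DO statements are available
among the programming statements of your release of PROC PHREG. The array names
visit and quit and the index k are introduced here for illustration and do not
appear in the original program.

```sas
proc phreg data = smoke ;
     model folltim5 * anydth5(0) = age smoke s2fevpos / rl ;
     strata gender ;
     * Illustrative array names, not part of the original program ;
     array visit{5} visa1yr visa2yr visa3yr visa4yr visa5yr ;
     array quit{5}  vpcquit1 vpcquit2 vpcquit3 vpcquit4 vpcquit5 ;
     * Everyone starts as a smoker ;
     smoke = 1 ;
     * Update smoke at each visit that has occurred by the event time ;
     * A missing visit time or quit status leaves smoke unchanged ;
     do k = 1 to 5 ;
        if visit{k} ne . and folltim5 ge visit{k} and quit{k} ne .
           then smoke = 1 - quit{k} ;
     end ;
run ;
```

The explicit test visit{k} ne . matters because SAS treats a missing value as
smaller than any number, so the comparison folltim5 ge visit{k} alone would be
true for a missing visit time.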
Note that in this analysis, the hazard ratio for the variable 'smoke'
is estimated to be 1.47. The fact that this estimate is larger than 1.00 is
consistent with the hypothesis that smoking increases the risk of death.
The effect does not quite attain the hallowed 0.05 level of statistical
significance: p = 0.0674.
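The hazard ratio, its 95% confidence limits, and the p-value can be recovered
from the parameter estimate and standard error for SMOKE in the output above.
The following DATA step is a small sketch of that arithmetic. The variable names
are chosen here for illustration, and 1.96 is the usual normal quantile for a
95% interval.

```sas
data _null_ ;
     beta = 0.386182 ;   * parameter estimate for SMOKE from the output ;
     se   = 0.21113 ;    * standard error of that estimate ;
     hr    = exp(beta) ;                 * hazard ratio ;
     lower = exp(beta - 1.96 * se) ;     * lower 95 percent limit ;
     upper = exp(beta + 1.96 * se) ;     * upper 95 percent limit ;
     wald  = (beta / se) ** 2 ;          * Wald chi-square with 1 DF ;
     p     = 1 - probchi(wald, 1) ;      * p-value for the Wald test ;
     put hr= lower= upper= wald= p= ;
run ;
```

Up to rounding, the values written to the log should agree with the risk ratio
1.471, the limits 0.973 and 2.226, and the p-value 0.0674 reported by PROC
PHREG. The confidence interval includes 1.00, which is the same conclusion as
p > 0.05.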
=========================================================================================
n54703.016.5 Last update: April 13, 2005.