PROC PHREG: Time-Dependent Covariates.                  notes n54703.016.5

     In a survival analysis, a person's risk of death at time t is usually
modeled as a function of baseline covariates, i.e. variables which are measured
at time t = 0.  However, the person's risk status may change as a function
of time.  A person may not be taking a drug (say, Vioxx) at baseline, but
may begin taking it later on.  Whether the person is taking the drug or not
is *time dependent*.  The dose of the drug is also *time dependent*.

     If you want to study the effect of a drug on risk of death, you will
obtain a more powerful analysis if you can incorporate the information regarding
whether the person was using the drug just before death into a time-dependent
covariate in PROC PHREG.

     Because of the need to accommodate time-dependent covariates, PROC PHREG
has a feature that most other SAS procedures do not have.  It is possible to
do some programming within the procedure itself.

     This is illustrated in the analysis below.  Here the objective is to
examine predictors of death in Lung Health Study participants.  The participants
are classified into two strata, by gender.  The first analysis is an ordinary
PROC LIFETEST, with the objective of producing a life-table graph (using PROC
GPLOT).  This analysis is stratified by gender.   The next analysis is carried
out using PROC PHREG, again with stratification by gender.  The outcome variable
is time to death (or censoring) within 5 years of entering the study: folltim5.
Only fixed (non-time-dependent) covariates are entered into this analysis.

     The third analysis uses PROC PHREG with one time-dependent covariate:
smoke.  This is the person's smoking status as recorded at the most recent
annual visit at or before the followup time in question.  smoke = 0 means not
smoking, while smoke = 1 means that the person is smoking.  Below is the SAS
code for these analyses, followed by the output.

     The following is a link to the survival graph:

     http://www.biostat.umn.edu/~john-c/5421/lhssurv.grf

=========================================================================================

options linesize = 80 ;

proc lifetest data = smoke  outsurv = surcurve notable ;
     time folltim5 * anydth5(0) ;
strata gender ;
test age f10cigs s2fevpos ;
title1 'Proc Lifetest: Lung Health Study Data on Survival, First 5 years' ;
title2 'Event: Death any cause.  Stratifying variable: gender' ;
title3 'Risk factors: age, f10cigs, s2fevpos' ;
run ;

data surcurve ;
     set surcurve ;
     if _censor_ ne 0 then delete ;
     s = survival ;
run ;

symbol1 i = steplj v = none c = black l = 1 w = 2 ;
symbol2 i = steplj v = none c = grey  l = 1 w = 2 ;

 PROC GPLOT DATA = SURCURVE ;
      PLOT SURVIVAL * FOLLTIM5 = GENDER /
           HAXIS = AXIS1  VAXIS = AXIS2 ;
           AXIS1 VALUE = (F = SWISSB H = 3)
             ORDER = 0 TO 6 BY 1
             LABEL = (H = 3 'YEAR OF FOLLOWUP') ;
           AXIS2 VALUE = (F = SWISS  H = 3)
             ORDER = .95 TO 1 BY .01
             LABEL = (A = 90 H = 3 'Proportion Surviving') ;
 TITLE1 H = 3.0 'Lung Health Study: Survival Through 5 Years' ;
 TITLE2 H = 3.0 'By Gender' ;
 format gender gender. ;
 RUN ;

proc phreg data = smoke ;
     model folltim5 * anydth5(0) = age f10cigs s2fevpos / rl ;
strata gender ;
title1 'PROC PHREG: Lung Health Study Data on Survival, First 5 years' ;
title2 'Event: Death any cause.  Stratifying variable: gender' ;
title3 'Risk factors: age, f10cigs, s2fevpos' ;
run ;

proc phreg data = smoke ;
     model folltim5 * anydth5(0) = age smoke s2fevpos / rl ;
strata gender ;
title1 'PROC PHREG: Lung Health Study Data on Survival, First 5 years' ;
title2 'Event: Death any cause.  Stratifying variable: gender' ;
title3 'Risk factors: age, f10cigs, s2fevpos' ;
title4 '*****     Time-dependent Covariate: smoke     *****' ;

*  The following section creates the time dependent variable smoke. ;

smoke = 1 ;

if folltim5 ge visa1yr and visa1yr ne . then do ;
   if vpcquit1 ne . then smoke = 1 - vpcquit1 ; end ;

if folltim5 ge visa2yr and visa2yr ne . then do ;
   if vpcquit2 ne . then smoke = 1 - vpcquit2 ; end ;

if folltim5 ge visa3yr and visa3yr ne . then do ;
   if vpcquit3 ne . then smoke = 1 - vpcquit3 ; end ;

if folltim5 ge visa4yr and visa4yr ne . then do ;
   if vpcquit4 ne . then smoke = 1 - vpcquit4 ; end ;

if folltim5 ge visa5yr and visa5yr ne . then do ;
   if vpcquit5 ne . then smoke = 1 - vpcquit5 ; end ;

run;

endsas ;
=========================================================================================

        Proc Lifetest: Lung Health Study Data on Survival, First 5 years       1
             Event: Death any cause.  Stratifying variable: gender
                      Risk factors: age, f10cigs, s2fevpos
                                                 20:03 Wednesday, April 13, 2005

                             The LIFETEST Procedure

            Summary of the Number of Censored and Uncensored Values

              GENDER      Total     Failed   Censored  %Censored

              0            3702        101       3601    97.2717
              1            2185         48       2137    97.8032

              Total        5887        149       5738    97.4690
 
 
-------------------------------------------------------------------------------------
              LUNG HEALTH STUDY :  WBJEC5.SAS (JEC) 13APR05 20:03
        Proc Lifetest: Lung Health Study Data on Survival, First 5 years       2
             Event: Death any cause.  Stratifying variable: gender
                      Risk factors: age, f10cigs, s2fevpos
                                                 20:03 Wednesday, April 13, 2005

                             The LIFETEST Procedure

               Testing Homogeneity of Survival Curves over Strata
                             Time Variable FOLLTIM5


                                Rank Statistics

                       GENDER       Log-Rank     Wilcoxon

                       0              7.4228        43305
                       1             -7.4228       -43305


                 Covariance Matrix for the Log-Rank Statistics

                      GENDER               0             1

                      0              34.8067      -34.8067
                      1             -34.8067       34.8067


                 Covariance Matrix for the Wilcoxon Statistics

                      GENDER               0             1

                      0             1.1762E9      -1.176E9
                      1             -1.176E9      1.1762E9


                          Test of Equality over Strata

                                                    Pr >
                     Test      Chi-Square    DF  Chi-Square

                     Log-Rank      1.5830     1      0.2083
                     Wilcoxon      1.5943     1      0.2067
                     -2Log(LR)     1.6024     1      0.2056

-------------------------------------------------------------------------------------
 
              LUNG HEALTH STUDY :  WBJEC5.SAS (JEC) 13APR05 20:03
        Proc Lifetest: Lung Health Study Data on Survival, First 5 years       3
             Event: Death any cause.  Stratifying variable: gender
                      Risk factors: age, f10cigs, s2fevpos
                                                 20:03 Wednesday, April 13, 2005

                             The LIFETEST Procedure

           Rank Tests for the Association of FOLLTIM5 with Covariates
                               Pooled over Strata


                  Univariate Chi-Squares for the WILCOXON Test

             Test     Standard                 Pr >
Variable  Statistic  Deviation  Chi-Square  Chi-Square  Label

AGE          -539.1    82.4450    42.7592     0.0001    AGE AT ENTRY INTO LHS   
F10CIGS    -66.3746      154.2     0.1854     0.6668    CIGS PER DAY AT SCREEN 1
S2FEVPOS    25.9902     5.4617    22.6450     0.0001    FEV1 POST-BD SCREEN 2   



                 Covariance Matrix for the WILCOXON Statistics

               Variable           AGE       F10CIGS      S2FEVPOS

               AGE             6797.2        -828.5        -221.4
               F10CIGS         -828.5       23765.9         -46.8
               S2FEVPOS        -221.4         -46.8          29.8


         Forward Stepwise Sequence of Chi-Squares for the WILCOXON Test

                                         Pr >       Chi-Square       Pr >
    Variable      DF    Chi-Square    Chi-Square     Increment    Increment
    Label

    AGE            1      42.7592       0.0001        42.7592       0.0001 
    AGE AT ENTRY INTO LHS   
    S2FEVPOS       2      45.9005       0.0001         3.1412       0.0763 
    FEV1 POST-BD SCREEN 2   
    F10CIGS        3      46.3676       0.0001         0.4671       0.4943 
    CIGS PER DAY AT SCREEN 1



                  Univariate Chi-Squares for the LOG RANK Test

             Test     Standard                 Pr >
Variable  Statistic  Deviation  Chi-Square  Chi-Square  Label

AGE          -547.2    83.4320    43.0106     0.0001    AGE AT ENTRY INTO LHS   
F10CIGS    -66.1891      156.2     0.1795     0.6718    CIGS PER DAY AT SCREEN 1
S2FEVPOS    26.4023     5.5231    22.8517     0.0001    FEV1 POST-BD SCREEN 2   

 
 
-------------------------------------------------------------------------------------
              LUNG HEALTH STUDY :  WBJEC5.SAS (JEC) 13APR05 20:03
        Proc Lifetest: Lung Health Study Data on Survival, First 5 years       4
             Event: Death any cause.  Stratifying variable: gender
                      Risk factors: age, f10cigs, s2fevpos
                                                 20:03 Wednesday, April 13, 2005

                             The LIFETEST Procedure

                 Covariance Matrix for the LOG RANK Statistics

               Variable           AGE       F10CIGS      S2FEVPOS

               AGE             6960.9        -854.1        -225.9
               F10CIGS         -854.1       24410.9         -47.9
               S2FEVPOS        -225.9         -47.9          30.5


         Forward Stepwise Sequence of Chi-Squares for the LOG RANK Test

                                         Pr >       Chi-Square       Pr >
    Variable      DF    Chi-Square    Chi-Square     Increment    Increment
    Label

    AGE            1      43.0106       0.0001        43.0106       0.0001 
    AGE AT ENTRY INTO LHS   
    S2FEVPOS       2      46.2380       0.0001         3.2274       0.0724 
    FEV1 POST-BD SCREEN 2   
    F10CIGS        3      46.6972       0.0001         0.4592       0.4980 
    CIGS PER DAY AT SCREEN 1

-------------------------------------------------------------------------------------
              LUNG HEALTH STUDY :  WBJEC5.SAS (JEC) 13APR05 20:03
         PROC PHREG: Lung Health Study Data on Survival, First 5 years         5
             Event: Death any cause.  Stratifying variable: gender
                      Risk factors: age, f10cigs, s2fevpos
                                                 20:03 Wednesday, April 13, 2005

                              The PHREG Procedure

     Data Set: WORK.SMOKE
     Dependent Variable: FOLLTIM5
     Censoring Variable: ANYDTH5   DEATH ANY CAUSE BY YEAR 5
     Censoring Value(s): 0  
     Ties Handling: BRESLOW 


              Summary of the Number of Event and Censored Values
 
                                                                  Percent
      Stratum    GENDER         Total       Event    Censored    Censored

            1    0               3701         101        3600       97.27
            2    1               2184          48        2136       97.80
      -------------------------------------------------------------------
        Total                    5885         149        5736       97.47


                     Testing Global Null Hypothesis: BETA=0
 
                   Without        With   
    Criterion    Covariates    Covariates    Model Chi-Square

    -2 LOG L       2394.048      2343.900      50.148 with 3 DF (p=0.0001)  
    Score              .             .         46.697 with 3 DF (p=0.0001)  
    Wald               .             .         44.211 with 3 DF (p=0.0001)  


                    Analysis of Maximum Likelihood Estimates
 
                         Parameter      Standard       Wald          Pr >   
    Variable    DF        Estimate        Error     Chi-Square    Chi-Square

    AGE          1        0.077332       0.01548      24.94053        0.0001
    F10CIGS      1        0.005427       0.00627       0.75003        0.3865
    S2FEVPOS     1       -0.383411       0.21296       3.24151        0.0718

-------------------------------------------------------------------------------------
 
              LUNG HEALTH STUDY :  WBJEC5.SAS (JEC) 13APR05 20:03
         PROC PHREG: Lung Health Study Data on Survival, First 5 years         6
             Event: Death any cause.  Stratifying variable: gender
                      Risk factors: age, f10cigs, s2fevpos
                                                 20:03 Wednesday, April 13, 2005

                              The PHREG Procedure

                    Analysis of Maximum Likelihood Estimates
 
                   Conditional Risk Ratio and
                      95% Confidence Limits
 
                    Risk
    Variable       Ratio       Lower       Upper    Label

    AGE            1.080       1.048       1.114    AGE AT ENTRY INTO LHS   
    F10CIGS        1.005       0.993       1.018    CIGS PER DAY AT SCREEN 1
    S2FEVPOS       0.682       0.449       1.035    FEV1 POST-BD SCREEN 2   
 
-------------------------------------------------------------------------------------
              LUNG HEALTH STUDY :  WBJEC5.SAS (JEC) 13APR05 20:03
         PROC PHREG: Lung Health Study Data on Survival, First 5 years         7
             Event: Death any cause.  Stratifying variable: gender
                      Risk factors: age, f10cigs, s2fevpos
              *****     Time-dependent Covariate: smoke     *****
                                                 20:03 Wednesday, April 13, 2005

                              The PHREG Procedure

     Data Set: WORK.SMOKE
     Dependent Variable: FOLLTIM5
     Censoring Variable: ANYDTH5   DEATH ANY CAUSE BY YEAR 5
     Censoring Value(s): 0  
     Ties Handling: BRESLOW 


              Summary of the Number of Event and Censored Values
 
                                                                  Percent
      Stratum    GENDER         Total       Event    Censored    Censored

            1    0               3701          89        3612       97.60
            2    1               2184          40        2144       98.17
      -------------------------------------------------------------------
        Total                    5885         129        5756       97.81


                     Testing Global Null Hypothesis: BETA=0
 
                   Without        With   
    Criterion    Covariates    Covariates    Model Chi-Square

    -2 LOG L       2066.120      2020.451      45.668 with 3 DF (p=0.0001)  
    Score              .             .         42.993 with 3 DF (p=0.0001)  
    Wald               .             .         40.937 with 3 DF (p=0.0001)  


                    Analysis of Maximum Likelihood Estimates
 
                         Parameter      Standard       Wald          Pr >   
    Variable    DF        Estimate        Error     Chi-Square    Chi-Square

    AGE          1        0.073518       0.01641      20.07685        0.0001
    SMOKE        1        0.386182       0.21113       3.34557        0.0674
    S2FEVPOS     1       -0.425088       0.22688       3.51060        0.0610

-------------------------------------------------------------------------------------
              LUNG HEALTH STUDY :  WBJEC5.SAS (JEC) 13APR05 20:03
         PROC PHREG: Lung Health Study Data on Survival, First 5 years         8
             Event: Death any cause.  Stratifying variable: gender
                      Risk factors: age, f10cigs, s2fevpos
              *****     Time-dependent Covariate: smoke     *****
                                                 20:03 Wednesday, April 13, 2005

                              The PHREG Procedure

                  Analysis of Maximum Likelihood Estimates
 
                   Conditional Risk Ratio and
                      95% Confidence Limits
 
                    Risk
    Variable       Ratio       Lower       Upper    Label

    AGE            1.076       1.042       1.111    AGE AT ENTRY INTO LHS
    SMOKE          1.471       0.973       2.226                         
    S2FEVPOS       0.654       0.419       1.020    FEV1 POST-BD SCREEN 2
 
 
              LUNG HEALTH STUDY :  WBJEC5.SAS (JEC) 13APR05 20:03

=========================================================================================

     Note the section of SAS code in the second PROC PHREG procedure where
the time-dependent covariate 'smoke' is defined:

-------------------------------------------------------------------------------------
*  The following section creates the time dependent variable smoke. ;

smoke = 1 ;

if folltim5 ge visa1yr and visa1yr ne . then do ;
   if vpcquit1 ne . then smoke = 1 - vpcquit1 ; end ;

if folltim5 ge visa2yr and visa2yr ne . then do ;
   if vpcquit2 ne . then smoke = 1 - vpcquit2 ; end ;

if folltim5 ge visa3yr and visa3yr ne . then do ;
   if vpcquit3 ne . then smoke = 1 - vpcquit3 ; end ;

if folltim5 ge visa4yr and visa4yr ne . then do ;
   if vpcquit4 ne . then smoke = 1 - vpcquit4 ; end ;

if folltim5 ge visa5yr and visa5yr ne . then do ;
   if vpcquit5 ne . then smoke = 1 - vpcquit5 ; end ;

run;

-------------------------------------------------------------------------------------

     What is being done in the program is the following.  At each event time,
PROC PHREG computes the time-dependent covariate 'smoke' for the person who
died and for all the other people who were still at risk at that time.  Within
these programming statements, folltim5 takes the value of the event time under
consideration, not each person's own final followup time.  The value of 'smoke'
is therefore the value that was measured at the most recent annual visit at or
before that event time.  The times of these annual visits are represented by
the variables visa1yr (time in years of annual visit 1), visa2yr, visa3yr, etc.
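
     To see how these statements behave as the event time advances, the
following DATA step sketch applies the same logic to one hypothetical
participant.  The data set name demo, the variable evtime (standing in for
the role folltim5 plays inside PROC PHREG), and all of the visit values are
made up for illustration; only three visits are shown to keep it short.

```sas
* Hypothetical participant: quit at visit 1, missed visit 2, ;
* smoking again at visit 3.  All values are invented.        ;
data demo ;
     visa1yr = 1.02 ; vpcquit1 = 1 ;
     visa2yr = .    ; vpcquit2 = . ;
     visa3yr = 3.10 ; vpcquit3 = 0 ;

     * evtime stands in for the event time that PHREG supplies ;
     do evtime = 0.5, 1.5, 2.5, 3.5 ;
        smoke = 1 ;
        if evtime ge visa1yr and visa1yr ne . then do ;
           if vpcquit1 ne . then smoke = 1 - vpcquit1 ; end ;
        if evtime ge visa2yr and visa2yr ne . then do ;
           if vpcquit2 ne . then smoke = 1 - vpcquit2 ; end ;
        if evtime ge visa3yr and visa3yr ne . then do ;
           if vpcquit3 ne . then smoke = 1 - vpcquit3 ; end ;
        output ;
     end ;
     keep evtime smoke ;
run ;

proc print data = demo noobs ;
run ;
```

     For this participant smoke is 1 at evtime 0.5 (no annual visit yet),
0 at 1.5 and 2.5 (quit at visit 1, with visit 2 missing), and 1 again at 3.5.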

     There are two subtleties to this that should be noted.  First, the variable
'smoke' is given the value '1' initially, because everyone in the Lung Health
Study was a smoker at the time of the first screening visit.  The value of
'smoke' is then reset to a new value at annual visits.  Second, if a given
annual visit time is missing (e.g., visa3yr = .), or the quit status at that
visit is missing (e.g., vpcquit3 = .), then the value of 'smoke' is not changed
from the value it had at the previous annual visit.  This is an instance of
*imputing* a value for a variable whose actual value is missing.  In this case
we are imputing the value that the variable 'smoke' had at the time of the
previous nonmissing annual visit (often called 'last observation carried
forward').
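
     The five nearly identical IF blocks can also be written as a loop.  The
following is a sketch only, assuming that your release of PROC PHREG accepts
ARRAY and iterative DO among its programming statements; the array names
vtime and vquit and the index k are introduced here and are not part of the
original program.  It is intended to carry the previous value forward in the
same way as the code above.

```sas
proc phreg data = smoke ;
     model folltim5 * anydth5(0) = age smoke s2fevpos / rl ;
     strata gender ;

     * vtime, vquit and k are illustrative names ;
     array vtime{5} visa1yr  visa2yr  visa3yr  visa4yr  visa5yr ;
     array vquit{5} vpcquit1 vpcquit2 vpcquit3 vpcquit4 vpcquit5 ;

     * everyone starts as a smoker ;
     smoke = 1 ;

     * a later nonmissing visit at or before the event time ;
     * overrides the earlier value; missing visits are skipped ;
     do k = 1 to 5 ;
        if vtime{k} ne . and folltim5 ge vtime{k} and vquit{k} ne . then
           smoke = 1 - vquit{k} ;
     end ;
run ;
```

     The explicit test vtime{k} ne . matters because SAS treats a missing
value as smaller than any number, so folltim5 ge vtime{k} alone would be
true when the visit time is missing.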

     Note that in this analysis, the hazard ratio for the variable 'smoke'
is estimated to be 1.47.  The fact that this estimate is larger than 1.00 is
consistent with the hypothesis that smoking increases the risk of death.
The effect does not quite attain the hallowed 0.05 level of statistical 
significance:  p = 0.0674.
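
     The hazard ratio, its confidence limits, and the p-value can all be
recovered from the parameter estimate and standard error for SMOKE printed
in the output above.  The DATA step below is a small sketch of that
arithmetic; the variable names are arbitrary.

```sas
data _null_ ;
     beta = 0.386182 ;          * parameter estimate for SMOKE ;
     se   = 0.21113 ;           * its standard error ;
     z    = probit(0.975) ;     * normal quantile, about 1.96 ;

     hr    = exp(beta) ;             * hazard (risk) ratio ;
     lower = exp(beta - z * se) ;    * lower 95 percent limit ;
     upper = exp(beta + z * se) ;    * upper 95 percent limit ;
     wald  = (beta / se) ** 2 ;      * Wald chi-square, 1 DF ;
     p     = 1 - probchi(wald, 1) ;  * two-sided p-value ;

     put hr= 6.3 lower= 6.3 upper= 6.3 wald= 8.4 p= 6.4 ;
run ;
```

     The values written to the log should agree, up to rounding of the
printed standard error, with the risk ratio 1.471, the limits 0.973 and
2.226, and p = 0.0674 shown in the PHREG output.  Note that the 95%
confidence interval includes 1.00, which is another way of seeing that the
effect falls short of significance at the 0.05 level.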

=========================================================================================
n54703.016.5  Last update: April 13, 2005.