SURVIVAL ANALYSIS, II: PROC LIFETEST, contin.                    n54703.015

     Plots are an important part of life-table analysis.  The line-printer plot
given in n54703.014 does not adequately portray the survival curve.  Higher-
resolution plots are needed, and these can be obtained with SAS/GRAPH
or other graphics packages.  The following program produces a higher-resolution
survival plot for the Minnesota Heart Survey data, by gender:

=================================================================================

 [lines omitted ...]
		
********************************** END DATA STEP ****************************;

proc print data = heart ;
     where dead eq 1 ;
     var id sex age dthdate cause censor follyrs ;
title1 'List of deaths in the Minnesota Heart Survey' ;


proc lifetest data = heart outsurv = surcurve ;
     time follyrs * censor(1) ;
strata sex ;
title1 'PROC LIFETEST analysis of MWHEART data, by Sex' ;

data surcurve ;
     set surcurve ;
     if _censor_ ne 0 then delete ;

proc print data = surcurve ;
title1 'Print of output file from PROC LIFETEST.' ;

symbol1 i = steplj v = none c = black l = 1 ;
symbol2 i = steplj v = none c = black l = 2 ;

proc gplot data = surcurve ;
     plot survival * follyrs = sex / haxis = axis1 vaxis = axis2 ;
axis1 value = (f = swissb h = 2) order = 0 to 11 by 1
      label = (h = 2 'Followup Time in Years') ;
axis2 value = (f = swissb h = 2) order = .9 to 1.0 by .1
      label = (a = 90 h = 2 'Proportion Surviving') ;
title1 h = 2 'PROC LIFETEST Analysis of MWHEART data, by Gender' ;
title2 h = 2 'Survival Proportions versus Sex' ;
run ;

endsas ;

=================================================================================

     The resulting plot can be seen at:

      http://www.biostat.umn.edu/~john-c/mwheart.gplot
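
     In more recent SAS releases (9.2 and later), PROC LIFETEST can also draw
a high-resolution survival plot directly through ODS Graphics, without a
separate PROC GPLOT step.  The following is a minimal sketch, assuming the same
HEART data set and variables as in the program above.  It is not the program
that produced the plot at the link.

```sas
ods graphics on ;

* CENSOR = 1 marks a censored followup time, as in the program above ;
proc lifetest data = heart plots = survival ;
     time follyrs * censor(1) ;
     strata sex ;
title1 'PROC LIFETEST analysis of MWHEART data, by Sex' ;
run ;

ods graphics off ;
```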


     In some cases it is also useful to examine other plots: for example, the
plot of the hazard function or of the log-survival function.

     The hazard function gives the "instantaneous risk rate" as a function of
time.  The units of the hazard function are events per unit time.  For some
survival studies, the hazard function is approximately constant.  This would
be approximately true if the study involved middle-aged people being followed
for a short time-interval (e.g., 2 months).  If people are followed for longer
time intervals, their hazard increases because of their increasing age.  In
surgical studies, the person's hazard is high for a period of time immediately
after the surgery and then decreases.  In leukemia studies, a person's hazard
is extremely high during the period that the bone-marrow is being ablated by
chemotherapeutic drugs.  Then it decreases as the bone-marrow recovers.  Then
it increases again as the person's chances of relapse increase.
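
     In symbols, the hazard function can be written as follows.  This is the
standard textbook definition, stated here for reference.  The notation is
introduced here and is not from the programs above: T is the time to the event,
S(t) = P(T > t) is the survival function, and f(t) is the density of T.

```latex
h(t) \;=\; \lim_{\Delta t \to 0^{+}}
        \frac{P(t \le T < t + \Delta t \mid T \ge t)}{\Delta t}
     \;=\; \frac{f(t)}{S(t)}
     \;=\; -\frac{d}{dt}\,\log S(t)
```

     For a small time-interval of length Delta t, h(t) times Delta t is
approximately the probability that a person who is still event-free at time t
has the event during the next Delta t units of time.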

     The hazard function for men in the MRFIT study increased as a function of
time because the followup times were quite long, and there was substantial
aging during followup.  This can be seen in the following plot:

     http://www.biostat.umn.edu/~john-c/mrfit.hazard

     This plot shows an increasing hazard at first, then an apparent decrease
after about 14 years of followup.  The decrease is an artifact: not many MRFIT
men were followed up to that time-point in this analysis, so the hazard
estimates after that time are based on small numbers and are not reliable.

     This plot also indicates a considerably higher hazard rate for MRFIT men
whose FEV1 as % of predicted was below 85% than for men whose FEV1 % predicted
was 85% or above.  That is, having a low FEV1 % predicted was a risk factor for
death.  It is likely that this effect is due in part to the fact that the men
with low FEV1 % predicted tended to be long-term smokers.


     Another useful plot is the plot of log-survival versus time.  An example
of this kind of plot is given at:

     http://www.biostat.umn.edu/~john-c/mrfit.logsurv

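     If the hazard is constant, the graph of log survival as a function of time
should be close to a straight line.  (The LS plot from PROC LIFETEST displays
minus the log of the estimated survival function, so the curve rises rather
than falls.)  The graph referred to above, however, shows a slope that increases
with time in both subgroups (defined as above by FEV1 % predicted categories),
which is consistent with a hazard that increases over followup.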
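
     The reason for the straight line can be sketched as follows, under the
assumption of a constant hazard.  The symbol lambda for the constant hazard is
introduced here for illustration only.

```latex
S(t) \;=\; \exp\!\Big( -\int_0^t h(u)\, du \Big)
\qquad\Longrightarrow\qquad
h(u) \equiv \lambda : \quad S(t) = e^{-\lambda t},
\qquad -\log S(t) = \lambda t
```

     For example, with lambda = 0.01 deaths per person-year, S(10) = exp(-0.1),
or about 0.905.  If instead the hazard increases with time, the graph of
-log S(t) bends upward rather than following a straight line.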

     The SAS code which produced these graphs was the following:

=================================================================================

 [fragment of program ...]

 YEARSDTH = (DTHDATE - RDATE) / 365.25 ;         *  Definition of death time   ;
 DEATH = 0 ;
 IF YEARSDTH GT 0 THEN DEATH = 1 ;               *  Definition of death var.   ;
 LDATE    = MDY(12, 31, 90) ;                    *  Last followup date         ;
 FOLLOW   = (LDATE - RDATE) / 365.25 ;           *  Definition of followup time;
 IF YEARSDTH GT 0 THEN FOLLOW = YEARSDTH ;       *  Re-defin. of followup time ;

[end of fragment ...]

*=================================================================== ;

 SYMBOL1  C = BLACK  I = STEPLJ L = 1 V = NONE WIDTH = 1  H = 1.5 ;
 SYMBOL2  C = GRAY   I = STEPLJ L = 2 V = NONE WIDTH = 1  H = 1.5 ;

 PROC LIFETEST DATA = SEL METHOD = LT PLOTS = (S, LS, H) OUTSURV = SURCURVE NOTABLE ;
      TIME FOLLOW * DEATH(0) ;
      STRATA MFEV13AN ;
 TITLE1 H = 2 'MRFIT DEATHS: 3 TO 15.8 YEARS FOLLOWUP';
 TITLE2 H = 2
 'BY YEAR 3 FEV1 % PREDICTED CATEGORY: < 85 % VS >= 85%' ;
 TITLE3 H = 2 'NHANES USED FOR PREDICTED' ;
 FORMAT MFEV13AN NOYES01X. ;
                                                                                
 ENDSAS ;

=================================================================================

     Note that here an observation is treated as censored when DEATH = 0, the
value given in parentheses in the TIME statement.
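
     The value in parentheses in the TIME statement is the value of the status
variable that marks a censored observation.  The two programs in these notes
use opposite codings, which is worth checking whenever code is reused.  A
side-by-side sketch, using only the data set and variable names from the
programs above:

```sas
* MWHEART: CENSOR = 1 marks a censored followup time ;
proc lifetest data = heart ;
     time follyrs * censor(1) ;
run ;

* MRFIT: DEATH = 0 marks a censored followup time and DEATH = 1 a death ;
proc lifetest data = sel ;
     time follow * death(0) ;
run ;
```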

     The option which produces the survival, log-survival, and hazard plots is
in the PROC LIFETEST statement:

        PLOTS = (S, LS, H)

     Note that 'S' denotes the survival curve (not shown) and 'LS' the log
survival curve, and 'H' the hazard curve, as described above and shown in the
links.

------------------------------------------------------------------------

TESTING FOR THE EFFECTS OF COVARIATES

     PROC LIFETEST allows for tests of the effects of covariates on survival.
These are essentially nonparametric tests.  In the Minnesota Heart Survey data,
if you want to study the effects of age, systolic blood pressure, and smoking
status on survival, while carrying out an analysis which is stratified by
sex, you would modify the PROC LIFETEST step as follows.  The output follows
the program.

------------------------------------------------------------------------
proc lifetest data = heart outsurv = surcurve notable ;
     time follyrs * censor(1) ;
strata sex ;
test age smoke sbpav2 ;
title1 'PROC LIFETEST analysis of MWHEART data, by Sex' ;
title2 'Tests for the effects of age, smoke and sbpav2, stratified by sex' ;
------------------------------------------------------------------------

                 PROC LIFETEST analysis of MWHEART data, by Sex                1
                                                  20:07 Wednesday, April 6, 2005

                             The LIFETEST Procedure

            Summary of the Number of Censored and Uncensored Values

              SEX         Total     Failed   Censored  %Censored

              F             263          7        256    97.3384
              M             221         16        205    92.7602

              Total         484         23        461    95.2479

NOTE: There were 1 observations with missing values, negative time values or 
      frequency values less than 1.


                 PROC LIFETEST analysis of MWHEART data, by Sex                2
                                                  20:07 Wednesday, April 6, 2005

                             The LIFETEST Procedure

               Testing Homogeneity of Survival Curves over Strata
                             Time Variable FOLLYRS


                                Rank Statistics

                       SEX          Log-Rank     Wilcoxon

                       F             -5.5527      -2550.0
                       M              5.5527       2550.0


                 Covariance Matrix for the Log-Rank Statistics

                      SEX                  F             M

                      F              5.69866      -5.69866
                      M             -5.69866       5.69866


                 Covariance Matrix for the Wilcoxon Statistics

                      SEX                  F             M

                      F              1252658      -1252658
                      M             -1252658       1252658


                          Test of Equality over Strata

                                                    Pr >
                     Test      Chi-Square    DF  Chi-Square

                     Log-Rank      5.4105     1      0.0200
                     Wilcoxon      5.1910     1      0.0227
                     -2Log(LR)     5.4513     1      0.0196


                 PROC LIFETEST analysis of MWHEART data, by Sex                3
                                                  20:07 Wednesday, April 6, 2005

                             The LIFETEST Procedure

           Rank Tests for the Association of FOLLYRS with Covariates
                               Pooled over Strata


                  Univariate Chi-Squares for the WILCOXON Test

                        Test       Standard                     Pr >
         Variable    Statistic    Deviation    Chi-Square    Chi-Square
         Label

         AGE            -348.8      57.6615      36.6019       0.0001  
         age at entry                            
         SMOKE          5.9256       3.6641       2.6153       0.1058  
         smoking status                          
         SBPAV2         -326.7      72.5719      20.2612       0.0001  
         Systolic blood pressure (mmHg) - average



                 Covariance Matrix for the WILCOXON Statistics

               Variable           AGE         SMOKE        SBPAV2

               AGE            3324.85          6.10       2111.42
               SMOKE             6.10         13.43          3.47
               SBPAV2         2111.42          3.47       5266.68


         Forward Stepwise Sequence of Chi-Squares for the WILCOXON Test

                                         Pr >       Chi-Square       Pr >
    Variable      DF    Chi-Square    Chi-Square     Increment    Increment
    Label

    AGE            1      36.6019       0.0001        36.6019       0.0001 
    age at entry                            
    SMOKE          2      39.8157       0.0001         3.2138       0.0730 
    smoking status                          
    SBPAV2         3      42.6204       0.0001         2.8047       0.0940 
    Systolic blood pressure (mmHg) - average



                  Univariate Chi-Squares for the LOG RANK Test

                        Test       Standard                     Pr >
         Variable    Statistic    Deviation    Chi-Square    Chi-Square
         Label

         AGE            -359.5      58.0440      38.3608       0.0001  
         age at entry                            
         SMOKE          6.0744       3.8223       2.5256       0.1120  
         smoking status                          
         SBPAV2         -332.6      73.0190      20.7536       0.0001  
         Systolic blood pressure (mmHg) - average
                 PROC LIFETEST analysis of MWHEART data, by Sex                4
                                                  20:07 Wednesday, April 6, 2005

                             The LIFETEST Procedure

                 Covariance Matrix for the LOG RANK Statistics

               Variable           AGE         SMOKE        SBPAV2

               AGE            3369.10          6.34       2085.05
               SMOKE             6.34         14.61          4.08
               SBPAV2         2085.05          4.08       5331.78


         Forward Stepwise Sequence of Chi-Squares for the LOG RANK Test

                                         Pr >       Chi-Square       Pr >
    Variable      DF    Chi-Square    Chi-Square     Increment    Increment
    Label

    AGE            1      38.3608       0.0001        38.3608       0.0001 
    age at entry                            
    SMOKE          2      41.4830       0.0001         3.1222       0.0772 
    smoking status                          
    SBPAV2         3      44.4897       0.0001         3.0067       0.0829 
    Systolic blood pressure (mmHg) - average



=================================================================================
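
     As a check on how the listing above fits together: with only two strata,
each chi-square in the "Test of Equality over Strata" table is the square of
the rank statistic divided by its variance, and each increment in the forward
stepwise tables is the difference between successive chi-squares.  The
following DATA step is a sketch that recomputes a few of these quantities from
numbers copied by hand from the listing.  The variable names are made up for
this illustration.  The results should agree, up to rounding, with the printed
chi-squares of 5.4105 and 5.1910 and the printed increment of 3.1222.

```sas
data _null_ ;
     * Rank statistics for SEX = M and their variances, from the listing ;
     lr_stat = 5.5527 ;   lr_var = 5.69866 ;
     wx_stat = 2550.0 ;   wx_var = 1252658 ;

     * One-degree-of-freedom chi-square: statistic squared over variance ;
     lr_chisq = lr_stat**2 / lr_var ;
     wx_chisq = wx_stat**2 / wx_var ;

     * Upper-tail p-values from the chi-square distribution with 1 df ;
     lr_p = 1 - probchi(lr_chisq, 1) ;
     wx_p = 1 - probchi(wx_chisq, 1) ;

     * Log-rank stepwise increment for SMOKE: difference of chi-squares ;
     smoke_incr = 41.4830 - 38.3608 ;

     put lr_chisq= 8.4  lr_p= 6.4 ;
     put wx_chisq= 8.4  wx_p= 6.4 ;
     put smoke_incr= 8.4 ;
run ;
```

=================================================================================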

PROBLEM 1:

     Refer to the data set from the Der-Everitt text, Chapter 12, on methadone
treatment among heroin addicts.

     Define a variable called dose60 as follows:

      dose60 = 1 if dose <= 60.
      dose60 = 2 if dose > 60.

(1) Carry out a PROC LIFETEST analysis of the outcome variable 'time',
    with status = 0 indicating censored observations, status = 1
    indicating noncensored, with dose60 as the stratifying variable.
    Explain the results.

(2) Show the survival plot by dose60 as was done in n54703.014.


PROBLEM 2:

    See the life-table plots from page 935 of Ojo et al., NEJM 349: 931-940,
2003.  Answer the following questions with explanations and discussion for
your answers:

(1) Why do you think the plots seem to have a stair-step appearance?
(2) Is there evidence that some of the hazards are not constant?
(3) Do you think that transplanting different organs gives rise to different
    risks of chronic renal failure?  What else might you want to investigate
    before concluding that?
(4) Why do you think the intestine-transplant curve stops before the others do?
(5) Do you have any idea why heart-lung transplant recipients might be less
    likely to develop renal failure than recipients of the other organs?
=================================================================================

n54703.015  Last update: April 6, 2005.