SURVIVAL ANALYSIS, II: PROC LIFETEST, contin. n54703.015
Plots are an important part of life-table analysis. The line-printer plot
given in n54703.014 does not adequately portray the survival curve. Higher-
resolution plots are needed, and these can be obtained with SAS/GRAPH
or other graphics packages. The following program produces a better-resolution
survival plot for the Minnesota Heart Survey Data, by gender:
=================================================================================
[lines omitted ...]
********************************** END DATA STEP ****************************;
proc print data = heart ;
where dead eq 1 ;
var id sex age dthdate cause censor follyrs ;
title1 'List of deaths in the Minnesota Heart Survey' ;
proc lifetest data = heart outsurv = surcurve ;
time follyrs * censor(1) ;
strata sex ;
title1 'PROC LIFETEST analysis of MWHEART data, by Sex' ;
data surcurve ;
set surcurve ;
if _censor_ ne 0 then delete ;
proc print data = surcurve ;
title1 'Print of output file from PROC LIFETEST.' ;
symbol1 i = steplj v = none c = black l = 1 ;
symbol2 i = steplj v = none c = black l = 2 ;
proc gplot data = surcurve ;
plot survival * follyrs = sex / haxis = axis1 vaxis = axis2 ;
axis1 value = (f = swissb h = 2) order = 0 to 11 by 1
label = (h = 2 'Followup Time in Years') ;
axis2 value = (f = swissb h = 2) order = .9 to 1.0 by .1
label = (a = 90 h = 2 'Proportion Surviving') ;
title1 h = 2 'PROC LIFETEST Analysis of MWHEART data, by Gender' ;
title2 h = 2 'Survival Proportions versus Sex' ;
run ;
endsas ;
=================================================================================
The results of this plot can be seen at:
http://www.biostat.umn.edu/~john-c/mwheart.gplot
In some cases it is also useful to examine other plots, for example the
plot of the hazard function or of the log-survival function.
The hazard function gives the "instantaneous risk rate" as a function of
time. The units of the hazard function are events per unit time. For some
survival studies, the hazard function is approximately constant. This would
be approximately true if the study involved middle-aged people being followed
for a short time interval (e.g., 2 months). If people are followed for longer
time intervals, their hazard increases because of their increasing age. In
surgical studies, the person's hazard is high for a period of time immediately
after the surgery and then decreases. In leukemia studies, a person's hazard
is extremely high while the bone marrow is being ablated by chemotherapeutic
drugs, decreases as the bone marrow recovers, and then increases again as the
person's chance of relapse increases.
The hazard function for men in the MRFIT study increased as a function of
time because the followup times were quite long, and there was substantial
aging during followup. This can be seen in the following plot:
http://www.biostat.umn.edu/~john-c/mrfit.hazard
This plot shows the hazard increasing at first and then decreasing after
about 14 years of followup. The apparent decrease arises because not many MRFIT
men were followed up to that time-point in this analysis. The hazard estimates
after that time are based on small numbers and are not reliable.
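Why the late hazard estimates are unstable can be illustrated with the
life-table (actuarial) hazard estimate. The sketch below assumes the usual
actuarial formulas, in which the effective number at risk in an interval is the
number entering minus half the number censored. The function name and the
interval counts are hypothetical; they are not MRFIT data.

```python
import math

def lifetable_hazard(n_enter, deaths, censored, width):
    """Actuarial hazard estimate at the interval midpoint and its approximate SE.

    Assumes deaths > 0; all counts refer to a single life-table interval.
    """
    n_eff = n_enter - censored / 2.0            # effective number at risk
    hazard = deaths / (width * (n_eff - deaths / 2.0))
    # Large-sample standard error; relative to the hazard it is about 1/sqrt(deaths).
    se = hazard * math.sqrt((1.0 - (width * hazard / 2.0) ** 2) / deaths)
    return hazard, se

# Hypothetical one-year intervals (illustrative counts only).
early = lifetable_hazard(n_enter=6000, deaths=60, censored=0, width=1.0)
late = lifetable_hazard(n_enter=200, deaths=2, censored=150, width=1.0)

for name, (h, se) in (("early", early), ("late", late)):
    print(f"{name}: hazard = {h:.4f} per year, SE = {se:.4f}, SE/hazard = {se / h:.2f}")
```

With only a couple of deaths in an interval, the standard error is a large
fraction of the estimate itself, so a rise or fall in the plotted hazard at
that point carries little information.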
This plot also indicates a considerably higher hazard rate for MRFIT men
whose FEV1 as % of predicted was below 85% than for men whose FEV1 % predicted
was 85% or above. That is, a low FEV1 % predicted was a risk factor for
death. This effect is probably due in part to the fact that the men
with low FEV1 % predicted tended to be long-term smokers.
Another useful plot is the plot of log-survival versus time. An example
of this kind of plot is given at:
http://www.biostat.umn.edu/~john-c/mrfit.logsurv
If the hazard is constant, the graph of log survival as a function of time
should be close to a straight line. The graph referred to above, however, shows
a slope that becomes steeper over time in both subgroups (defined as above
by FEV1 % predicted categories), which is consistent with a hazard that
increases during followup.
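The straight-line property follows from S(t) = exp(-rate * t) for a constant
hazard, so that -log S(t) = rate * t. The following sketch compares that case
with a Weibull model whose hazard increases with time. The function names and
the parameter values (rate, scale, shape) are hypothetical and chosen only for
illustration.

```python
import math

def neg_log_surv_constant(t, rate):
    """-log S(t) when the hazard is constant: S(t) = exp(-rate * t)."""
    survival = math.exp(-rate * t)
    return -math.log(survival)

def neg_log_surv_weibull(t, scale, shape):
    """-log S(t) for a Weibull model; the hazard increases with t when shape > 1."""
    survival = math.exp(-((t / scale) ** shape))
    return -math.log(survival)

def slopes(values, times):
    """Slope of the curve between successive time points."""
    return [(values[i + 1] - values[i]) / (times[i + 1] - times[i])
            for i in range(len(times) - 1)]

times = [0, 3, 6, 9, 12, 15]   # years of followup
const = [neg_log_surv_constant(t, rate=0.01) for t in times]
weib = [neg_log_surv_weibull(t, scale=60.0, shape=2.0) for t in times]

print("constant hazard slopes:  ", [round(s, 4) for s in slopes(const, times)])
print("increasing hazard slopes:", [round(s, 4) for s in slopes(weib, times)])
```

The first list of slopes is flat (a straight line), while the second grows
from interval to interval, which is the pattern a log-survival plot shows when
the hazard rises with followup time.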
The SAS code which produced these graphs was the following:
=================================================================================
[fragment of program ...]
YEARSDTH = (DTHDATE - RDATE) / 365.25 ; * Definition of death time ;
DEATH = 0 ;
IF YEARSDTH GT 0 THEN DEATH = 1 ; * Definition of death var. ;
LDATE = MDY(12, 31, 90) ; * Last followup date ;
FOLLOW = (LDATE - RDATE) / 365.25 ; * Definition of followup time;
IF YEARSDTH GT 0 THEN FOLLOW = YEARSDTH ; * Re-defin. of followup time ;
[end of fragment ...]
*=================================================================== ;
SYMBOL1 C = BLACK I = STEPLJ L = 1 V = NONE WIDTH = 1 H = 1.5 ;
SYMBOL2 C = GRAY I = STEPLJ L = 2 V = NONE WIDTH = 1 H = 1.5 ;
PROC LIFETEST DATA = SEL METHOD = LT PLOTS = (S, LS, H) OUTSURV = SURCURVE NOTABLE ;
TIME FOLLOW * DEATH(0) ;
STRATA MFEV13AN ;
TITLE1 H = 2 'MRFIT DEATHS: 3 TO 15.8 YEARS FOLLOWUP';
TITLE2 H = 2
'BY YEAR 3 FEV1 % PREDICTED CATEGORY: < 85 % VS >= 85%' ;
TITLE3 H = 2 'NHANES USED FOR PREDICTED' ;
FORMAT MFEV13AN NOYES01X. ;
ENDSAS ;
=================================================================================
Note here that an observation is treated as censored when DEATH = 0.
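The logic of the data-step fragment above (followup ends at death if a death
occurred, and otherwise at the last followup date, December 31, 1990) can be
sketched in Python as follows. This is only an illustration of the definitions:
the function name and the example dates are hypothetical, and a missing death
date is represented by None.

```python
from datetime import date

LAST_FOLLOWUP = date(1990, 12, 31)   # corresponds to LDATE in the SAS fragment

def followup_and_status(rdate, dthdate):
    """Return (followup time in years, death indicator) for one participant.

    death = 1 if a death date after rdate is recorded; death = 0 means censored.
    """
    if dthdate is not None and dthdate > rdate:
        return (dthdate - rdate).days / 365.25, 1
    return (LAST_FOLLOWUP - rdate).days / 365.25, 0

# Hypothetical participants: one death, one censored at the last followup date.
print(followup_and_status(date(1975, 1, 1), date(1980, 1, 1)))
print(followup_and_status(date(1975, 1, 1), None))
```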
The option which produces the survival and hazard plots is in the first line
of PROC LIFETEST:
PLOTS = (S, LS, H)
Here 'S' requests the survival curve (not shown), 'LS' the log-survival
curve, and 'H' the hazard curve. The latter two are the plots described above
and shown at the links.
------------------------------------------------------------------------
TESTING FOR THE EFFECTS OF COVARIATES
PROC LIFETEST allows for tests of the effects of covariates on survival.
These are essentially nonparametric tests. In the Minnesota Heart Survey data,
if you want to study the effects of age, systolic blood pressure, and smoking
status on survival, while carrying out an analysis which is stratified by
sex, you would modify the PROC LIFETEST procedure as follows. The output follows
the procedure.
------------------------------------------------------------------------
proc lifetest data = heart outsurv = surcurve notable ;
time follyrs * censor(1) ;
strata sex ;
test age smoke sbpav2 ;
title1 'PROC LIFETEST analysis of MWHEART data, by Sex' ;
title2 'Tests for the effects of age, smoke and sbpav2, stratified by sex' ;
------------------------------------------------------------------------
PROC LIFETEST analysis of MWHEART data, by Sex 1
20:07 Wednesday, April 6, 2005
The LIFETEST Procedure
Summary of the Number of Censored and Uncensored Values
SEX       Total    Failed    Censored    %Censored
F           263         7         256      97.3384
M           221        16         205      92.7602
Total       484        23         461      95.2479
NOTE: There were 1 observations with missing values, negative time values or
      frequency values less than 1.
Testing Homogeneity of Survival Curves over Strata
Time Variable FOLLYRS
Rank Statistics
SEX     Log-Rank    Wilcoxon
F        -5.5527     -2550.0
M         5.5527      2550.0
Covariance Matrix for the Log-Rank Statistics
SEX            F           M
F        5.69866    -5.69866
M       -5.69866     5.69866
Covariance Matrix for the Wilcoxon Statistics
SEX            F           M
F        1252658    -1252658
M       -1252658     1252658
Test of Equality over Strata
Test         Chi-Square    DF    Pr > Chi-Square
Log-Rank         5.4105     1             0.0200
Wilcoxon         5.1910     1             0.0227
-2Log(LR)        5.4513     1             0.0196
Rank Tests for the Association of FOLLYRS with Covariates
Pooled over Strata
Univariate Chi-Squares for the WILCOXON Test
            Test         Standard                  Pr >
Variable    Statistic    Deviation    Chi-Square   Chi-Square   Label
AGE           -348.8      57.6615       36.6019      0.0001     age at entry
SMOKE         5.9256       3.6641        2.6153      0.1058     smoking status
SBPAV2        -326.7      72.5719       20.2612      0.0001     Systolic blood pressure (mmHg) - average
Covariance Matrix for the WILCOXON Statistics
Variable        AGE      SMOKE     SBPAV2
AGE         3324.85       6.10    2111.42
SMOKE          6.10      13.43       3.47
SBPAV2      2111.42       3.47    5266.68
Forward Stepwise Sequence of Chi-Squares for the WILCOXON Test
                                Pr >         Chi-Square   Pr >
Variable    DF    Chi-Square    Chi-Square   Increment    Increment   Label
AGE          1      36.6019       0.0001       36.6019      0.0001    age at entry
SMOKE        2      39.8157       0.0001        3.2138      0.0730    smoking status
SBPAV2       3      42.6204       0.0001        2.8047      0.0940    Systolic blood pressure (mmHg) - average
Univariate Chi-Squares for the LOG RANK Test
            Test         Standard                  Pr >
Variable    Statistic    Deviation    Chi-Square   Chi-Square   Label
AGE           -359.5      58.0440       38.3608      0.0001     age at entry
SMOKE         6.0744       3.8223        2.5256      0.1120     smoking status
SBPAV2        -332.6      73.0190       20.7536      0.0001     Systolic blood pressure (mmHg) - average
Covariance Matrix for the LOG RANK Statistics
Variable        AGE      SMOKE     SBPAV2
AGE         3369.10       6.34    2085.05
SMOKE          6.34      14.61       4.08
SBPAV2      2085.05       4.08    5331.78
Forward Stepwise Sequence of Chi-Squares for the LOG RANK Test
                                Pr >         Chi-Square   Pr >
Variable    DF    Chi-Square    Chi-Square   Increment    Increment   Label
AGE          1      38.3608       0.0001       38.3608      0.0001    age at entry
SMOKE        2      41.4830       0.0001        3.1222      0.0772    smoking status
SBPAV2       3      44.4897       0.0001        3.0067      0.0829    Systolic blood pressure (mmHg) - average
=================================================================================
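As a check on how to read the output above: with two strata, each 1-df
chi-square for the test of equality is the squared rank statistic divided by
its variance, and each univariate covariate chi-square is the squared ratio of
the test statistic to its standard deviation. The sketch below recomputes a few
of the printed values from the printed statistics. The function names are
hypothetical, and small differences from the listing are due to rounding in the
printed output.

```python
import math

def chi_square_1df(statistic, variance):
    """1-df chi-square: squared statistic divided by its variance."""
    return statistic ** 2 / variance

def p_value_1df(chi_square):
    """Upper-tail probability of a chi-square variate with 1 df."""
    return math.erfc(math.sqrt(chi_square / 2.0))

# Test of equality over strata (values copied from the listing).
logrank = chi_square_1df(5.5527, 5.69866)
wilcoxon = chi_square_1df(2550.0, 1252658.0)

# Univariate Wilcoxon chi-squares for the covariates: (statistic / std dev) ** 2.
age = chi_square_1df(-348.8, 57.6615 ** 2)
smoke = chi_square_1df(5.9256, 3.6641 ** 2)

for name, x in (("Log-Rank", logrank), ("Wilcoxon", wilcoxon),
                ("AGE", age), ("SMOKE", smoke)):
    print(f"{name:9s} chi-square = {x:8.4f}   p = {p_value_1df(x):.4f}")
```

The same arithmetic explains the stepwise table: each "Chi-Square Increment"
is the difference between successive joint chi-squares (for example,
39.8157 - 36.6019 = 3.2138 for SMOKE in the Wilcoxon sequence).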
PROBLEM 1:
Refer to the data set from the Der-Everitt text, Chapter 12, on methadone
treatment among heroin addicts.
Define a variable called dose60 as follows:
dose60 = 1 if dose <= 60.
dose60 = 2 if dose > 60.
(1) Carry out a PROC LIFETEST analysis of the outcome variable 'time',
with status = 0 indicating censored observations, status = 1
indicating noncensored, with dose60 as the stratifying variable.
Explain the results.
(2) Show the survival plot by dose60 as was done in n54703.014.
PROBLEM 2:
See the life-table plots from page 935 of Ojo et al., NEJM 349: 931-940,
2003. Answer the following questions with explanations and discussion for
your answers:
(1) Why do you think the plots seem to have a stair-step appearance?
(2) Is there evidence that some of the hazards are not constant?
(3) Do you think that transplanting different organs gives rise to different
risks of chronic renal failure? What else might you want to investigate
before concluding that?
(4) Why do you think the intestine-transplant curve stops before the others do?
(5) Do you have any idea why heart-lung transplant recipients might be less
likely to develop renal failure than recipients of other organ transplants?
=================================================================================
n54703.015 Last update: April 6, 2005.