SURVIVAL ANALYSIS, II:  PROC LIFETEST, contin.                     n54703.015

Plots are an important part of life-table analysis.  The line-printer plot
given in n54703.014 does not adequately portray the survival curve.
Higher-resolution plots are needed, and these can be obtained by the use of
SAS/GRAPH or other graphics packages.

The following program produces a better-resolution survival plot for the
Minnesota Heart Survey Data, by gender:

=================================================================================
[lines omitted ...]

********************************** END DATA STEP ****************************;

* List the deaths ;
proc print data = heart ;
     where dead eq 1 ;
     var id sex age dthdate cause censor follyrs ;
title1 'List of deaths in the Minnesota Heart Survey' ;

* Life-table analysis by sex; survival estimates are saved in SURCURVE ;
proc lifetest data = heart outsurv = surcurve ;
     time follyrs * censor(1) ;
     strata sex ;
title1 'PROC LIFETEST analysis of MWHEART data, by Sex' ;

* Keep only the rows of the output file with _censor_ = 0 (event times) ;
data surcurve ;
     set surcurve ;
     if _censor_ ne 0 then delete ;

proc print data = surcurve ;
title1 'Print of output file from PROC LIFETEST.' ;

* Step-function curves: solid line for one sex, dashed for the other ;
symbol1 i = steplj v = none c = black l = 1 ;
symbol2 i = steplj v = none c = black l = 2 ;

proc gplot data = surcurve ;
     plot survival * follyrs = sex / haxis = axis1 vaxis = axis2 ;

axis1 value = (f = swissb h = 2) order = 0 to 11 by 1
      label = (h = 2 'Followup Time in Years') ;
axis2 value = (f = swissb h = 2) order = .9 to 1.0 by .1
      label = (a = 90 h = 2 'Proportion Surviving') ;

title1 h = 2 'PROC LIFETEST Analysis of MWHEART data, by Gender' ;
title2 h = 2 'Survival Proportions versus Sex' ;
run ;

endsas ;
=================================================================================

The resulting plot can be seen at:

     http://www.biostat.umn.edu/~john-c/mwheart.gplot

In some cases it is also useful to examine other plots: for example, the plot
of the hazard function or of the log-survival function.

The hazard function gives the "instantaneous risk rate" as a function of time.
The units of the hazard function are events per unit time.

For some survival studies, the hazard function is approximately constant.
This would be approximately true if the study involved middle-aged people
being followed for a short time interval (e.g., 2 months).  If people are
followed for longer time intervals, their hazard increases because of their
increasing age.

In surgical studies, the person's hazard is high for a period of time
immediately after the surgery and then decreases.

In leukemia studies, a person's hazard is extremely high during the period
that the bone marrow is being ablated by chemotherapeutic drugs.  It then
decreases as the bone marrow recovers, and later increases again as the
person's chances of relapse increase.

The hazard function for men in the MRFIT study increased as a function of
time because the followup times were quite long, and there was substantial
aging during followup.  This can be seen in the following plot:

     http://www.biostat.umn.edu/~john-c/mrfit.hazard

This plot shows first an increasing hazard, then a decreasing hazard after
about 14 years of followup.  The apparent decrease arises because not many
MRFIT men were followed up to that time point in this analysis.  The hazard
estimates after that time are based on small numbers and are not reliable.

This plot also shows a considerably higher hazard rate for MRFIT men whose
FEV1 as % of predicted was below 85% than for men whose FEV1 % predicted was
85% or above.  That is, having a low FEV1 % predicted was a risk factor for
death.  It is likely that this effect is due in part to the fact that the men
with low FEV1 % predicted tended to be long-term smokers.

Another useful plot is the plot of log-survival versus time.  An example of
this kind of plot is given at:

     http://www.biostat.umn.edu/~john-c/mrfit.logsurv

If the hazard is constant, the graph of log survival as a function of time
should be close to a straight line.
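The straight-line property follows from the standard relationship between the
hazard and survival functions.  As a brief sketch (the symbols T, h(t), S(t)
and lambda are notation introduced here for illustration, and T is assumed to
be a continuous survival time):

```latex
\begin{align*}
h(t) &= \lim_{\Delta t \to 0}
        \frac{P(t \le T < t + \Delta t \mid T \ge t)}{\Delta t}
      = -\frac{d}{dt} \log S(t), \\
S(t) &= \exp\!\left( -\int_0^t h(u)\, du \right), \\
h(t) \equiv \lambda \;&\Longrightarrow\;
S(t) = e^{-\lambda t}, \qquad -\log S(t) = \lambda t .
\end{align*}
```

Under a constant hazard, then, -log S(t) plotted against t is a straight line
through the origin with slope lambda.  A curve whose slope increases with time
points to an increasing hazard, and one whose slope flattens points to a
decreasing hazard.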
The graph referred to above, however, shows that the slope of the log-survival
curve increases over time in both subgroups (defined as above by FEV1 %
predicted categories), which indicates that the hazard is not constant.

The SAS code which produced these graphs was the following:

=================================================================================
[fragment of program ...]

      YEARSDTH = (DTHDATE - RDATE) / 365.25 ;     * Definition of death time    ;
      DEATH = 0 ;
      IF YEARSDTH GT 0 THEN DEATH = 1 ;           * Definition of death var.    ;

      LDATE = MDY(12, 31, 90) ;                   * Last followup date          ;
      FOLLOW = (LDATE - RDATE) / 365.25 ;         * Definition of followup time ;
      IF YEARSDTH GT 0 THEN FOLLOW = YEARSDTH ;   * Redefinition for deaths     ;

[end of fragment ...]
*=================================================================== ;

SYMBOL1 C = BLACK I = STEPLJ L = 1 V = NONE WIDTH = 1 H = 1.5 ;
SYMBOL2 C = GRAY  I = STEPLJ L = 2 V = NONE WIDTH = 1 H = 1.5 ;

PROC LIFETEST DATA = SEL METHOD = LT PLOTS = (S, LS, H)
              OUTSURV = SURCURVE NOTABLE ;
     TIME FOLLOW * DEATH(0) ;
     STRATA MFEV13AN ;
TITLE1 H = 2 'MRFIT DEATHS: 3 TO 15.8 YEARS FOLLOWUP';
TITLE2 H = 2 'BY YEAR 3 FEV1 % PREDICTED CATEGORY: < 85 % VS >= 85%' ;
TITLE3 H = 2 'NHANES USED FOR PREDICTED' ;
FORMAT MFEV13AN NOYES01X. ;

ENDSAS ;
=================================================================================

Note here that an observation is treated as censored when DEATH = 0.

The option which produces the survival and hazard plots is in the first line
of PROC LIFETEST:

     PLOTS = (S, LS, H)

Here 'S' denotes the survival curve (not shown), 'LS' the log-survival curve
(PROC LIFETEST plots the negative log of the estimated survival function), and
'H' the hazard curve, as described above and shown in the links.

------------------------------------------------------------------------
TESTING FOR THE EFFECTS OF COVARIATES

PROC LIFETEST allows for tests of the effects of covariates on survival.
These are essentially nonparametric tests.
In the Minnesota Heart Survey data, if you want to study the effects of age,
systolic blood pressure, and smoking status on survival, while carrying out an
analysis which is stratified by sex, you would modify the PROC LIFETEST step
as follows.  The TEST statement names the covariates to be tested.  The output
follows the program.

------------------------------------------------------------------------
proc lifetest data = heart outsurv = surcurve notable ;
     time follyrs * censor(1) ;
     strata sex ;
     test age smoke sbpav2 ;
title1 'PROC LIFETEST analysis of MWHEART data, by Sex' ;
title2 'Tests for the effects of age, smoke and sbpav2, stratified by sex' ;
------------------------------------------------------------------------

          PROC LIFETEST analysis of MWHEART data, by Sex                    1
                                               20:07 Wednesday, April 6, 2005

                           The LIFETEST Procedure

           Summary of the Number of Censored and Uncensored Values

             SEX        Total     Failed    Censored    %Censored

             F            263          7         256      97.3384
             M            221         16         205      92.7602

             Total        484         23         461      95.2479

NOTE: There were 1 observations with missing values, negative time values or
      frequency values less than 1.

             Testing Homogeneity of Survival Curves over Strata
                           Time Variable FOLLYRS

                              Rank Statistics

                   SEX       Log-Rank       Wilcoxon

                   F          -5.5527        -2550.0
                   M           5.5527         2550.0

               Covariance Matrix for the Log-Rank Statistics

                   SEX              F              M

                   F          5.69866       -5.69866
                   M         -5.69866        5.69866

               Covariance Matrix for the Wilcoxon Statistics

                   SEX              F              M

                   F          1252658       -1252658
                   M         -1252658        1252658

                        Test of Equality over Strata

                                                      Pr >
               Test         Chi-Square     DF    Chi-Square

               Log-Rank         5.4105      1        0.0200
               Wilcoxon         5.1910      1        0.0227
               -2Log(LR)        5.4513      1        0.0196


        Rank Tests for the Association of FOLLYRS with Covariates
                            Pooled over Strata

               Univariate Chi-Squares for the WILCOXON Test

               Test     Standard                      Pr >
Variable  Statistic    Deviation   Chi-Square   Chi-Square   Label

AGE          -348.8      57.6615      36.6019       0.0001   age at entry
SMOKE        5.9256       3.6641       2.6153       0.1058   smoking status
SBPAV2       -326.7      72.5719      20.2612       0.0001   Systolic blood pressure (mmHg) - average

               Covariance Matrix for the WILCOXON Statistics

               Variable         AGE       SMOKE      SBPAV2

               AGE          3324.85        6.10     2111.42
               SMOKE           6.10       13.43        3.47
               SBPAV2       2111.42        3.47     5266.68

       Forward Stepwise Sequence of Chi-Squares for the WILCOXON Test

                                    Pr >   Chi-Square        Pr >
Variable   DF   Chi-Square    Chi-Square    Increment   Increment   Label

AGE         1      36.6019        0.0001      36.6019      0.0001   age at entry
SMOKE       2      39.8157        0.0001       3.2138      0.0730   smoking status
SBPAV2      3      42.6204        0.0001       2.8047      0.0940   Systolic blood pressure (mmHg) - average

               Univariate Chi-Squares for the LOG RANK Test

               Test     Standard                      Pr >
Variable  Statistic    Deviation   Chi-Square   Chi-Square   Label

AGE          -359.5      58.0440      38.3608       0.0001   age at entry
SMOKE        6.0744       3.8223       2.5256       0.1120   smoking status
SBPAV2       -332.6      73.0190      20.7536       0.0001   Systolic blood pressure (mmHg) - average

               Covariance Matrix for the LOG RANK Statistics

               Variable         AGE       SMOKE      SBPAV2

               AGE          3369.10        6.34     2085.05
               SMOKE           6.34       14.61        4.08
               SBPAV2       2085.05        4.08     5331.78

       Forward Stepwise Sequence of Chi-Squares for the LOG RANK Test

                                    Pr >   Chi-Square        Pr >
Variable   DF   Chi-Square    Chi-Square    Increment   Increment   Label

AGE         1      38.3608        0.0001      38.3608      0.0001   age at entry
SMOKE       2      41.4830        0.0001       3.1222      0.0772   smoking status
SBPAV2      3      44.4897        0.0001       3.0067      0.0829   Systolic blood pressure (mmHg) - average

=================================================================================

PROBLEM 1:

Refer to the data set from the Der-Everitt text, Chapter 12, on methadone
treatment among heroin addicts.

Define a variable called dose60 as follows:

     dose60 = 1 if dose <= 60.
     dose60 = 2 if dose >  60.

(1)  Carry out a PROC LIFETEST analysis of the outcome variable 'time', with
     status = 0 indicating censored observations, status = 1 indicating
     noncensored, with dose60 as the stratifying variable.  Explain the
     results.

(2)  Show the survival plot by dose60 as was done in n54703.014.

PROBLEM 2:

See the life-table plots from page 935 of Ojo et al., NEJM 349: 931-940, 2004.

Answer the following questions with explanations and discussion for your
answers:

(1)  Why do you think the plots seem to have a stair-step appearance?

(2)  Is there evidence that some of the hazards are not constant?

(3)  Do you think that transplanting different organs gives rise to
     different risks of chronic renal failure?  What else might you want
     to investigate before concluding that?

(4)  Why do you think the intestine-transplant curve stops before the others
     do?

(5)  Do you have any idea why heart-lung transplants might be less likely to
     have renal failure than the other organ-transplants?

=================================================================================
n54703.015  Last update: April 6, 2005.