SAS MACROS, Continued SPH 5421 notes.025 SAS Macros - SAS/GRAPH, Continued ... Graphs can be highly customized using SAS/GRAPH macros. The following is an example of a macro that we have used extensively in the Lung Health Study for looking at longitudinal data. The first section below is the part of the main program that calls the macro. The second section is the macro itself. The third section discusses various features of this macro. *============================================================================== ; %include 'gmac1d.sas' ; %GREG1D (ENDPTS, WHERE AGE NE ., AGEGE55, FOLLOWUP TIME (YEARS),0 TO 4.5 BY .5, 0,YRSM06,YRSM12,YRSM18,YRSM24,YRSM30,YRSM36,YRSM42,YRSM48, MEAN POST-BD FEV1 (L),0.0 TO 2.8 BY .2, FEV1200, FEV1206, FEV1212, FEV1218, FEV1224, FEV1230, FEV1236, FEV1242, FEV1248, LUNG HEALTH STUDY 2, MEANS OF POST-BD FEV1 (L), BY AGE GROUP: LT 55 VERSUS GE 55, L,G, LT 55, GE 55, (BOTTOM LEFT INSIDE) OFFSET = (+4, 32), 14, 26) ; ENDSAS ; *============================================================================== ; %MACRO GREG1D (DATASET, WHERE, CLASSVAR, TLABEL, TORDER, T1, T2, T3, T4, T5, T6, T7, T8, T9, YLABEL, YORDER, W1, W2, W3, W4, W5, W6, W7, W8, W9, GTITLE1, GTITLE2, GTITLE3, SYMB1, SYMB2, DESC1, DESC2, POSITLEG, XPOSSTAT, YPOSSTAT) ; /*****************************************************************************/ /* */ /* Macro to graph data for two groups defined by CLASSVAR = 1 or 2. */ /* */ /* The model assumption for the Z-test mentioned below is that the data */ /* are linear functions of time. */ /* */ /* The data are defined at up to 9 time points, T1 - T9. The outcome */ /* variables are W1 - W9. SYMB1 and SYMB2 are symbols for graphing the */ /* data for group 1 and group 2 respectively. DESC1 and DESC2 are text */ /* descriptions of the two groups. POSITLEG is the position of the LEGEND, */ /* (see the POSITION and OFFSET descriptions of the LEGEND statement). */ /* XPOSSTAT and YPOSSTAT are position coordinates for a box containing info */ /* on the intercepts and slopes and their standard devs for each group, */ /* and z-statistics for comparisons of the two intercepts and slopes. */ /* */ /* DATASET : Name of SAS dataset containing the data. */ /* CLASSVAR : Classifying variable: Takes on values 1, 2. */ /* TLABEL : Label for the x-axis (time variable usually) */ /* TORDER : Order statement for the time variable axis (e.g., 0 to 60) */ /* T1 ... T9 : Names of the time variables ... only 9 allowed. */ /* YLABEL : Label for the vertical axis (outcome variable) */ /* YORDER : Order statement for the outcome variable axis. */ /* W1 ... W9 : Names of the outcome variable (one for each time point) */ /* GTITLE1 : Line 1 of the graph title. */ /* GTITLE2 : Line 2 of the graph title. */ /* GTITLE3 : Line 3 of the graph title. */ /* SYMB1 : Plotting symbol for the first group (CLASSVAR = 1) */ /* SYMB2 : Plotting symbol for the second group (CLASSVAR = 2) */ /* DESC1 : Description of first group for the LEGEND box. */ /* DESC2 : Description of second group for the LEGEND box. */ /* POSITLEG : Position of the LEGEND box. See SAS/GRAPH Vol 1 for syntax. */ /* Example: (BOTTOM RIGHT INSIDE) OFFSET = (10, 10) */ /* XPOSSTAT : X-axis position of the statistics box: 0-100, as percent. */ /* YPOSSTAT : Y-axis position of the stats box. Note 0 = bottom, 100 = top.*/ /* */ /* NOTE: In this version, if either YORDER or TORDER is blank, then the */ /* program defaults to using no ORDER statements for either axis. */ /* Also this version offsets the t-values by .05 for CLASSVAR = 2. */ /* */ /* NOTE: This version of the program prints standard deviations of */ /* slopes and intercepts, not standard errors. */ /* */ /* NOTE: The WHERE clause has been added. /* */ /* Update: Feb 24, 1999 J.E.C. */ /*****************************************************************************/ footnote2 'macro greg1d.sas (jec))' ; PROC SORT DATA = &DATASET ; BY &CLASSVAR ; DATA REG0 ; SET &DATASET END = LAST ; &WHERE ; RETAIN NB 0 B1SUM 0 B12SUM 0 B1HATMIN 9999 B1HATMAX -9999 B0SUM 0 B02SUM 0 B0HATMIN 9999 B0HATMAX -9999 B01SUM 0 ; ARRAY X(9) X1 - X9 ; ARRAY Y(9) Y1 - Y9 ; ARRAY XY(9) XY1 - XY9 ; ARRAY XX(9) XX1 - XX9 ; ARRAY YY(9) YY1 - YY9 ; KEEP X1 X2 X3 X4 X5 X6 X7 X8 X9 Y1 Y2 Y3 Y4 Y5 Y6 Y7 Y8 Y9 XX1 XX2 XX3 XX4 XX5 XX6 XX7 XX8 XX9 YY1 YY2 YY3 YY4 YY5 YY6 YY7 YY8 YY9 XY1 XY2 XY3 XY4 XY5 XY6 XY7 XY8 XY9 NSUM XSUM YSUM X2SUM XYSUM B0HAT B1HAT NB B0SUM B02SUM B0HATMIN B0HATMAX B1SUM B12SUM B1HATMIN B1HATMAX B01SUM &CLASSVAR ; VLAST = LAST ; I1 = . ; I2 = . ; I3 = . ; I4 = . ; I5 = . ; I6 = . ; I7 = . ; I8 = . ; I9 = . ; IF &T1 NE . AND &W1 NE . THEN I1 = 1 ; IF &T2 NE . AND &W2 NE . THEN I2 = 1 ; IF &T3 NE . AND &W3 NE . THEN I3 = 1 ; IF &T4 NE . AND &W4 NE . THEN I4 = 1 ; IF &T5 NE . AND &W5 NE . THEN I5 = 1 ; IF &T6 NE . AND &W6 NE . THEN I6 = 1 ; IF &T7 NE . AND &W7 NE . THEN I7 = 1 ; IF &T8 NE . AND &W8 NE . THEN I8 = 1 ; IF &T9 NE . AND &W9 NE . THEN I9 = 1 ; X1 = I1*&T1; X2 = I2*&T2; X3 = I3*&T3; X4 = I4*&T4; X5 = I5*&T5; X6 = I6*&T6; X7 = I7*&T7; X8 = I8*&T8; X9 = I9*&T9; Y1 = I1*&W1; Y2 = I2*&W2; Y3 = I3*&W3; Y4 = I4*&W4; Y5 = I5*&W5; Y6 = I6*&W6; Y7 = I7*&W7; Y8 = I8*&W8; Y9 = I9*&W9; XX1 = X1*X1; XX2 = X2*X2; XX3 = X3*X3; XX4 = X4*X4; XX5 = X5*X5; XX6 = X6*X6; XX7 = X7*X7; XX8 = X8*X8; XX9 = X9*X9; YY1 = Y1*Y1; YY2 = Y2*Y2; YY3 = Y3*Y3; YY4 = Y4*Y4; YY5 = Y5*Y5; YY6 = Y6*Y6; YY7 = Y7*Y7; YY8 = Y8*Y8; YY9 = Y9*Y9; XY1 = X1*Y1; XY2 = X2*Y2; XY3 = X3*Y3; XY4 = X4*Y4; XY5 = X5*Y5; XY6 = X6*Y6; XY7 = X7*Y7; XY8 = X8*Y8; XY9 = X9*Y9; NSUM = N(XY1, XY2, XY3, XY4, XY5, XY6, XY7, XY8, XY9) ; XSUM = SUM(X1, X2, X3, X4, X5, X6, X7, X8, X9) ; YSUM = SUM(Y1, Y2, Y3, Y4, Y5, Y6, Y7, Y8, Y9) ; X2SUM = SUM(XX1, XX2, XX3, XX4, XX5, XX6, XX7, XX8, XX9) ; XYSUM = SUM(XY1, XY2, XY3, XY4, XY5, XY6, XY7, XY8, XY9) ; B1HAT = . ; TOP = XYSUM - XSUM * YSUM / NSUM ; BOT = X2SUM - XSUM * XSUM / NSUM ; IF BOT GT 0 THEN B1HAT = TOP / BOT ; B0HAT = . ; XAVE = . ; IF NSUM GT 0 THEN XAVE = XSUM / NSUM ; YAVE = . ; IF NSUM GT 0 THEN YAVE = YSUM / NSUM ; IF YAVE NE 0 THEN B0HAT = YAVE - B1HAT * XAVE ; IF B1HAT NE . THEN DO ; NB = NB + 1 ; B0SUM = B0SUM + B0HAT ; B02SUM = B02SUM + B0HAT * B0HAT ; B01SUM = B01SUM + B0HAT * B1HAT ; IF B0HAT LT B0HATMIN THEN B0HATMIN = B0HAT ; IF B0HAT GT B0HATMAX THEN B0HATMAX = B0HAT ; B1SUM = B1SUM + B1HAT ; B12SUM = B12SUM + B1HAT * B1HAT ; IF B1HAT LT B1HATMIN THEN B1HATMIN = B1HAT ; IF B1HAT GT B1HATMAX THEN B1HATMAX = B1HAT ; END ; /*PROC PRINT DATA = REG0 ; */ /* WHERE B1HAT NE . AND B1HAT LT -10 ; */ /* VAR LHS2ID X1 X2 X3 X4 X5 X6 X7 X8 X9 */ /* Y1 Y2 Y3 Y4 Y5 Y6 Y7 Y8 Y9 */ /* NSUM XSUM YSUM X2SUM XYSUM */ /* B1HAT B0HAT ; */ PROC MEANS DATA = REG0 N MEAN STD STDERR MIN MAX ; CLASS &CLASSVAR ; VAR B0HAT B1HAT X1 X2 X3 X4 X5 X6 X7 X8 X9 Y1 Y2 Y3 Y4 Y5 Y6 Y7 Y8 Y9 ; OUTPUT OUT = BMEANS MEAN = B0HAT B1HAT STDERR = SEB0HAT SEB1HAT STD = SDB0HAT SDB1HAT ; PROC PRINT DATA = BMEANS ; TITLE1 'PROC PRINT FOR DATASET BMEANS' ; DATA ANNOB0 ; RETAIN B0HAT1 SEB0HAT1 SDB0HAT1 VB0HAT1 B1HAT1 SEB1HAT1 SDB1HAT1 VB1HAT1 ; LENGTH TEXT $72 ; SET BMEANS ; IF &CLASSVAR EQ 1 THEN DO ; B0HAT1 = B0HAT ; SEB0HAT1 = SEB0HAT ; SDB0HAT1 = SDB0HAT ; VB0HAT1 = SEB0HAT1 * SEB0HAT1 ; B1HAT1 = B1HAT ; SEB1HAT1 = SEB1HAT ; SDB1HAT1 = SDB1HAT ; VB1HAT1 = SEB1HAT1 * SEB1HAT1 ; END ; IF &CLASSVAR EQ 2 THEN DO ; B0HAT2 = B0HAT ; SEB0HAT2 = SEB0HAT ; SDB0HAT2 = SDB0HAT ; VB0HAT2 = SEB0HAT2 * SEB0HAT2 ; VARCOMB0 = VB0HAT1 + VB0HAT2 ; SERCOMB0 = SQRT(VARCOMB0) ; A0DIFF12 = ABS(B0HAT1 - B0HAT2) ; Z0DIFF12 = A0DIFF12 / SERCOMB0 ; P0DIFF12 = 2 * (1 - PROBNORM(Z0DIFF12)) ; B1HAT2 = B1HAT ; SEB1HAT2 = SEB1HAT ; SDB1HAT2 = SDB1HAT ; VB1HAT2 = SEB1HAT2 * SEB1HAT2 ; VARCOMB1 = VB1HAT1 + VB1HAT2 ; SERCOMB1 = SQRT(VARCOMB1) ; A1DIFF12 = ABS(B1HAT1 - B1HAT2) ; Z1DIFF12 = A1DIFF12 / SERCOMB1 ; P1DIFF12 = 2 * (1 - PROBNORM(Z1DIFF12)) ; FUNCTION = 'LABEL' ; POSITION = '>' ; X = &XPOSSTAT ; Y = &YPOSSTAT ; TEXT = 'REGRESSIONS (TWO STRAIGHT LINES): MEANS +/- STD DEV' ; OUTPUT ; Y = Y - 2 ; FUNCTION = 'LABEL' ; POSTION = '>' ; TEXTA = LEFT(PUT(B0HAT1, 7.3)) ; TEXTB = LEFT(PUT(SDB0HAT1, 7.3)) ; TEXTC = LEFT(PUT(B1HAT1, 7.3)) ; TEXTD = LEFT(PUT(SDB1HAT1, 7.3)) ; TEXTE = "&SYMB1" || ' INTCPT: ' || TEXTA || ' +/- ' || TEXTB ; TEXT = TEXTE || ' SLOPE: ' || TEXTC || ' +/- ' || TEXTD ; OUTPUT ; Y = Y - 2 ; FUNCTION = 'LABEL' ; POSITION = '>' ; TEXTA = LEFT(PUT(B0HAT2, 7.3)) ; TEXTB = LEFT(PUT(SDB0HAT2, 7.3)) ; TEXTC = LEFT(PUT(B1HAT2, 7.3)) ; TEXTD = LEFT(PUT(SDB1HAT2, 7.3)) ; TEXTE = "&SYMB2" || ' INTCPT: ' || TEXTA || ' +/- ' || TEXTB ; TEXT = TEXTE || ' SLOPE: ' || TEXTC || ' +/- ' || TEXTD ; OUTPUT ; Y = Y - 3 ; FUNCTION = 'LABEL' ; POSITION = '>' ; TEXTC = LEFT(PUT(Z0DIFF12, 6.3)) ; TEXTD = LEFT(PUT(P0DIFF12, 7.4)) ; TEXT = 'INTCPT Z-STAT = ' || TEXTC || ' P-VALUE = ' || TEXTD ; OUTPUT ; Y = Y - 2 ; FUNCTION = 'LABEL' ; POSITION = '>' ; TEXTC = LEFT(PUT(Z1DIFF12, 6.3)) ; TEXTD = LEFT(PUT(P1DIFF12, 7.4)) ; TEXT = 'SLOPE Z-STAT = ' || TEXTC || ' P-VALUE = ' || TEXTD ; OUTPUT ; FUNCTION = 'MOVE' ; X = &XPOSSTAT - 1 ; Y = &YPOSSTAT + 2 ; OUTPUT ; FUNCTION = 'DRAW' ; X = X + 40 ; Y = Y ; OUTPUT ; FUNCTION = 'DRAW' ; X = X ; Y = Y - 13 ; OUTPUT ; FUNCTION = 'DRAW' ; X = X - 40 ; Y = Y ; OUTPUT ; FUNCTION = 'DRAW' ; X = X ; Y = Y + 13 ; OUTPUT ; END ; DATA BYTIMES ; SET REG0 ; KEEP &CLASSVAR TIME XVAR YVAR X1 X2 X3 Y1 Y2 Y3 ; ARRAY X(9) X1 - X9 ; ARRAY Y(9) Y1 - Y9 ; ARRAY XY(9) XY1 - XY9 ; ARRAY XX(9) XX1 - XX9 ; ARRAY YY(9) YY1 - YY9 ; DO I = 1 TO 9 ; TIME = I ; XVAR = X(I) ; YVAR = Y(I) ; IF XY(I) NE . THEN OUTPUT ; END ; PROC MEANS DATA = REG0 N MEAN STD STDERR MIN MAX ; CLASS &CLASSVAR ; VAR B0HAT B1HAT X1 X2 X3 Y1 Y2 Y3 ; TITLE1 'MEANS OF REG0 AFTER BYTIMES WAS CREATED ...' ; PROC MEANS DATA = BYTIMES N MEAN STD STDERR MIN MAX ; CLASS &CLASSVAR TIME ; VAR XVAR YVAR X1 X2 X3 Y1 Y2 Y3 ; OUTPUT OUT = GMEANS N = NX NY MEAN = XMEAN YMEAN STDERR = XSTDERR YSTDERR ; TITLE1 ">ITLE1" ; TITLE2 ">ITLE2" ; TITLE3 ">ITLE3" ; DATA ANNONUMS ; SET GMEANS ; IF &CLASSVAR EQ . OR TIME EQ . THEN DELETE ; FUNCTION = 'LABEL' ; XSYS = '2' ; YSYS = '1' ; POSITION = '6' ; SIZE = 1 ; IF TIME EQ 1 THEN DO ; X = XMEAN ; Y = 11 - 3 * &CLASSVAR ; TEXTA = "&SYMB1" || ': ' ; IF &CLASSVAR EQ 2 THEN TEXTA = "&SYMB2" || ': ' ; TEXTB = LEFT(PUT(NX, 4.)) ; TEXT = TEXTA || TEXTB ; END ; ELSE IF TIME GT 1 THEN DO ; X = XMEAN ; Y = 11 - 3 * &CLASSVAR ; TEXT = LEFT(PUT(NX, 4.)) ; END; DATA ANNOTOT ; SET ANNOB0 ANNONUMS ; PROC PRINT DATA = ANNOTOT ; DATA TRIPLES ; SET GMEANS ; IF &CLASSVAR EQ . OR TIME EQ . THEN DELETE ; XV = XMEAN + .05 * (&CLASSVAR - 1) ; YV = YMEAN + 2 * YSTDERR ; OUTPUT ; YV = YMEAN - 2 * YSTDERR ; OUTPUT ; YV = YMEAN ; OUTPUT ; PROC PRINT DATA = TRIPLES ; SYMBOL1 I = HILOCTJ W = 2 H = 2 L = 1 V = &SYMB1 C = GRAY00 F = SWISS ; SYMBOL2 I = HILOCTJ W = 3 H = 2 L =33 V = &SYMB2 C = GRAY80 F = SWISS ; LEGEND1 ACROSS = 1 FRAME LABEL = NONE VALUE = (TICK = 1 H = 2 C = BLACK "&DESC1" TICK = 2 H = 2 C = GRAY80 "&DESC2") POSITION = &POSITLEG MODE = PROTECT ; %IF %LENGTH(&YORDER) EQ 0 %THEN %GOTO VERS2 ; %IF %LENGTH(&TORDER) EQ 0 %THEN %GOTO VERS2 ; PROC GPLOT DATA = TRIPLES ANNOTATE = ANNOTOT ; PLOT YV * XV = &CLASSVAR / FRAME LEGEND = LEGEND1 HAXIS = AXIS1 VAXIS = AXIS2 ; AXIS1 COLOR = BLACK VALUE = (F = SWISS H = 2) ORDER = &TORDER LABEL = (C = BLACK F = SWISS H = 2.5 "&TLABEL") ; AXIS2 COLOR = BLACK VALUE = (F = SWISS H = 2) ORDER = &YORDER LABEL = (C = BLACK A = 90 F = SWISS H = 2.5 "&YLABEL") ; TITLE1 J = C H = 2.5 F = SWISS ">ITLE1" ; TITLE2 J = C H = 2.5 F = SWISS ">ITLE2" ; TITLE3 J = C H = 2.5 F = SWISS ">ITLE3" ; RUN ; %GOTO ENDJUMP ; %VERS2: ; PROC GPLOT DATA = TRIPLES ANNOTATE = ANNOTOT ; PLOT YV * XV = &CLASSVAR / FRAME LEGEND = LEGEND1 HAXIS = AXIS1 VAXIS = AXIS2 ; AXIS1 COLOR = BLACK VALUE = (F = SWISS H = 2) LABEL = (C = BLACK F = SWISS H = 2.5 "&TLABEL") ; AXIS2 COLOR = BLACK VALUE = (F = SWISS H = 2) LABEL = (C = BLACK A = 90 F = SWISS H = 2.5 "&YLABEL") ; TITLE1 J = C H = 2.5 F = SWISS ">ITLE1" ; TITLE2 J = C H = 2.5 F = SWISS ">ITLE2" ; TITLE3 J = C H = 2.5 F = SWISS ">ITLE3" ; RUN ; %ENDJUMP: ; %MEND ; * =========================================================================== ; The macro greg1d above includes several features that may be useful: 1. The first parameter, DATASET, specifies which SAS dataset the macro will be dealing with. 2. The second parameter, WHERE, is intended to be a phrase which can define a subset of the dataset; for example, in the calling program, WHERE is specified as WHERE AGE NE . 3. A number of arrays are defined internally in the macro. These are used to store values which are used in computing slopes and intercepts of lines. Putting the values in arrays simplifies the calculations. 4. The OUTPUT OUT = BMEANS statement in PROC MEANS includes options for putting the means, standard errors, and standard deviations in the dataset BMEANS. These are used later in computing confidence intervals for the means of the outcome variable (W1, W2, ..., W9) at each time point. 5. A complicated annotation dataset 'ANNOB0' is constructed, based on statistics in the dataset BMEANS. The data from the ANNOTATE dataset is used later in PROC GPLOT to print the summary statistics in a box on the graph. Most of the content of ANNOTATE is text about the statistics that goes in the box. The FUNCTION = 'LABEL' statements cause output of the text. The dataset ANNOB0 also contains instructions to draw a box around around the statistics summary. This is accomplished by using FUNCTION = 'DRAW' and FUNCTION = 'MOVE' statements. 6. A second annotation dataset, ANNONUMS, is constructed to produce the counts of observations in the two groups at each time point (printed along the bottom of the graph; see graph). 7. Both of the annotate datasets are put together in one dataset, ANNOTOT. ANNOTOT is what is used the PROC GPLOT line: PROC GPLOT DATA = TRIPLES ANNOTATE = ANNOTOT ; 8. The dataset TRIPLES is based on output from a PROC MEANS procedure. It is comprised of the mean values for the x-variable and the y-variable at each time point. It actually includes three y observations for each time point: the mean, the mean + 2 std errs, and the mean - 2 std errs. The SYMBOL statements, SYMBOL1 I = HILOCTJ W = 2 H = 2 L = 1 V = &SYMB1 C = GRAY00 F = SWISS; SYMBOL2 I = HILOCTJ W = 3 H = 2 L = 1 V = &SYMB2 C = GRAY80 F = SWISS; tell GPLOT to plot the mean value and error bars above and below the mean, and join the successive means together (HILOCTJ). Note that the plotting symbols &SYMB1 and &SYMB2 are specified in the call to the macro. This is admittedly a complex and difficult macro. It shows some of the capabilities of SAS/GRAPH, but it also indicates that you need to know quite a few details of how SAS/GRAPH works to write something like this. The next program, bwlife.sas, produces life-table graphs: ================================================================================== %MACRO BWLIFE (DATASET, WHERE, NSTRATA, STRATVAR, STRATFMT, TVAR, TLABEL, TORDER, CENSVAR, CENSLABL, CORDER, GTITLE1, GTITLE2, GTITLE3) ; /*****************************************************************************/ /* */ /* Macro to graph survival data for several strata, and print the */ /* relevant statistics (Log-rank, Wilcoxon, -2 * Log Likelihood). */ /* */ /* DATASET : Name of SAS dataset containing the data */ /* WHERE : Where statement, such as WHERE X GT 0 */ /* NSTRATA : Number of strata - no more than 9, for labels to work. */ /* STRATVAR : Stratifying variable [class variable] */ /* STRATFMT : Format definition for stratifying variable */ /* Note: formats should be 22 characters or less. */ /* TVAR : Time variable */ /* TLABEL : Label (goes below horiz axis) for Time variable */ /* TORDER : Order statement for Time variable */ /* */ /* CENSVAR : Censoring variable: value 0 indicates censored obsn, */ /* and 1 indicates event. */ /* CENSLABL : Label for censoring variable (vertical axis) */ /* CORDER : Order statement for vertical axis */ /* */ /* GTITLE1 : 1st line of graph Title */ /* GTITLE2 : 2nd line of graph Title */ /* GTITLE3 : 3rd line of graph Title */ /* */ /* Author : J. Connett. Last modify date: 11-27-99. */ /* */ /* Note: This macro is like dlife.sas but black & white. */ /* */ /* This is also similar to clife.sas, except the vertical */ /* label is always 'Proportion | No Event'. */ /* */ /*****************************************************************************/ *=====================================================================; GOPTIONS RESET = GLOBAL VPOS = 78 HPOS = 85 HSIZE = 21 cm VSIZE = 14 cm ROTATE = LANDSCAPE FTEXT = NONE DEVICE = PS GACCESS = SASGASTD GSFNAME = GRAPH GUNIT = PCT BORDER CBACK = WHITE CTEXT = BLACK FTEXT = NONE CHARTYPE = 11 HTITLE = 1.5 HTEXT = 1.5 ; *=====================================================================; DATA GLIFE ; SET &DATASET ; &WHERE ; KEEP &TVAR &CENSVAR &STRATVAR STRATLAB STRATUM ; STRATLAB = PUT(&STRATVAR, &STRATFMT) ; STRATUM = &STRATVAR ; PROC MEANS DATA = GLIFE N SUM MEAN STD STDERR MIN MAX ; WHERE STRATUM NE . AND &TVAR GE 0 AND &CENSVAR GE 0 ; CLASS STRATLAB ; VAR &TVAR &CENSVAR ; OUTPUT OUT = GMEANS N = NTVAR NCENSVAR SUM = STVAR SCENSVAR MEAN = MTVAR MCENSVAR ; DATA ANNO1 ; RETAIN COUNT 0 ; SET GMEANS ; WHERE _TYPE_ EQ 1 AND NTVAR NE . AND NCENSVAR NE . ; ANNO1 = 1 ; FAILED = SCENSVAR ; TOTAL = NCENSVAR ; PCTFAIL = 100 * FAILED / TOTAL ; COUNT = COUNT + 1 ; PROC SORT DATA = ANNO1 ; BY DESCENDING COUNT ; PROC PRINT DATA = ANNO1 ; TITLE1 'PRINTOUT OF DATASET ANNO1 SORTED BY COUNT, DESCENDING' ; PROC PRINTTO PRINT = 'temp.out' NEW ; PROC LIFETEST DATA = GLIFE OUTSURV = SURCURVE NOTABLE ; TIME &TVAR * &CENSVAR(0) ; STRATA STRATUM ; TITLE1 ">ITLE1" ; TITLE2 ">ITLE2" ; TITLE3 ">ITLE3" ; RUN ; PROC PRINTTO ; RUN ; DATA SURCURVE ; RETAIN TMAX 0 ; SET SURCURVE ; IF _CENSOR_ NE 0 THEN DELETE ; IF &TVAR GT TMAX THEN TMAX = &TVAR ; UMAX = 1 ; /* PROC PRINT DATA = SURCURVE ; */ TITLE1 'FIRST PRINT OF SURCURVE ...' ; DATA YMAX ; KEEP STRATUM TMAX ; SET SURCURVE ; BY UMAX ; IF LAST.UMAX NE 1 THEN DELETE ; DO ISTRAT = 1 TO &NSTRATA ; STRATUM = ISTRAT ; OUTPUT ; END ; /* PROC PRINT DATA = YMAX ; */ PROC SORT DATA = SURCURVE ; BY STRATUM &TVAR ; DATA SURCURVE ; RETAIN TTVAR SURVVAR ; SET SURCURVE YMAX ; BY STRATUM ; LASTOBS = 0 ; IF &TVAR NE . THEN TTVAR = &TVAR ; IF &TVAR EQ . THEN TTVAR = TMAX ; IF &TVAR NE . THEN SURVVAR = SURVIVAL ; IF LAST.STRATUM EQ 1 THEN LASTOBS = 1 ; /* PROC PRINT DATA = SURCURVE ; */ TITLE1 'SECOND PRINT OF SURCURVE ...' ; DATA STATS ; RETAIN NGROUP 0 ; INFILE 'temp.out' ; INPUT TEST$@ ; IF TEST EQ 'Log-Rank' THEN DO ; INPUT CHISQ DF PVAL ; LTEST = 1 ; OUTPUT ; END ; IF TEST EQ 'Wilcoxon' THEN DO ; INPUT CHISQ DF PVAL ; LTEST = 2 ; OUTPUT ; END ; IF TEST EQ '-2Log(LR' THEN DO ; INPUT CHISQ DF PVAL ; LTEST = 3 ; OUTPUT ; END ; RUN ; PROC SORT DATA = STATS ; BY LTEST ; /* PROC PRINT DATA = STATS ; */ TITLE1 'PRINTOUT OF DATASET STATS, SORTED BY LTEST' ; DATA ANNO ; RETAIN CASE 0 Y TOP ALINE 0 MAXCOUNT 0 ; LENGTH TEXT $72 TEXTE $7 TEXTF $8 TEXTG $9 TEXTH $22 ; SET ANNO1 STATS SURCURVE ; COLOR = 'BLACK' ; XSYS = '1' ; YSYS = '1' ; SIZE = 3.00 ; HSYS = '1' ; STYLE = 'NONE' ; YF = 2.0 ; TOP2 = 5 * YF ; IF MAXCOUNT LT COUNT THEN MAXCOUNT = COUNT ; IF ANNO1 EQ 1 THEN DO ; FUNCTION = 'LABEL' ; POSITION = '>' ; ALINE = ALINE + 1 ; X = 2 ; Y = TOP2 + YF + 3 * YF * ALINE ; TEXT = LEFT(PUT(STRATLAB, CHAR25.)) ; OUTPUT ; X = 24 ; TEXT = RIGHT(PUT(TOTAL, 7.0)) ; OUTPUT ; X = 31 ; TEXT = RIGHT(PUT(FAILED, 8.0)) ; OUTPUT ; X = 40 ; TEXT = RIGHT(PUT(PCTFAIL, 9.2)) ; OUTPUT ; X = 47 ; TEXT = ' %' ; OUTPUT ; END ; TOP = TOP2 + 7 * YF + 3 * YF * MAXCOUNT ; IF ANNO1 EQ 1 AND COUNT EQ 1 THEN DO ; X = 2 ; Y = TOP2 + 5 * YF + 3 * YF * ALINE ; TEXT = 'STRATUM ' ; OUTPUT ; Y = TOP2 + 4 * YF + 3 * YF * ALINE ; TEXT = '_________' ; OUTPUT ; X = 24 ; Y = TOP2 + 5 * YF + 3 * YF * ALINE ; TEXT = 'TOTAL' ; OUTPUT ; Y = TOP2 + 4 * YF + 3 * YF * ALINE ; TEXT = '______' ; OUTPUT ; X = 32 ; Y = TOP2 + 5 * YF + 3 * YF * ALINE ; TEXT = 'EVENT ' ; OUTPUT ; Y = TOP2 + 4 * YF + 3 * YF * ALINE ; TEXT = '_______' ; OUTPUT ; X = 42 ; Y = TOP2 + 5 * YF + 3 * YF * ALINE ; TEXT = '% EVENT' ; OUTPUT ; Y = TOP2 + 4 * YF + 3 * YF * ALINE ; TEXT = '______' ; OUTPUT ; END ; IF CASE EQ 1 THEN DO ; CASE = CASE + 1 ; Y = TOP2 - YF ; FUNCTION = 'LABEL' ; POSITION = '>' ; X = 2 ; TEXT = 'LOG-RANK P-VALUE' ; OUTPUT ; END ; IF LTEST EQ 1 THEN DO ; Y = TOP2 - YF ; X = 30 ; TEXT = RIGHT(PUT(PVAL, 8.4)) ; OUTPUT ; END ; CASE = CASE + 1 ; IF CASE EQ 4 THEN DO ; SIZE = .1 ; FUNCTION = 'MOVE' ; X = 1 ; Y = 2 ; OUTPUT ; FUNCTION = 'DRAW' ; X = 55 ; Y = 2 ; OUTPUT ; FUNCTION = 'DRAW' ; X = 55 ; Y = TOP + 2 ; OUTPUT ; FUNCTION = 'DRAW' ; X = 1 ; Y = TOP + 2 ; OUTPUT ; FUNCTION = 'DRAW' ; X = 1 ; Y = 2 ; OUTPUT ; END ; IF LASTOBS EQ 1 THEN DO ; IF STRATUM EQ 1 THEN COLOR = 'BLACK' ; IF STRATUM EQ 2 THEN COLOR = 'GRAY8C' ; IF STRATUM EQ 3 THEN COLOR = 'GRAY4F' ; IF STRATUM EQ 4 THEN COLOR = 'GRAYCF' ; XSYS = '2' ; YSYS = '2' ; SIZE = 3.0 ; HSYS = '1' ; STYLE = 'NONE' ; STRATLAB = PUT(STRATUM, &STRATFMT) ; FUNCTION = 'LABEL' ; POSITION = '>' ; TEXT = STRATLAB ; X = TTVAR + .05 ; Y = SURVVAR ; OUTPUT ; END ; RUN ; PROC PRINT DATA = ANNO ; TITLE1 'PRINTOUT OF DATASET ANNO ... ' ; SYMBOL1 I = STEPLJ W = 1.5 H = 1 V = NONE L = 1 C = GRAY00 ; SYMBOL2 I = STEPLJ W = 1.5 H = 1 V = NONE L = 1 C = GRAY8F ; SYMBOL3 I = STEPLJ W = 1.5 H = 1 V = NONE L = 1 C = GRAY4F ; SYMBOL4 I = STEPLJ W = 1.5 H = 1 V = NONE L = 1 C = GRAYCF ; FOOTNOTE2 C = BLACK "macro bwlife.sas (jec) &sysdate &systime" ; LEGEND1 ACROSS = 1 CBORDER = BLACK LABEL = NONE VALUE = (H = 3) POSITION = (TOP RIGHT INSIDE) MODE = PROTECT ; PROC GPLOT DATA = SURCURVE ANNOTATE = ANNO ; PLOT SURVVAR * TTVAR = STRATUM / HAXIS = AXIS1 VAXIS = AXIS2 ; AXIS1 VALUE = (F = NONE H = 2.8 C = BLACK W = 1) C = BLACK ORDER = &TORDER LABEL = (H = 2.5 "&TLABEL" C = BLACK) ; AXIS2 VALUE = (F = NONE H = 2.2 C = BLACK W = 1) C = BLACK ORDER = &CORDER LABEL = (F = NONE H = 2.5 C = BLACK justify = left 'Proportion' justify = left 'No Event') ; TITLE1 C = BLACK H = 3.3 ">ITLE1" ; TITLE2 C = BLACK H = 3.3 ">ITLE2" ; TITLE3 C = BLACK H = 3.3 ">ITLE3" ; FORMAT STRATUM &STRATFMT ; RUN ; %MEND ; *=====================================================================; There are two aspects of this macro that should be mentioned: 1. The statistics box that is printed on the graph. 2. The labelling of the life-table curves. First, the statistics box. SAS produces an output dataset that can be used to generate survival curves in the procedure PROC LIFETEST. This procedure also prints data on three nonparametric tests for comparing two or more survival curves. However, PROC LIFETEST does not provide for an output dataset which includes these statistics. Therefore, for the macro to print the results of these statistical tests, it is necessary to (1) write the printout from PROC LIFETEST to an external file, and (2) get SAS to read the external file and locate the desired statistics. This is accomplished in the following lines from the macro: ---------------------------------------------------------------------------------- PROC PRINTTO PRINT = 'temp.out' NEW ; PROC LIFETEST DATA = GLIFE OUTSURV = SURCURVE NOTABLE ; TIME &TVAR * &CENSVAR(0) ; STRATA STRATUM ; TITLE1 ">ITLE1" ; TITLE2 ">ITLE2" ; TITLE3 ">ITLE3" ; RUN ; PROC PRINTTO ; RUN ; . . . DATA STATS ; RETAIN NGROUP 0 ; INFILE 'temp.out' ; INPUT TEST$@ ; IF TEST EQ 'Log-Rank' THEN DO ; INPUT CHISQ DF PVAL ; LTEST = 1 ; OUTPUT ; END ; IF TEST EQ 'Wilcoxon' THEN DO ; INPUT CHISQ DF PVAL ; LTEST = 2 ; OUTPUT ; END ; IF TEST EQ '-2Log(LR' THEN DO ; INPUT CHISQ DF PVAL ; LTEST = 3 ; OUTPUT ; END ; RUN ; ---------------------------------------------------------------------------------- The 'PROC PRINTTO' command causes output to be redirected to the external file 'temp.out'. This remains in effect until the PROC PRINTTO; RUN; line is encountered. Thus in this case the printout from PROC LIFETEST is redirected to temp.out. The desired statistics from PROC LIFETEST occur in the printout as follows: --------------------------------------------------------------------------------- Test of Equality over Strata Pr > Test Chi-Square DF Chi-Square Log-Rank 7.0023 4 0.1358 Wilcoxon 6.7741 4 0.1483 -2Log(LR) 6.3927 4 0.1717 --------------------------------------------------------------------------------- In the DATA STATS section, the file temp.out is read. The special input statement INPUT TEST$@ ; reads in 8 characters of text (character variables by default are 8 characters long). It reads the first 8 characters on a line which start with a nonblank character. The '@' at the end of the INPUT statement means: keep reading this same line in the next INPUT statement. The first statistic that we want from the printout is the log-rank statistic. This occurs on the file in the line for which 'Log-Rank' are the first 8 nonblank characters. Thus the numbers that are needed from this line are (1) the chi-square value, 7.0023, and (2) the degrees of freedom, 4, and (3) the p-value associated with the chi-square, .1358. Thus when the character variable TEST equals 'Log-Rank', we want to continue reading in that line to obtain the other 3 variables: CHISQ, DF, and PVAL. The values from these three variables are used later in an annotation dataset, ANNO. As in the previous macro, a box is drawn around the statistics obtained from the printout file (see graph). This version of the macro prints only the p-value from the log-rank test; other versions printed statistics from the log-rank, Wilcoxon, and -2Log(LR) [likelihood ratio chi-square] tests. The macro here depends heavily on the predictability of the PROC LIFETEST printout. If SAS were to change the labels on the lines with the test statistics, the macro would need to be modified accordingly. The labelling of the life-table curves is fairly straightforward, except for the positioning of the labels. We want the labels for the curves to occur just to the right of the end of the curves. This is accomplished in the following section of code: --------------------------------------------------------------------------------- IF LASTOBS EQ 1 THEN DO ; IF STRATUM EQ 1 THEN COLOR = 'BLACK' ; IF STRATUM EQ 2 THEN COLOR = 'GRAY8C' ; IF STRATUM EQ 3 THEN COLOR = 'GRAY4F' ; IF STRATUM EQ 4 THEN COLOR = 'GRAYCF' ; XSYS = '2' ; YSYS = '2' ; SIZE = 3.0 ; HSYS = '1' ; STYLE = 'NONE' ; STRATLAB = PUT(STRATUM, &STRATFMT) ; FUNCTION = 'LABEL' ; POSITION = '>' ; TEXT = STRATLAB ; X = TTVAR + .05 ; Y = SURVVAR ; OUTPUT ; --------------------------------------------------------------------------------- The variable 'LASTOBS' occurs on the dataset SURCURVE. It is set equal to zero for all observations except for the last observation in a given stratum. The strata here correspond to the different life table graphs. Thus LASTOBS = 1 indicates that the observation is the last one in that stratum in the output from PROC LIFETEST. The observations are sorted in increasing order of the time to event or time to censoring. Thus LASTOBS = 1 corresponds to the observation in that stratum with the largest event or censoring time. The idea now is to position a label just to the right of that last observation on the graph, and at the height of the y-value of the graph at that time. That is what is done in the code above. See the graph itself for the results. PROBLEM 27 The following data represents body weights for two groups of children: 1 10.25 85 1 10.50 83 1 10.75 83 1 11.00 88 1 11.25 89 1 11.50 81 1 11.75 89 1 12.00 86 1 12.25 88 1 12.50 93 1 12.75 96 1 13.00 94 1 13.25 95 1 13.50 94 1 13.75 102 1 14.00 102 1 14.25 94 1 14.50 104 1 14.75 103 1 15.00 103 2 10.25 92 2 10.50 89 2 10.75 91 2 11.00 97 2 11.25 96 2 11.50 98 2 11.75 103 2 12.00 105 2 12.25 104 2 12.50 104 2 12.75 110 2 13.00 108 2 13.25 115 2 13.50 113 2 13.75 113 2 14.00 117 2 14.25 115 2 14.50 124 2 14.75 128 2 15.00 122 The first column represents the group. The second is age of the child. The third is body weight. Write a macro which will produce a scatterplot of the data, with different symbols plotted for the two groups. The macro should also plot the regression lines of weight versus age, separately for the two groups. The macro should also perform a sequence of tests, based on proc reg or proc glm or proc iml, for the following null hypotheses: H01. The two regression lines have the same slopes and intercepts. H02. The two regression lines have the same intercepts. H03. The two regression lines may have different intercepts, but have the same slopes. The format of the call to this macro should be: %twolines (dataset, where, groupvar, symb1, symb2, xvar, yvar, title1, title2, title3) ; Here symb1 and symb2 are macro parameters which specify what symbol will be plotted for each group. The parameters title1, title2, and title3 are macro parameters for a 3-line title for the graph. /home/walleye/john-c/5421/notes.025 Last update: March 18, 2000