SAS MACROS, Continued                                    SPH 5421 notes.025

SAS Macros - SAS/GRAPH, Continued ...


     Graphs can be highly customized using SAS/GRAPH macros.  The following
is an example of a macro that we have used extensively in the Lung Health Study
for looking at longitudinal data.  The first section below is the part of the
main program that calls the macro.  The second section is the macro itself.
The third section discusses various features of this macro.

*============================================================================== ;

  %include 'gmac1d.sas' ;

  %GREG1D (ENDPTS, WHERE AGE NE ., AGEGE55,
          FOLLOWUP TIME (YEARS),0 TO 4.5 BY .5,
          0,YRSM06,YRSM12,YRSM18,YRSM24,YRSM30,YRSM36,YRSM42,YRSM48,
          MEAN POST-BD FEV1 (L),0.0 TO 2.8 BY .2,
          FEV1200, FEV1206, FEV1212, FEV1218, FEV1224, FEV1230,
          FEV1236, FEV1242, FEV1248,
          LUNG HEALTH STUDY 2, MEANS OF POST-BD FEV1 (L),
          BY AGE GROUP: LT 55 VERSUS GE 55,
          L,G,
          LT 55, GE 55,
          (BOTTOM LEFT INSIDE) OFFSET = (+4, 32), 14, 26) ;

 ENDSAS ;

*============================================================================== ;

%MACRO GREG1D (DATASET, WHERE, CLASSVAR,
               TLABEL, TORDER,
               T1, T2, T3, T4, T5, T6, T7, T8, T9,
               YLABEL, YORDER,
               W1, W2, W3, W4, W5, W6, W7, W8, W9,
               GTITLE1, GTITLE2, GTITLE3,
               SYMB1, SYMB2,
               DESC1, DESC2, POSITLEG, XPOSSTAT, YPOSSTAT) ;

/*****************************************************************************/
/*                                                                           */
/* Macro to graph data for two groups defined by CLASSVAR = 1 or 2.          */
/*                                                                           */
/* The model assumption for the Z-test mentioned below is that the data      */
/* are linear functions of time.                                             */
/*                                                                           */
/* The data are defined at up to 9 time points, T1 - T9.  The outcome        */
/* variables are W1 - W9.  SYMB1 and SYMB2 are symbols for graphing the      */
/* data for group 1 and group 2 respectively.  DESC1 and DESC2 are text      */
/* descriptions of the two groups.  POSITLEG is the position of the LEGEND,  */
/* (see the POSITION and OFFSET descriptions of the LEGEND statement).       */
/* XPOSSTAT and YPOSSTAT are position coordinates for a box containing info  */
/* on the intercepts and slopes and their standard devs for each group,    */
/* and z-statistics for comparisons of the two intercepts and slopes.        */
/*                                                                           */
/* DATASET   : Name of SAS dataset containing the data.                      */
/* CLASSVAR  : Classifying variable: Takes on values 1, 2.                   */
/* TLABEL    : Label for the x-axis (time variable usually)                  */
/* TORDER    : Order statement for the time variable axis (e.g., 0 to 60)    */
/* T1 ... T9 : Names of the time variables ... only 9 allowed.               */
/* YLABEL    : Label for the vertical axis (outcome variable)                */
/* YORDER    : Order statement for the outcome variable axis.                */
/* W1 ... W9 : Names of the outcome variable (one for each time point)       */
/* GTITLE1   : Line 1 of the graph title.                                    */
/* GTITLE2   : Line 2 of the graph title.                                    */
/* GTITLE3   : Line 3 of the graph title.                                    */
/* SYMB1     : Plotting symbol for the first group (CLASSVAR = 1)            */
/* SYMB2     : Plotting symbol for the second group (CLASSVAR = 2)           */
/* DESC1     : Description of first group for the LEGEND box.                */
/* DESC2     : Description of second group for the LEGEND box.               */
/* POSITLEG  : Position of the LEGEND box.  See SAS/GRAPH Vol 1 for syntax.  */
/*             Example: (BOTTOM RIGHT INSIDE) OFFSET = (10, 10)              */
/* XPOSSTAT  : X-axis position of the statistics box: 0-100, as percent.     */
/* YPOSSTAT  : Y-axis position of the stats box.  Note 0 = bottom, 100 = top.*/
/*                                                                           */
/* NOTE: In this version, if either YORDER or TORDER is blank, then the      */
/*       program defaults to using no ORDER statements for either axis.      */
/*       Also this version offsets the t-values by .05 for CLASSVAR = 2.     */
/*                                                                           */
/* NOTE: This version of the program prints standard deviations of           */
/*       slopes and intercepts, not standard errors.                         */
/*                                                                           */
/* NOTE: The WHERE clause has been added.
/*                                                                           */
/*       Update: Feb 24, 1999 J.E.C.                                         */
/*****************************************************************************/


footnote2 'macro greg1d.sas (jec))' ;

PROC SORT DATA = &DATASET ;
     BY &CLASSVAR ;

DATA REG0 ;
     SET &DATASET END = LAST ;
     &WHERE ;
     RETAIN NB 0 B1SUM 0 B12SUM 0 B1HATMIN 9999 B1HATMAX -9999
                 B0SUM 0 B02SUM 0 B0HATMIN 9999 B0HATMAX -9999
                 B01SUM 0 ;
     ARRAY X(9)  X1 - X9 ;
     ARRAY Y(9)  Y1 - Y9 ;
     ARRAY XY(9) XY1 - XY9 ;
     ARRAY XX(9) XX1 - XX9 ;
     ARRAY YY(9) YY1 - YY9 ;

     KEEP  X1  X2  X3  X4  X5  X6  X7  X8  X9
           Y1  Y2  Y3  Y4  Y5  Y6  Y7  Y8  Y9
          XX1 XX2 XX3 XX4 XX5 XX6 XX7 XX8 XX9
          YY1 YY2 YY3 YY4 YY5 YY6 YY7 YY8 YY9
          XY1 XY2 XY3 XY4 XY5 XY6 XY7 XY8 XY9
          NSUM XSUM YSUM X2SUM XYSUM B0HAT B1HAT
          NB B0SUM B02SUM B0HATMIN B0HATMAX
             B1SUM B12SUM B1HATMIN B1HATMAX B01SUM
          &CLASSVAR ;

     VLAST = LAST ;

     I1 = . ; I2 = . ; I3 = . ; I4 = . ;  I5 = . ; I6 = . ;
     I7 = . ; I8 = . ; I9 = . ;

     IF &T1 NE . AND &W1 NE . THEN I1 = 1 ;
     IF &T2 NE . AND &W2 NE . THEN I2 = 1 ;
     IF &T3 NE . AND &W3 NE . THEN I3 = 1 ;
     IF &T4 NE . AND &W4 NE . THEN I4 = 1 ;
     IF &T5 NE . AND &W5 NE . THEN I5 = 1 ;
     IF &T6 NE . AND &W6 NE . THEN I6 = 1 ;
     IF &T7 NE . AND &W7 NE . THEN I7 = 1 ;
     IF &T8 NE . AND &W8 NE . THEN I8 = 1 ;
     IF &T9 NE . AND &W9 NE . THEN I9 = 1 ;

     X1 = I1*&T1; X2 = I2*&T2; X3 = I3*&T3; X4 = I4*&T4; X5 = I5*&T5;
     X6 = I6*&T6; X7 = I7*&T7; X8 = I8*&T8; X9 = I9*&T9;
     Y1 = I1*&W1; Y2 = I2*&W2; Y3 = I3*&W3; Y4 = I4*&W4; Y5 = I5*&W5;
     Y6 = I6*&W6; Y7 = I7*&W7; Y8 = I8*&W8; Y9 = I9*&W9;

     XX1 = X1*X1; XX2 = X2*X2; XX3 = X3*X3; XX4 = X4*X4; XX5 = X5*X5;
     XX6 = X6*X6; XX7 = X7*X7; XX8 = X8*X8; XX9 = X9*X9;

     YY1 = Y1*Y1; YY2 = Y2*Y2; YY3 = Y3*Y3; YY4 = Y4*Y4; YY5 = Y5*Y5;
     YY6 = Y6*Y6; YY7 = Y7*Y7; YY8 = Y8*Y8; YY9 = Y9*Y9;

     XY1 = X1*Y1; XY2 = X2*Y2; XY3 = X3*Y3; XY4 = X4*Y4; XY5 = X5*Y5;
     XY6 = X6*Y6; XY7 = X7*Y7; XY8 = X8*Y8; XY9 = X9*Y9;

     NSUM = N(XY1, XY2, XY3, XY4, XY5, XY6, XY7, XY8, XY9) ;
     XSUM = SUM(X1, X2, X3, X4, X5, X6, X7, X8, X9) ;
     YSUM = SUM(Y1, Y2, Y3, Y4, Y5, Y6, Y7, Y8, Y9) ;
     X2SUM = SUM(XX1, XX2, XX3, XX4, XX5, XX6, XX7, XX8, XX9) ;
     XYSUM = SUM(XY1, XY2, XY3, XY4, XY5, XY6, XY7, XY8, XY9) ;

     B1HAT = . ;
     TOP = XYSUM - XSUM * YSUM / NSUM ;
     BOT = X2SUM - XSUM * XSUM / NSUM ;
     IF BOT GT 0 THEN B1HAT = TOP / BOT ;

     B0HAT = . ;
     XAVE = . ; IF NSUM GT 0 THEN XAVE = XSUM / NSUM ;
     YAVE = . ; IF NSUM GT 0 THEN YAVE = YSUM / NSUM ;
     IF YAVE NE 0 THEN B0HAT = YAVE - B1HAT * XAVE ;

     IF B1HAT NE . THEN DO ;
        NB = NB + 1 ;
        B0SUM = B0SUM + B0HAT ;
        B02SUM = B02SUM + B0HAT * B0HAT ;
        B01SUM = B01SUM + B0HAT * B1HAT ;
        IF B0HAT LT B0HATMIN THEN B0HATMIN = B0HAT ;
        IF B0HAT GT B0HATMAX THEN B0HATMAX = B0HAT ;

        B1SUM = B1SUM + B1HAT ;
        B12SUM = B12SUM + B1HAT * B1HAT ;
        IF B1HAT LT B1HATMIN THEN B1HATMIN = B1HAT ;
        IF B1HAT GT B1HATMAX THEN B1HATMAX = B1HAT ;
     END ;

/*PROC PRINT DATA = REG0 ;                        */
/*     WHERE B1HAT NE . AND B1HAT LT -10 ;        */
/*     VAR LHS2ID X1 X2 X3 X4 X5 X6 X7 X8 X9      */
/*                Y1 Y2 Y3 Y4 Y5 Y6 Y7 Y8 Y9      */
/*                NSUM XSUM YSUM X2SUM XYSUM      */
/*                B1HAT B0HAT ;                   */

PROC MEANS DATA = REG0 N MEAN STD STDERR MIN MAX ;
     CLASS &CLASSVAR ;
     VAR B0HAT B1HAT X1 X2 X3 X4 X5 X6 X7 X8 X9 
                     Y1 Y2 Y3 Y4 Y5 Y6 Y7 Y8 Y9 ;
     OUTPUT OUT = BMEANS
                  MEAN   = B0HAT   B1HAT
                  STDERR = SEB0HAT SEB1HAT
                  STD    = SDB0HAT SDB1HAT ;

PROC PRINT DATA = BMEANS ;
TITLE1 'PROC PRINT FOR DATASET BMEANS' ;

DATA ANNOB0 ;
     RETAIN B0HAT1 SEB0HAT1 SDB0HAT1 VB0HAT1
            B1HAT1 SEB1HAT1 SDB1HAT1 VB1HAT1 ;
     LENGTH TEXT $72 ;
     SET BMEANS ;
         IF &CLASSVAR EQ 1 THEN DO ;
            B0HAT1 = B0HAT ;
            SEB0HAT1 = SEB0HAT ;
            SDB0HAT1 = SDB0HAT ;
            VB0HAT1 = SEB0HAT1 * SEB0HAT1 ;

            B1HAT1 = B1HAT ;
            SEB1HAT1 = SEB1HAT ;
            SDB1HAT1 = SDB1HAT ;
            VB1HAT1 = SEB1HAT1 * SEB1HAT1 ;
         END ;
         IF &CLASSVAR EQ 2 THEN DO ;
            B0HAT2 = B0HAT ;
            SEB0HAT2 = SEB0HAT ;
            SDB0HAT2 = SDB0HAT ;
            VB0HAT2 = SEB0HAT2 * SEB0HAT2 ;
            VARCOMB0 = VB0HAT1 + VB0HAT2 ;
            SERCOMB0 = SQRT(VARCOMB0) ;
            A0DIFF12 = ABS(B0HAT1 - B0HAT2) ;
            Z0DIFF12 = A0DIFF12 / SERCOMB0 ;
            P0DIFF12 = 2 * (1 - PROBNORM(Z0DIFF12)) ;

            B1HAT2 = B1HAT ;
            SEB1HAT2 = SEB1HAT ;
            SDB1HAT2 = SDB1HAT ;
            VB1HAT2 = SEB1HAT2 * SEB1HAT2 ;
            VARCOMB1 = VB1HAT1 + VB1HAT2 ;
            SERCOMB1 = SQRT(VARCOMB1) ;
            A1DIFF12 = ABS(B1HAT1 - B1HAT2) ;
            Z1DIFF12 = A1DIFF12 / SERCOMB1 ;
            P1DIFF12 = 2 * (1 - PROBNORM(Z1DIFF12)) ;

            FUNCTION = 'LABEL' ; POSITION = '>' ;
            X = &XPOSSTAT ; Y = &YPOSSTAT ;
            TEXT = 'REGRESSIONS (TWO STRAIGHT LINES): MEANS +/- STD DEV' ; OUTPUT ;

            Y = Y - 2 ;
            FUNCTION = 'LABEL' ; POSTION = '>' ;
            TEXTA = LEFT(PUT(B0HAT1, 7.3)) ;
            TEXTB = LEFT(PUT(SDB0HAT1, 7.3)) ;
            TEXTC = LEFT(PUT(B1HAT1, 7.3)) ;
            TEXTD = LEFT(PUT(SDB1HAT1, 7.3)) ;
            TEXTE = "&SYMB1" || ' INTCPT: ' || TEXTA || ' +/- ' || TEXTB ;
            TEXT =    TEXTE  || ' SLOPE: '  || TEXTC || ' +/- ' || TEXTD ;
            OUTPUT ;

            Y = Y - 2 ;
            FUNCTION = 'LABEL' ; POSITION = '>' ;
            TEXTA = LEFT(PUT(B0HAT2, 7.3)) ;
            TEXTB = LEFT(PUT(SDB0HAT2, 7.3)) ;
            TEXTC = LEFT(PUT(B1HAT2, 7.3)) ;
            TEXTD = LEFT(PUT(SDB1HAT2, 7.3)) ;
            TEXTE = "&SYMB2" || ' INTCPT: ' || TEXTA || ' +/- ' || TEXTB ;
            TEXT =     TEXTE || ' SLOPE: '  || TEXTC || ' +/- ' || TEXTD ;
            OUTPUT ;

            Y = Y - 3 ;
            FUNCTION = 'LABEL' ; POSITION = '>' ;
            TEXTC = LEFT(PUT(Z0DIFF12, 6.3)) ;
            TEXTD = LEFT(PUT(P0DIFF12, 7.4)) ;
            TEXT = 'INTCPT Z-STAT = ' || TEXTC || ' P-VALUE = ' || TEXTD ; OUTPUT ;

            Y = Y - 2 ;
            FUNCTION = 'LABEL' ; POSITION = '>' ;
            TEXTC = LEFT(PUT(Z1DIFF12, 6.3)) ;
            TEXTD = LEFT(PUT(P1DIFF12, 7.4)) ;
            TEXT = 'SLOPE  Z-STAT = ' || TEXTC || ' P-VALUE = ' || TEXTD ; OUTPUT ;

            FUNCTION = 'MOVE' ; X = &XPOSSTAT - 1 ; Y = &YPOSSTAT + 2 ; OUTPUT ;
            FUNCTION = 'DRAW' ; X = X + 40        ; Y = Y             ; OUTPUT ;
            FUNCTION = 'DRAW' ; X = X             ; Y = Y - 13        ; OUTPUT ;
            FUNCTION = 'DRAW' ; X = X - 40        ; Y = Y             ; OUTPUT ;
            FUNCTION = 'DRAW' ; X = X             ; Y = Y + 13        ; OUTPUT ;

         END ;

DATA BYTIMES ;
     SET REG0 ;
     KEEP &CLASSVAR TIME XVAR YVAR X1 X2 X3 Y1 Y2 Y3 ;
     ARRAY X(9)  X1 - X9 ;
     ARRAY Y(9)  Y1 - Y9 ;
     ARRAY XY(9) XY1 - XY9 ;
     ARRAY XX(9) XX1 - XX9 ;
     ARRAY YY(9) YY1 - YY9 ;

     DO I = 1 TO 9 ;

        TIME = I ;
        XVAR = X(I) ; YVAR = Y(I) ;
        IF XY(I) NE . THEN OUTPUT ;

     END ;

PROC MEANS DATA = REG0 N MEAN STD STDERR MIN MAX ;
     CLASS &CLASSVAR ;
     VAR B0HAT B1HAT X1 X2 X3 Y1 Y2 Y3 ;
TITLE1 'MEANS OF REG0 AFTER BYTIMES WAS CREATED ...' ;

PROC MEANS DATA = BYTIMES  N MEAN STD STDERR MIN MAX ;
     CLASS &CLASSVAR TIME ;
     VAR   XVAR YVAR X1 X2 X3 Y1 Y2 Y3 ;
     OUTPUT OUT    = GMEANS
            N      = NX NY
            MEAN   = XMEAN YMEAN
            STDERR = XSTDERR YSTDERR ;
TITLE1 ">ITLE1" ;
TITLE2 ">ITLE2" ;
TITLE3 ">ITLE3" ;

DATA ANNONUMS ;
     SET GMEANS ;
     IF &CLASSVAR EQ . OR TIME EQ . THEN DELETE ;

     FUNCTION = 'LABEL' ;
     XSYS = '2' ; YSYS = '1' ; POSITION = '6' ; SIZE = 1 ;

     IF TIME EQ 1 THEN DO ;
       X = XMEAN ; Y = 11 - 3 * &CLASSVAR ;
       TEXTA = "&SYMB1" || ': ' ;
       IF &CLASSVAR EQ 2 THEN TEXTA = "&SYMB2" || ': ' ;
       TEXTB = LEFT(PUT(NX, 4.)) ;
       TEXT = TEXTA || TEXTB ;
       END ;
     ELSE IF TIME GT 1 THEN DO ;
       X = XMEAN ;
       Y = 11 - 3 * &CLASSVAR ;
       TEXT = LEFT(PUT(NX, 4.)) ;
     END;

DATA ANNOTOT ;
     SET ANNOB0 ANNONUMS ;

PROC PRINT DATA = ANNOTOT ;

DATA TRIPLES ;
     SET GMEANS ;
     IF &CLASSVAR EQ . OR TIME EQ . THEN DELETE ;
     XV = XMEAN + .05 * (&CLASSVAR - 1) ;
     YV = YMEAN + 2 * YSTDERR ; OUTPUT ;
     YV = YMEAN - 2 * YSTDERR ; OUTPUT ;
     YV = YMEAN               ; OUTPUT ;

PROC PRINT DATA = TRIPLES ;

SYMBOL1 I = HILOCTJ W = 2  H = 2 L = 1 V = &SYMB1  C = GRAY00 F = SWISS ;
SYMBOL2 I = HILOCTJ W = 3  H = 2 L =33 V = &SYMB2  C = GRAY80 F = SWISS ;

LEGEND1 ACROSS = 1  FRAME  LABEL = NONE
        VALUE = (TICK = 1 H = 2 C = BLACK  "&DESC1"
                 TICK = 2 H = 2 C = GRAY80 "&DESC2")
        POSITION = &POSITLEG MODE = PROTECT ;

%IF %LENGTH(&YORDER) EQ 0 %THEN %GOTO VERS2 ;
%IF %LENGTH(&TORDER) EQ 0 %THEN %GOTO VERS2 ;

PROC GPLOT DATA = TRIPLES  ANNOTATE = ANNOTOT ;
     PLOT YV * XV = &CLASSVAR / FRAME LEGEND = LEGEND1
          HAXIS = AXIS1  VAXIS = AXIS2 ;
          AXIS1 COLOR = BLACK VALUE = (F = SWISS H = 2)
                ORDER  = &TORDER
                LABEL = (C = BLACK  F = SWISS H = 2.5
                         "&TLABEL") ;
          AXIS2 COLOR = BLACK VALUE = (F = SWISS H = 2)
                ORDER = &YORDER
                LABEL = (C = BLACK A = 90 F = SWISS H = 2.5
                         "&YLABEL") ;
 TITLE1 J = C H = 2.5 F = SWISS  ">ITLE1" ;
 TITLE2 J = C H = 2.5 F = SWISS  ">ITLE2" ;
 TITLE3 J = C H = 2.5 F = SWISS  ">ITLE3" ;
 RUN ;

%GOTO ENDJUMP ;

%VERS2: ;

PROC GPLOT DATA = TRIPLES  ANNOTATE = ANNOTOT ;
     PLOT YV * XV = &CLASSVAR / FRAME LEGEND = LEGEND1
          HAXIS = AXIS1  VAXIS = AXIS2 ;
          AXIS1 COLOR = BLACK VALUE = (F = SWISS H = 2)
                LABEL = (C = BLACK  F = SWISS H = 2.5
                         "&TLABEL") ;
          AXIS2 COLOR = BLACK VALUE = (F = SWISS H = 2)
                LABEL = (C = BLACK A = 90 F = SWISS H = 2.5
                         "&YLABEL") ;
 TITLE1 J = C H = 2.5 F = SWISS  ">ITLE1" ;
 TITLE2 J = C H = 2.5 F = SWISS  ">ITLE2" ;
 TITLE3 J = C H = 2.5 F = SWISS  ">ITLE3" ;
 RUN ;

%ENDJUMP: ;

%MEND ;

* =========================================================================== ;

     The macro greg1d above includes several features that may be useful:

     1.  The first parameter, DATASET, specifies which SAS dataset the
         macro will be dealing with.

     2.  The second parameter, WHERE, is intended to be a phrase which can
         define a subset of the dataset; for example, in the calling program,
         WHERE is specified as

                  WHERE AGE NE .

     3.  A number of arrays are defined internally in the macro.  These are
         used to store values which are used in computing slopes and intercepts
         of lines.  Putting the values in arrays simplifies the calculations.

     4.  The OUTPUT OUT = BMEANS statement in PROC MEANS includes options for
         putting the means, standard errors, and standard deviations in the
         dataset BMEANS.  These are used later in computing confidence intervals
         for the means of the outcome variable (W1, W2, ..., W9) at each
         time point.

     5.  A complicated annotation dataset 'ANNOB0' is constructed, based on statistics
         in the dataset BMEANS.  The data from the ANNOTATE dataset is used
         later in PROC GPLOT to print the summary statistics in a box on the
         graph.  Most of the content of ANNOTATE is text about the statistics
         that goes in the box.  The FUNCTION = 'LABEL' statements cause output
         of the text.

         The dataset ANNOB0 also contains instructions to draw a box around
         around the statistics summary.  This is accomplished by using
         FUNCTION = 'DRAW' and FUNCTION = 'MOVE' statements.

     6.  A second annotation dataset, ANNONUMS, is constructed to produce the
         counts of observations in the two groups at each time point (printed
         along the bottom of the graph; see graph).

     7.  Both of the annotate datasets are put together in one dataset, ANNOTOT.
         ANNOTOT is what is used the PROC GPLOT line:

                 PROC GPLOT DATA = TRIPLES  ANNOTATE = ANNOTOT ;

     8.  The dataset TRIPLES is based on output from a PROC MEANS procedure.
         It is comprised of the mean values for the x-variable and the y-variable
         at each time point.  It actually includes three y observations for each
         time point: the mean, the mean + 2 std errs, and the mean - 2 std errs.
         The SYMBOL statements,

         SYMBOL1 I = HILOCTJ W = 2 H = 2 L = 1 V = &SYMB1 C = GRAY00 F = SWISS;
         SYMBOL2 I = HILOCTJ W = 3 H = 2 L = 1 V = &SYMB2 C = GRAY80 F = SWISS;

         tell GPLOT to plot the mean value and error bars above and below the
         mean, and join the successive means together (HILOCTJ).  Note that
         the plotting symbols &SYMB1 and &SYMB2 are specified in the call to
         the macro.


     This is admittedly a complex and difficult macro.  It shows some of the
capabilities of SAS/GRAPH, but it also indicates that you need to know quite a
few details of how SAS/GRAPH works to write something like this.

     The next program, bwlife.sas, produces life-table graphs:

==================================================================================

%MACRO BWLIFE (DATASET, WHERE, NSTRATA, STRATVAR, STRATFMT,
              TVAR, TLABEL, TORDER,
              CENSVAR, CENSLABL, CORDER,
              GTITLE1, GTITLE2, GTITLE3) ;

/*****************************************************************************/
/*                                                                           */
/* Macro to graph survival data for several strata, and print the            */
/* relevant statistics (Log-rank, Wilcoxon, -2 * Log Likelihood).            */
/*                                                                           */
/* DATASET   : Name of SAS dataset containing the data                       */
/* WHERE     : Where statement, such as  WHERE X GT 0                        */
/* NSTRATA   : Number of strata - no more than 9, for labels to work.        */
/* STRATVAR  : Stratifying variable [class variable]                         */
/* STRATFMT  : Format definition for stratifying variable                    */
/*             Note: formats should be 22 characters or less.                */
/* TVAR      : Time variable                                                 */
/* TLABEL    : Label (goes below horiz axis) for Time variable               */
/* TORDER    : Order statement for Time variable                             */
/*                                                                           */
/* CENSVAR   : Censoring variable: value 0 indicates censored obsn,          */
/*             and 1 indicates event.                                        */
/* CENSLABL  : Label for censoring variable (vertical axis)                  */
/* CORDER    : Order statement for vertical axis                             */
/*                                                                           */
/* GTITLE1   : 1st line of graph Title                                       */
/* GTITLE2   : 2nd line of graph Title                                       */
/* GTITLE3   : 3rd line of graph Title                                       */
/*                                                                           */
/* Author    : J. Connett.  Last modify date: 11-27-99.                      */
/*                                                                           */
/*             Note: This macro is like dlife.sas but black & white.         */
/*                                                                           */
/*             This is also similar to clife.sas, except the vertical        */
/*             label is always 'Proportion | No Event'.                      */
/*                                                                           */
/*****************************************************************************/

*=====================================================================;

GOPTIONS
         RESET = GLOBAL
         VPOS =  78 HPOS =  85
         HSIZE = 21 cm
         VSIZE = 14 cm
         ROTATE = LANDSCAPE
         FTEXT = NONE
         DEVICE = PS
         GACCESS = SASGASTD
         GSFNAME = GRAPH
         GUNIT = PCT BORDER
         CBACK = WHITE
         CTEXT = BLACK
         FTEXT = NONE
         CHARTYPE = 11
         HTITLE = 1.5 HTEXT = 1.5 ;

*=====================================================================;

DATA GLIFE ;
     SET &DATASET ;
     &WHERE ;
     KEEP &TVAR &CENSVAR &STRATVAR STRATLAB STRATUM ;
     STRATLAB = PUT(&STRATVAR, &STRATFMT) ;
     STRATUM = &STRATVAR ;

PROC MEANS DATA = GLIFE N SUM MEAN STD STDERR MIN MAX ;
     WHERE STRATUM NE . AND &TVAR GE 0 AND &CENSVAR GE 0 ;
     CLASS STRATLAB ;
     VAR &TVAR &CENSVAR ;
     OUTPUT OUT = GMEANS
            N = NTVAR NCENSVAR
            SUM = STVAR SCENSVAR
            MEAN = MTVAR MCENSVAR ;

DATA ANNO1 ;
     RETAIN COUNT 0 ;
     SET GMEANS ;
     WHERE _TYPE_ EQ 1 AND NTVAR NE . AND NCENSVAR NE . ;
     ANNO1 = 1 ;
     FAILED = SCENSVAR ;
     TOTAL = NCENSVAR ;
     PCTFAIL = 100 * FAILED / TOTAL ;
     COUNT = COUNT + 1 ;

PROC SORT DATA = ANNO1 ; BY DESCENDING COUNT ;

PROC PRINT DATA = ANNO1 ;
TITLE1 'PRINTOUT OF DATASET ANNO1 SORTED BY COUNT, DESCENDING' ;

PROC PRINTTO PRINT = 'temp.out' NEW ;

PROC LIFETEST DATA = GLIFE OUTSURV = SURCURVE NOTABLE ;
     TIME &TVAR * &CENSVAR(0) ;
     STRATA STRATUM ;
TITLE1 ">ITLE1" ;
TITLE2 ">ITLE2" ;
TITLE3 ">ITLE3" ;
RUN ;

PROC PRINTTO ; RUN ;

DATA SURCURVE ;
     RETAIN TMAX 0 ;
     SET SURCURVE ;
     IF _CENSOR_ NE 0 THEN DELETE ;
     IF &TVAR GT TMAX THEN TMAX = &TVAR ;
     UMAX = 1 ;

/* PROC PRINT DATA = SURCURVE ;   */
TITLE1 'FIRST PRINT OF SURCURVE ...' ;

DATA YMAX ;
     KEEP STRATUM TMAX ;
     SET SURCURVE ;
     BY UMAX ;
     IF LAST.UMAX NE 1 THEN DELETE ;

     DO ISTRAT = 1 TO &NSTRATA ;
        STRATUM = ISTRAT ;
        OUTPUT ;
     END ;

/* PROC PRINT DATA = YMAX ;  */

PROC SORT DATA = SURCURVE ;
     BY STRATUM &TVAR ;

DATA SURCURVE ;
     RETAIN TTVAR SURVVAR ;
     SET SURCURVE YMAX ;
     BY STRATUM ;
     LASTOBS = 0 ;
     IF &TVAR NE . THEN TTVAR = &TVAR ;
     IF &TVAR EQ . THEN TTVAR = TMAX ;
     IF &TVAR NE . THEN SURVVAR = SURVIVAL ;
     IF LAST.STRATUM EQ 1 THEN LASTOBS = 1 ;

/* PROC PRINT DATA = SURCURVE ; */
TITLE1 'SECOND PRINT OF SURCURVE ...' ;

DATA STATS ;
  RETAIN NGROUP 0 ;
  INFILE 'temp.out' ;
  INPUT TEST$@ ;

  IF TEST EQ 'Log-Rank' THEN DO ;
     INPUT CHISQ DF PVAL ;
     LTEST = 1 ; OUTPUT ;
  END ;

  IF TEST EQ 'Wilcoxon' THEN DO ;
     INPUT CHISQ DF PVAL ;
     LTEST = 2 ; OUTPUT ;
  END ;

  IF TEST EQ '-2Log(LR' THEN DO ;
     INPUT CHISQ DF PVAL ;
     LTEST = 3 ; OUTPUT ;
  END ;

  RUN ;

PROC SORT DATA = STATS ; BY LTEST ;

/*  PROC PRINT DATA = STATS ; */
TITLE1 'PRINTOUT OF DATASET STATS, SORTED BY LTEST' ;

DATA ANNO ;
     RETAIN CASE 0  Y  TOP ALINE 0 MAXCOUNT 0 ;
     LENGTH TEXT $72  TEXTE $7 TEXTF $8 TEXTG $9 TEXTH $22 ;
     SET ANNO1 STATS SURCURVE ;

     COLOR = 'BLACK' ;
     XSYS = '1' ; YSYS = '1' ; SIZE = 3.00 ; HSYS = '1' ;
     STYLE = 'NONE' ;
     YF = 2.0 ;
     TOP2 = 5 * YF ;

     IF MAXCOUNT LT COUNT THEN MAXCOUNT = COUNT ;

     IF ANNO1 EQ 1 THEN DO ;

        FUNCTION = 'LABEL' ; POSITION = '>' ;
        ALINE = ALINE + 1 ;
        X = 2 ; Y = TOP2 + YF + 3 * YF * ALINE ;
        TEXT = LEFT(PUT(STRATLAB, CHAR25.)) ; OUTPUT ;

        X = 24 ;
        TEXT = RIGHT(PUT(TOTAL,  7.0)) ;      OUTPUT ;

        X = 31 ;
        TEXT  = RIGHT(PUT(FAILED,  8.0)) ;    OUTPUT ;

        X = 40 ;
        TEXT = RIGHT(PUT(PCTFAIL, 9.2)) ;     OUTPUT ;

        X = 47 ;
        TEXT = ' %' ;                         OUTPUT ;

     END ;

     TOP = TOP2 + 7 * YF + 3 * YF * MAXCOUNT ;

     IF ANNO1 EQ 1 AND COUNT EQ 1 THEN DO ;

       X = 2 ; Y = TOP2 + 5 * YF + 3 * YF * ALINE ;
       TEXT = 'STRATUM ' ;                    OUTPUT ;
       Y = TOP2 + 4 * YF + 3 * YF * ALINE ;
       TEXT = '_________' ;                   OUTPUT ;

       X = 24 ; Y = TOP2 + 5 * YF + 3 * YF * ALINE ;
       TEXT = 'TOTAL' ;                       OUTPUT ;
       Y = TOP2 + 4 * YF + 3 * YF * ALINE ;
       TEXT = '______' ;                      OUTPUT ;

       X = 32 ; Y = TOP2 + 5 * YF + 3 * YF * ALINE ;
       TEXT = 'EVENT ' ;                      OUTPUT ;
       Y = TOP2 + 4 * YF + 3 * YF * ALINE ;
       TEXT = '_______' ;                      OUTPUT ;

       X = 42 ; Y = TOP2 + 5 * YF + 3 * YF * ALINE ;
       TEXT = '% EVENT' ;                      OUTPUT ;
       Y = TOP2 + 4 * YF + 3 * YF * ALINE ;
       TEXT = '______' ;                      OUTPUT ;

     END ;

     IF CASE EQ 1 THEN DO ;

        CASE = CASE + 1 ;

        Y = TOP2 - YF ;
        FUNCTION = 'LABEL' ; POSITION = '>' ;
        X =  2 ; TEXT = 'LOG-RANK P-VALUE' ; OUTPUT ;

     END ;

     IF LTEST EQ 1 THEN DO ;

        Y = TOP2 - YF ;
        X =  30 ; TEXT = RIGHT(PUT(PVAL, 8.4)) ;   OUTPUT ;

     END ;

     CASE = CASE + 1 ;
     IF CASE EQ 4 THEN DO ;

       SIZE = .1 ;
       FUNCTION = 'MOVE' ; X = 1      ; Y = 2        ; OUTPUT ;
       FUNCTION = 'DRAW' ; X = 55     ; Y = 2        ; OUTPUT ;
       FUNCTION = 'DRAW' ; X = 55     ; Y = TOP  + 2 ; OUTPUT ;
       FUNCTION = 'DRAW' ; X = 1      ; Y = TOP  + 2 ; OUTPUT ;
       FUNCTION = 'DRAW' ; X = 1      ; Y = 2        ; OUTPUT ;

     END ;

     IF LASTOBS EQ 1 THEN DO ;

        IF STRATUM EQ 1 THEN COLOR = 'BLACK'  ;
        IF STRATUM EQ 2 THEN COLOR = 'GRAY8C' ;
        IF STRATUM EQ 3 THEN COLOR = 'GRAY4F' ;
        IF STRATUM EQ 4 THEN COLOR = 'GRAYCF' ;
        XSYS = '2' ; YSYS = '2' ; SIZE = 3.0 ; HSYS = '1' ;
        STYLE = 'NONE' ;
        STRATLAB = PUT(STRATUM, &STRATFMT) ;
        FUNCTION = 'LABEL' ; POSITION = '>' ;
        TEXT = STRATLAB ;
        X = TTVAR + .05 ; Y = SURVVAR ;
        OUTPUT ;

     END ;

RUN ;

PROC PRINT DATA = ANNO ;
TITLE1 'PRINTOUT OF DATASET ANNO ... ' ;

SYMBOL1 I = STEPLJ W = 1.5  H = 1 V = NONE L =  1 C = GRAY00  ;
SYMBOL2 I = STEPLJ W = 1.5  H = 1 V = NONE L =  1 C = GRAY8F  ;
SYMBOL3 I = STEPLJ W = 1.5  H = 1 V = NONE L =  1 C = GRAY4F  ;
SYMBOL4 I = STEPLJ W = 1.5  H = 1 V = NONE L =  1 C = GRAYCF  ;

FOOTNOTE2 C = BLACK "macro bwlife.sas (jec) &sysdate &systime" ;

 LEGEND1 ACROSS = 1  CBORDER = BLACK  LABEL = NONE
         VALUE = (H = 3)
         POSITION = (TOP RIGHT INSIDE) MODE = PROTECT ;

PROC GPLOT DATA = SURCURVE ANNOTATE = ANNO ;
     PLOT SURVVAR * TTVAR = STRATUM /
          HAXIS = AXIS1  VAXIS = AXIS2 ;
          AXIS1 VALUE = (F = NONE H = 2.8 C = BLACK W = 1)
                C = BLACK ORDER = &TORDER
                LABEL = (H = 2.5 "&TLABEL" C = BLACK) ;
          AXIS2 VALUE = (F = NONE H = 2.2 C = BLACK W = 1)
                C = BLACK ORDER = &CORDER
                LABEL = (F = NONE H = 2.5 C = BLACK
                        justify = left 'Proportion'
                        justify = left 'No Event') ;
TITLE1 C = BLACK   H = 3.3 ">ITLE1" ;
TITLE2 C = BLACK   H = 3.3 ">ITLE2" ;
TITLE3 C = BLACK   H = 3.3 ">ITLE3" ;
FORMAT STRATUM &STRATFMT ;
RUN ;

%MEND ;
*=====================================================================;

     There are two aspects of this macro that should be mentioned:

     1.  The statistics box that is printed on the graph.

     2.  The labelling of the life-table curves.

     First, the statistics box.  SAS produces an output dataset that can be used
to generate survival curves in the procedure PROC LIFETEST.  This procedure also
prints data on three nonparametric tests for comparing two or more survival
curves.  However, PROC LIFETEST does not provide for an output dataset which
includes these statistics.

     Therefore, for the macro to print the results of these statistical tests,
it is necessary to (1) write the printout from PROC LIFETEST to an external
file, and (2) get SAS to read the external file and locate the desired statistics.

     This is accomplished in the following lines from the macro:

----------------------------------------------------------------------------------

     PROC PRINTTO PRINT = 'temp.out' NEW ;

     PROC LIFETEST DATA = GLIFE OUTSURV = SURCURVE NOTABLE ;
          TIME &TVAR * &CENSVAR(0) ;
          STRATA STRATUM ;
     TITLE1 ">ITLE1" ;
     TITLE2 ">ITLE2" ;
     TITLE3 ">ITLE3" ;
     RUN ;

     PROC PRINTTO ; RUN ;
       .
       .
       .

     DATA STATS ;
       RETAIN NGROUP 0 ;
       INFILE 'temp.out' ;
       INPUT TEST$@ ;

       IF TEST EQ 'Log-Rank' THEN DO ;
          INPUT CHISQ DF PVAL ;
          LTEST = 1 ; OUTPUT ;
       END ;

       IF TEST EQ 'Wilcoxon' THEN DO ;
          INPUT CHISQ DF PVAL ;
          LTEST = 2 ; OUTPUT ;
       END ;

       IF TEST EQ '-2Log(LR' THEN DO ;
          INPUT CHISQ DF PVAL ;
          LTEST = 3 ; OUTPUT ;
       END ;

       RUN ;

----------------------------------------------------------------------------------

     The 'PROC PRINTTO' command causes output to be redirected to the external
file 'temp.out'.  This remains in effect until the PROC PRINTTO; RUN; line
is encountered.  Thus in this case the printout from PROC LIFETEST is redirected
to temp.out.

     The desired statistics from PROC LIFETEST occur in the printout as follows:

---------------------------------------------------------------------------------

                    Test of Equality over Strata

                                              Pr >
               Test      Chi-Square    DF  Chi-Square

               Log-Rank      7.0023     4      0.1358
               Wilcoxon      6.7741     4      0.1483
               -2Log(LR)     6.3927     4      0.1717

---------------------------------------------------------------------------------


     In the DATA STATS section, the file temp.out is read.  The special input
statement

     INPUT TEST$@ ;

reads in 8 characters of text (character variables by default are 8 characters
long).  It reads the first 8 characters on a line which start with a nonblank
character.  The '@' at the end of the INPUT statement means: keep reading this
same line in the next INPUT statement.

     The first statistic that we want from the printout is the log-rank statistic.
This occurs on the file in the line for which 'Log-Rank' are the first 8 nonblank
characters.  Thus the numbers that are needed from this line are (1) the
chi-square value, 7.0023, and (2) the degrees of freedom, 4, and (3) the p-value
associated with the chi-square, .1358.

     Thus when the character variable TEST equals 'Log-Rank', we want to continue
reading in that line to obtain the other 3 variables: CHISQ, DF, and PVAL.
The values from these three variables are used later in an annotation dataset,
ANNO.  As in the previous macro, a box is drawn around the statistics obtained
from the printout file (see graph).  This version of the macro prints only
the p-value from the log-rank test; other versions printed statistics from
the log-rank, Wilcoxon, and -2Log(LR) [likelihood ratio chi-square] tests.

     The macro here depends heavily on the predictability of the PROC LIFETEST
printout.  If SAS were to change the labels on the lines with the test
statistics, the macro would need to be modified accordingly.



     The labelling of the life-table curves is fairly straightforward, except for
the positioning of the labels.  We want the labels for the curves to occur just
to the right of the end of the curves.  This is accomplished in the following
section of code:

---------------------------------------------------------------------------------

     IF LASTOBS EQ 1 THEN DO ;

        IF STRATUM EQ 1 THEN COLOR = 'BLACK'  ;
        IF STRATUM EQ 2 THEN COLOR = 'GRAY8C' ;
        IF STRATUM EQ 3 THEN COLOR = 'GRAY4F' ;
        IF STRATUM EQ 4 THEN COLOR = 'GRAYCF' ;
        XSYS = '2' ; YSYS = '2' ; SIZE = 3.0 ; HSYS = '1' ;
        STYLE = 'NONE' ;
        STRATLAB = PUT(STRATUM, &STRATFMT) ;
        FUNCTION = 'LABEL' ; POSITION = '>' ;
        TEXT = STRATLAB ;
        X = TTVAR + .05 ; Y = SURVVAR ;
        OUTPUT ;

---------------------------------------------------------------------------------

     The variable 'LASTOBS' occurs on the dataset SURCURVE.  It is set equal to
zero for all observations except for the last observation in a given stratum.  The
strata here correspond to the different life table graphs.  Thus LASTOBS = 1
indicates that the observation is the last one in that stratum in the output
from PROC LIFETEST.  The observations are sorted in increasing order of the time
to event or time to censoring.  Thus LASTOBS = 1 corresponds to the observation
in that stratum with the largest event or censoring time.

     The idea now is to position a label just to the right of that last
observation on the graph, and at the height of the y-value of the graph at that
time.  That is what is done in the code above.  See the graph itself for the
results.

PROBLEM 27

     The following data represents body weights for two groups of children:

  1    10.25       85
  1    10.50       83
  1    10.75       83
  1    11.00       88
  1    11.25       89
  1    11.50       81
  1    11.75       89
  1    12.00       86
  1    12.25       88
  1    12.50       93
  1    12.75       96
  1    13.00       94
  1    13.25       95
  1    13.50       94
  1    13.75      102
  1    14.00      102
  1    14.25       94
  1    14.50      104
  1    14.75      103
  1    15.00      103
  2    10.25       92
  2    10.50       89
  2    10.75       91
  2    11.00       97
  2    11.25       96
  2    11.50       98
  2    11.75      103
  2    12.00      105
  2    12.25      104
  2    12.50      104
  2    12.75      110
  2    13.00      108
  2    13.25      115
  2    13.50      113
  2    13.75      113
  2    14.00      117
  2    14.25      115
  2    14.50      124
  2    14.75      128
  2    15.00      122

     The first column represents the group.  The second is age of the child.
The third is body weight.

     Write a macro which will produce a scatterplot of the data, with
different symbols plotted for the two groups.  The macro should also plot
the regression lines of weight versus age, separately for the two groups.
The macro should also perform a sequence of tests, based on proc reg or
proc glm or proc iml, for the following null hypotheses:

    H01.  The two regression lines have the same slopes and intercepts.

    H02.  The two regression lines have the same intercepts.

    H03.  The two regression lines may have different intercepts, but
          have the same slopes.

    The format of the call to this macro should be:

    %twolines (dataset, where, groupvar, symb1, symb2,
               xvar, yvar, title1, title2, title3) ;

    Here symb1 and symb2 are macro parameters which specify what symbol will
be plotted for each group.  The parameters title1, title2, and title3 are
macro parameters for a 3-line title for the graph.

/home/walleye/john-c/5421/notes.025    Last update: March 18, 2000