SAS MACROS                                               SPH 5421 notes.022

     For this segment of the course, a fairly good reference is:

     SAS Macro Language, First Edition (1997).  SAS Institute, Inc.,
     Cary, NC.

     Another reference that you may find useful is:

     A. Carpenter: Carpenter's Complete Guide to the SAS Macro Language (1998),
     SAS Institute, Inc., Cary NC.

     Neither of these books tells everything you might want to know, especially with
respect to statistical applications.  As with most of SAS, the best way to learn the
topic is by studying and imitating examples.


SAS Macro Variables

     Most SAS variables are defined within DATA steps or procedures.  They are not
defined 'globally' for the whole program.  Macro variables are character strings
which, once defined, can be used in any subsequent part of the program.  They can be
defined outside of any data step (with a %let statement) or inside a data step (with
the SYMPUT routine, described later in these notes).  Consider the following example:

================================================================================

options linesize = 80 ;
footnote "program: /home/walleye/john-c/5421/macro1.sas &sysdate &systime" ;

%let author = Enola Malone ;
%let dataset = atest ;
%let varlist = x y z ;

data atest ;
     input x y z w ;

     cards ;
     1  3  5  10
     2  4  7  15
     3  5  9  20
     4  6 10  25
     5  7 11  30
     ;
 run ;

 proc print data = &dataset ;
      var &varlist ;
 title1 "Listing of &dataset for &author" ;

 proc means n mean stddev min max data = &dataset ;
      var &varlist ;
 title1 "Basic descriptive stats of &dataset for &author" ;
========================================================================


                       Listing of atest for Enola Malone                       1
                                                 17:14 Sunday, February 27, 2000

                              OBS    X    Y     Z

                               1     1    3     5
                               2     2    4     7
                               3     3    5     9
                               4     4    6    10
                               5     5    7    11


          program: /home/walleye/john-c/5421/macro1.sas 27FEB00 17:14

                Basic descriptive stats of atest for Enola Malone               2
                                                 17:14 Sunday, February 27, 2000

      Variable  N          Mean       Std Dev       Minimum       Maximum
      -------------------------------------------------------------------
      X         5     3.0000000     1.5811388     1.0000000     5.0000000
      Y         5     5.0000000     1.5811388     3.0000000     7.0000000
      Z         5     8.4000000     2.4083189     5.0000000    11.0000000
      -------------------------------------------------------------------
 
          program: /home/walleye/john-c/5421/macro1.sas 27FEB00 17:14

========================================================================

    There are several things to note in this example:

    1.  Three macro variables are defined: author, dataset, and
        varlist.  These are all character or string variables.

    2.  The three macro variables are all defined with %let
        statements.

    3.  Note that in the definitions, the macro variables are not
        preceded by & signs.  However, they ARE preceded by & signs
        when they are referenced later in the program.

    4.  Note that the %let statements occur outside of any data step.

    5.  Note that the macro variable  varlist  actually includes 3 variables
        from the dataset  atest:  x, y, and z.  In fact the actual content
        of this macro variable is the string 'x y z'.  The way macro variables
        behave is to substitute the string into SAS code, and then execute
        the SAS code as if it were written that way in the first place.

    6.  Note that when macro variables are used in a title, the title
        must be enclosed in double quotes.  Macro variable references
        inside single quotes are not resolved.

    7.  Note that two other macro variables are used in this program
        also: in the footnote line, &sysdate and &systime are
        referenced.  These are automatic macro variables, defined by
        SAS itself, which hold the date and time at which the SAS job
        or session started.
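
     Points 5 and 6 can be checked directly with a few lines of code.  The
following is only a minimal sketch; it reuses the macro variables from macro1.sas,
and the %put statement simply writes the resolved text to the SAS log.

========================================================================

%let author  = Enola Malone ;
%let varlist = x y z ;

/* Write the resolved value of a macro variable to the log.       */
/* The log shows:  The value of varlist is: x y z                 */
%put The value of varlist is: &varlist ;

/* Single quotes: the reference is NOT resolved.                  */
/* The title reads:  Listing for &author                          */
title1 'Listing for &author' ;

/* Double quotes: the reference IS resolved.                      */
/* The title reads:  Listing for Enola Malone                     */
title1 "Listing for &author" ;

/* Optional: ask SAS to report each resolution in the log.        */
options symbolgen ;

========================================================================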


    The SAS Macros manual, Chapter 3 (pages 21-32) is a good reference for
macro variables.  There are a number of automatic macro variables in addition to
&sysdate and &systime: see pages 22-23 for a list.  For example, &sysday gives
the day of the week on which the SAS job started.  It is unlikely you will want
to use many of these, but some are useful.
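
     One way to see what the automatic macro variables contain on your own
system is to write them to the log.  This is a small sketch, not part of
macro1.sas; the values shown in your log will depend on when and where the job
is run.

========================================================================

/* Write three automatic macro variables to the log. */
%put This job started on &sysday, &sysdate at &systime ;

/* List every automatic macro variable and its current value. */
%put _automatic_ ;

========================================================================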

     See also an example of the use of macro variables on pages 24.3-24.4 of the
notes (the program on graphing the corrected approximation to the binomial).

    Most of the time you will be using macro variables that you have created. In the
example below, the program reads an external datafile with the assumption that the
number of records in the datafile is not known in advance.  When the end of the
datafile is encountered, a macro variable is created which stores a limit computed
from the number of records read.  The next procedure is a proc print which makes use
of the macro variable to print either the first 5% of the observations or the first
20 observations, whichever is larger.

==================================================================================

 OPTIONS  LINESIZE = 80 ;

 footnote "program: /home/walleye/john-c/macro2.sas &sysdate &systime" ;

 DATA lhs ;
      infile '/home/walleye/john-c/5421/lhs.data' EOF = ENDOFILE ;
      RETAIN OBSCOUNT 0 ;

      INPUT CASENUM  AGE GENDER BASECIGS GROUP RANDDATE DEADDATE DEADCODE
            BODYMASS F31MSTAT
            VPCQUIT1 VPCQUIT2 VPCQUIT3  VPCQUIT4 VPCQUIT5
            CIGSA0   CIGSA1   CIGSA2    CIGSA3   CIGSA4   CIGSA5
            S1MFEV   S2FEVPRE  A1FEVPRE  A2FEVPRE A3FEVPRE A4FEVPRE A5FEVPRE
                     S2FEVPOS  A1FEVPOS  A2FEVPOS A3FEVPOS A4FEVPOS A5FEVPOS
                     WEIGHT0   WEIGHT1   WEIGHT2  WEIGHT3  WEIGHT4  WEIGHT5 ;


            OBSCOUNT = OBSCOUNT + 1 ;

            RETURN ;


 ENDOFILE:

 MOBS0520 = MAX(.05 * OBSCOUNT, 20) ;
 CALL SYMPUT('OBSLIM', TRIM(LEFT(MOBS0520))) ;

 RUN ;

*======================================================================;

 PROC PRINT ;
      WHERE OBSCOUNT LE &OBSLIM ;
      VAR   OBSCOUNT AGE GENDER BASECIGS DEADCODE BODYMASS ;

 TITLE1 'USE OF MACRO VARIABLES:' ;
 TITLE2
 'TEST OF A PROGRAM WHICH PRINTS EITHER THE FIRST 20 OBSERVATIONS' ;
 TITLE3
 'ON THE FILE, OR THE FIRST 5% OF THE OBSERVATIONS, WHICHEVER IS LARGER';
 TITLE4 "THE LIMIT OF OBSERVATIONS TO PRINT HERE IS: &OBSLIM" ;
 ENDSAS ;

==================================================================================

                            USE OF MACRO VARIABLES:                            1
        TEST OF A PROGRAM WHICH PRINTS EITHER THE FIRST 20 OBSERVATIONS
     ON THE FILE, OR THE FIRST 5% OF THE OBSERVATIONS, WHICHEVER IS LARGER
                 THE LIMIT OF OBSERVATIONS TO PRINT HERE IS: 25
                                                 17:17 Monday, February 28, 2000

      OBS    OBSCOUNT    AGE    GENDER    BASECIGS    DEADCODE    BODYMASS

        1        1        51       0         20           .         26.7  
        2        2        45       0         60           .         25.3  
        3        3        44       0         40           .         31.8  
        4        4        54       0         40           1         21.7  
        5        5        47       0         35           .         29.0  
        6        6        55       0         40           .         20.7  
        7        7        53       0         30           .         28.9  
        8        8        54       0         40           .         26.5  
        9        9        46       1         40           .         22.1  
       10       10        47       0         35           .         30.5  
       11       11        54       0         30           .         29.9  
       12       12        59       0         30           .         20.0  
       13       13        54       0         30           .         25.2  
       14       14        50       0         40           .         23.9  
       15       15        54       0         40           .         29.5  
       16       16        50       0         30           .         28.1  
       17       17        58       0         35           .         23.4  
       18       18        53       0         30           .         21.8  
       19       19        54       0         40           .         22.2  
       20       20        59       0         30           .         26.7  
       21       21        50       0         20           .         29.4  
       22       22        44       0         45           .         25.2  
       23       23        56       0         20           .         25.1  
       24       24        56       1         20           .         33.4  
       25       25        57       1         30           .         33.9  
 
 
             program: /home/walleye/john-c/macro2.sas 28FEB00 17:17
==================================================================================

     There are some points you might want to note in this program:

     1.  When the end of file is encountered, datastep execution is transferred
         to the label 'ENDOFILE'.  The section of code under ENDOFILE is
         not executed until the end of file is reached, because of the
         'RETURN' statement (which sends program execution back up to the top of
         the datastep for all observations which precede end of file).

     2.  The variable OBSCOUNT is initialized to 0 in a RETAIN statement.
         It is then incremented by 1 for each observation that is read.

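     3.  The SYMPUT subroutine is called to put the appropriate value in
         the macro variable OBSLIM.  Its first argument is the name of the
         macro variable and its second argument is the character value to
         store.  TRIM(LEFT( )) removes the leading blanks that appear when
         the numeric value MOBS0520 is converted to a character string.
         SYMPUT makes it possible to create macro variables which are data
         dependent and which can be created in a datastep and then used
         later in the program in other datasteps or in procedures.  In this
         case it is used in the procedure PROC PRINT.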
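
     The same idea can be written without a statement label.  The sketch below
is an alternative technique, not the method used in macro2.sas: it uses the END=
option on a SET statement to detect the last observation, and the automatic
variable _N_ as the observation counter.  The names  lastobs  and  nobs  are
chosen here for illustration only, and the small data set  atest  from
macro1.sas is used so that the example is self-contained.

========================================================================

data atest ;
     input x y z w ;

     cards ;
     1  3  5  10
     2  4  7  15
     3  5  9  20
     4  6 10  25
     5  7 11  30
     ;
run ;

data _null_ ;
     /* lastobs is set to 1 when the last observation is read. */
     set atest end = lastobs ;

     /* _N_ counts iterations of the data step, so on the last */
     /* observation it equals the number of observations.      */
     if lastobs then
        call symput('nobs', trim(left(put(_n_, 8.)))) ;
run ;

/* The log shows:  The data set atest has 5 observations */
%put The data set atest has &nobs observations ;

========================================================================

     Converting the number explicitly with PUT( ) before calling SYMPUT avoids
the automatic numeric-to-character conversion.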

     SYMPUT is a very useful routine to know about.  There is an example using
it on page 27 of the SAS Macro Manual, and extensive documentation on pages
226-229.  There are some important limitations to using it.  One is that a
macro variable created using SYMPUT will not be available for use until AFTER
completion of the datastep in which it was created.
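
     The timing restriction can be seen in a short sketch.  A reference such as
&limit is resolved when the data step is compiled, which is before CALL SYMPUT
has executed, so it cannot pick up a value assigned in the same step.  The
SYMGET function, which retrieves a macro variable while the step is running,
can.  The names  limit  and  samestep  are hypothetical.

========================================================================

data _null_ ;
     /* Create the macro variable while the step is executing. */
     call symput('limit', '25') ;

     /* SYMGET looks up the value at execution time, so it     */
     /* works inside the same data step.                       */
     samestep = symget('limit') ;

     /* The log shows:  samestep=25 */
     put samestep= ;
run ;

/* After the step has finished, an ordinary reference works.   */
/* The log shows:  After the step, limit resolves to 25        */
%put After the step, limit resolves to &limit ;

========================================================================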

     You might wonder why, instead of using SYMPUT, one might not instead write
the following:

     %LET OBSLIM = MOBS0520 ;

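     If you try this, you will find that OBSLIM will be a macro variable
which equals the string 'MOBS0520'.  The %LET statement is handled by the macro
processor before the data step executes, so it never sees the data; it simply
stores the text to the right of the equals sign.  This is not what you want in
the PROC PRINT step that follows.  You want the VALUE of MOBS0520, not the name
of the variable.  This is a subtle and confusing point about macro variables.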
A usual use of macro variables, as in the program  macro1.sas, is to simply
provide a way of substituting in variable names, not the values of the variables.
That is why SYMPUT is so useful: it makes it possible to substitute the VALUES
of variables in the places where you want them rather than just variable names.
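
     The difference is easy to see side by side.  In this sketch the value 25
is assigned directly to MOBS0520 for illustration, and the macro variable names
vialet  and  viasymput  are hypothetical.

========================================================================

data _null_ ;
     mobs0520 = 25 ;

     /* SYMPUT stores the VALUE of the data step variable. */
     call symput('viasymput', trim(left(put(mobs0520, 8.)))) ;
run ;

/* %LET stores the TEXT to the right of the equals sign. */
%let vialet = mobs0520 ;

/* The log shows:  vialet = mobs0520 */
%put vialet = &vialet ;

/* The log shows:  viasymput = 25 */
%put viasymput = &viasymput ;

========================================================================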

     Problem 22 below asks that you use computed macro variables in
conjunction with the PUT statement.  You should look at the SAS Language
manual for a description of, and examples of, PUT statements.
Here is a short example that uses PUT statements to write text to a
file (but without macro variables).  Note that the name of the output
file is 'outstuff':

========================================================================

options linesize = 80 ;

footnote "prog: /home/walleye/john-c/5421/putexamp &sysdate &systime" ;

data putexamp ;

     file 'outstuff' ;

     put "This is the first line of the file 'outstuff'. " ;
     put " " ;
     put " Today is Thursday, March 2." ;
     put " The sun is shining." ;
     put " It is about 40 degrees Fahrenheit" ;

     x = 13 / 3 ;
     put " " ;
     put " If you divide 13 by 3, you get: "  x  ;

     put " " ;
     put " This is the end of the output file." ;

run ;

endsas ;

========================================================================

This is the first line of the file 'outstuff'. 
 
 Today is Thursday, March 2.
 The sun is shining.
 It is about 40 degrees Fahrenheit
 
 If you divide 13 by 3, you get: 4.3333333333
 
 This is the end of the output file.

========================================================================
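
     Macro variables can be used inside the quoted text of a PUT statement in
the same way as in a title, provided the text is in double quotes.  The
following is a minimal sketch using the small data set  atest  from macro1.sas
rather than the LHS data.  The data set name  xstats,  the macro variable names
xmean  and  xsd,  and the output file name 'outstuff2' are all chosen here for
illustration.

========================================================================

data atest ;
     input x y z w ;

     cards ;
     1  3  5  10
     2  4  7  15
     3  5  9  20
     4  6 10  25
     5  7 11  30
     ;
run ;

/* Save the mean and standard deviation of x in a data set. */
proc means data = atest noprint ;
     var x ;
     output out = xstats mean = xmean std = xsd ;
run ;

/* Copy the statistics into macro variables, formatted as text. */
data _null_ ;
     set xstats ;
     call symput('xmean', trim(left(put(xmean, 8.1)))) ;
     call symput('xsd',   trim(left(put(xsd,   8.2)))) ;
run ;

/* Use the macro variables in a PUT statement in a later step. */
data _null_ ;
     file 'outstuff2' ;
     put "The mean of x was &xmean (+/- &xsd)." ;
run ;

========================================================================

The mean of x was 3.0 (+/- 1.58).

========================================================================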

PROBLEM 22

Both of the following problems will require using an output data set from
PROC MEANS or PROC SUMMARY.

1.  Write a program that reads the Lung Health Study data file,
    computes (using PROC MEANS or PROC SUMMARY) the mean values and standard
    deviations of AGE, GENDER, BODYMASS, and S2FEVPOS, and stores these
    statistics as macro variables.  Then in a later data step, construct PUT
    statements using these macro variables which create a paragraph of text that
    looks like the following:

    The mean age of LHS participants was  48.6  years (+/- 6.7);
    37%  were women.  The mean Body Mass Index was 25.5 kg/m2
    (+/- 2.5), and the mean Screen 2 post-BD FEV1 was
    2.88 (+/- 0.80).

2.  Write a program using the PUT statement and macro variables which will
    produce a table based on the LHS data with the following format:

    Date:  03-06-00

                Means and Standard Deviations of Selected Variables,
                      for LHS Participants, by Gender

                                    Men                      Women
                           --------------------     ------------------------
    Variable                N    Mean   Std Dev      N      Mean    Std Dev
    --------------------   ---  ------  -------     ---    ------   --------
    Age, yrs.              xxx   xx.x     xx.x      xxx     xx.x      xx.x
    Body Mass Index        xxx   xx.x     xx.x      xxx     xx.x      xx.x
    Cigs/Day at Baseline   xxx   xx.x     xx.x      xxx     xx.x      xx.x
    Cigs/Day at Year 1     xxx   xx.x     xx.x      xxx     xx.x      xx.x
    Cigs/Day at Year 2     xxx   xx.x     xx.x      xxx     xx.x      xx.x
    Cigs/Day at Year 3     xxx   xx.x     xx.x      xxx     xx.x      xx.x
    Cigs/Day at Year 4     xxx   xx.x     xx.x      xxx     xx.x      xx.x
    Cigs/Day at Year 5     xxx   xx.x     xx.x      xxx     xx.x      xx.x
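
    One way to line up the columns of such a table is to use column pointers
    (@n) and formats in the PUT statement.  The lines below are only a sketch
    of that technique, with illustrative values assigned in the data step
    rather than taken from macro variables.  Since no FILE statement is
    given, the output goes to the SAS log.

========================================================================

data _null_ ;
     n = 5 ;  mean = 3 ;  sd = 1.5811388 ;

     /* @n moves the pointer to column n before writing. */
     put @5 'Variable' @28 'N' @33 'Mean' @40 'Std Dev' ;
     put @5 '--------' @26 '---' @32 '-----' @40 '-------' ;

     /* A format after a variable name fixes width and decimals. */
     put @5 'x' @26 n 3. @32 mean 5.1 @40 sd 7.2 ;
run ;

========================================================================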


==================================================================================

/home/walleye/john-c/5421/notes.022    Last update: March 14, 2000