n54703.018

SAS Macros - A Minimal Introduction

A macro is a program which, given certain input, produces predictable
(and presumably useful) output.  The input is variable.  Conceptually, a
macro is not different from a subroutine.  Macros are good for calculations
which are often repeated but with differing input variables.

The advantage of a macro is that you need to write it only once.  You can
then use it over and over again with varying input.

Here is an example.  You want to compute the harmonic mean of a variable
on a datafile.  The harmonic mean of a set of positive numbers is defined
as the inverse of the average of the inverses of the numbers.

A macro which computes this will have two inputs:

   1.  The name of the datafile.
   2.  The name of the variable.

It will also have two outputs:

   1.  The name of the variable that has the value of the harmonic mean
       of the input variable, and
   2.  The name of the output file which contains that variable.

A program using a macro to compute the harmonic mean is given below -

==================================================================================

options linesize = 80 MPRINT ;
footnote "~john-c/5421/harmmean.sas &sysdate &systime" ;

data adataset ;

     input height ;

cards ;
31
37
44
33
33
17
16
10
5
28
;
run ;

proc means data = adataset ;
title 'Mean, std dev of height on adataset' ;
run ;

*----------------------------------------------------------------------;
%macro harmonic(infile, x, xharm, outfile) ;

  data &outfile ;
    retain nxinv 0 sumxinv 0 ;
  start:
    set &infile end = eof ;

    nxinv = nxinv + 1 ;
    sumxinv = sumxinv + 1/&x ;

    if eof eq 1 then goto exit ;
    goto start ;

  exit:

    &xharm = 1/(sumxinv / nxinv) ;

    output ;

  run ;

%mend ;
*--------------------------------------------------------------------------------;

%harmonic(adataset, height, hhmean, hhout) ;

proc print data = hhout ;
title 'Printout of the harmonic mean of numbers on adataset ... 
' ;
run ;

The basic idea behind macros is quite simple: it is just a matter of
substitution.  The *call* to the macro is given by:

     %harmonic(adataset, height, hhmean, hhout) ;

There are four *parameters* inside the parentheses: adataset, height,
hhmean, and hhout.  Correspondingly, the first line of the macro is:

     %macro harmonic(infile, x, xharm, outfile) ;

When you call the macro, the text of the macro is changed by making the
following replacements:

     &infile   -->  adataset
     &x        -->  height
     &xharm    -->  hhmean
     &outfile  -->  hhout

With these substitutions, all of the computations that are inside the
macro are carried out.

Note that the first line of the program,

     options linesize = 80 MPRINT ;

includes the word MPRINT.  This causes SAS to print on the log file the
statements that the macro generates after all the substitutions are made.

Here is what the log file looks like in this case:

=================================================================================

1                                The SAS System
                                                     19:14 Sunday, May 2, 2004

NOTE: Copyright (c) 1989-1996 by SAS Institute Inc., Cary, NC, USA.
NOTE: SAS (r) Proprietary Software Release 6.12  TS020
      Licensed to UNIVERSITY OF MINNESOTA, Site 0001046017.

This message is contained in the SAS news file, and is presented upon
initialization.  Edit the files "news" in the "misc/base" directory to
display site-specific news and information in the program log.
The command line option "-nonews" will prevent this display.

NOTE: AUTOEXEC processing beginning; file is /net/sas612/autoexec.sas.

NOTE: SAS initialization used:
      real time           1.640 seconds
      cpu time            0.056 seconds

NOTE: AUTOEXEC processing completed.

1          options linesize = 80 MPRINT ;
2          footnote "~john-c/5421/harmmean.sas &sysdate &systime" ;
3
4          data adataset ;
5
6               input height ;
7
8          cards ;

NOTE: The data set WORK.ADATASET has 10 observations and 1 variables.
NOTE: DATA statement used:
      real time           0.500 seconds
      cpu time            0.011 seconds

19         ;
20         run ;
21
22         proc means data = adataset ;
23         title 'Mean, std dev of height on adataset' ;
24         run ;

NOTE: The PROCEDURE MEANS printed page 1.
NOTE: PROCEDURE MEANS used:
      real time           0.160 seconds
      cpu time            0.004 seconds

25
26         *----------------------------------------------------------------------;
27         %macro harmonic(infile, x, xharm, outfile) ;
28
29           data &outfile ;
30             retain nxinv 0 sumxinv 0 ;

2                                The SAS System
                                                     19:14 Sunday, May 2, 2004

31           start:
32             set &infile end = eof ;
33
34             nxinv = nxinv + 1 ;
35             sumxinv = sumxinv + 1/&x ;
36
37             if eof eq 1 then goto exit ;
38             goto start ;
39
40           exit:
41
42             &xharm = 1/(sumxinv / nxinv) ;
43
44             output ;
45
46           run ;
47
48         %mend ;
49         *-------------------------------------------------------------------------------
           -;
50
51         %harmonic(adataset, height, hhmean, hhout) ;
MPRINT(HARMONIC):   DATA HHOUT ;
MPRINT(HARMONIC):   RETAIN NXINV 0 SUMXINV 0 ;
MPRINT(HARMONIC):   START: SET ADATASET END = EOF ;
MPRINT(HARMONIC):   NXINV = NXINV + 1 ;
MPRINT(HARMONIC):   SUMXINV = SUMXINV + 1/HEIGHT ;
MPRINT(HARMONIC):   IF EOF EQ 1 THEN GOTO EXIT ;
MPRINT(HARMONIC):   GOTO START ;
MPRINT(HARMONIC):   EXIT: HHMEAN = 1/(SUMXINV / NXINV) ;
MPRINT(HARMONIC):   OUTPUT ;
MPRINT(HARMONIC):   RUN ;

NOTE: The data set WORK.HHOUT has 1 observations and 4 variables.
NOTE: DATA statement used:
      real time           0.000 seconds
      cpu time            0.003 seconds

52
53         proc print data = hhout ;
54         title 'Printout of the harmonic mean of numbers on adataset ... ' ;
55         run ;

NOTE: The PROCEDURE PRINT printed page 2.
NOTE: PROCEDURE PRINT used:
      real time           0.130 seconds
      cpu time            0.005 seconds

NOTE: The SAS System used:
      real time           3.010 seconds
      cpu time            0.090 seconds

NOTE: SAS Institute Inc., SAS Campus Drive, Cary, NC USA 27513-2414

=================================================================================

Here is the output from this program.  The harmonic mean in this case is
16.6762.  The ordinary mean is 25.400.
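
The harmonic mean quoted above can be cross-checked without the macro.
The following is only a minimal sketch and is not part of the original
program.  It assumes the data set adataset created earlier; the names
hcheck, n, suminv, and hmean are made up for illustration.  It relies on
the fact that the inverse of the average of the inverses equals the number
of values divided by the sum of their inverses.

```sas
* Cross-check sketch: harmonic mean = n / (sum of the inverses). ;
* hcheck, n, suminv and hmean are illustrative names only.       ;
data hcheck (keep = n suminv hmean) ;
  set adataset end = eof ;

  n + 1 ;                  * sum statement: retained, starts at 0 ;
  suminv + 1/height ;      * running sum of the inverses          ;

  if eof then do ;         * write one record after the last row  ;
    hmean = n / suminv ;
    output ;
  end ;
run ;

proc print data = hcheck ;
run ;
```

For the ten heights above the sum of the inverses is about 0.59966, so
hmean should print as about 16.6762, in agreement with the macro.  The
printed output of the macro program itself follows.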
=================================================================================

------------------------------------------------------------------------
                  Mean, std dev of height on adataset                  1
                                               19:14 Sunday, May 2, 2004

Analysis Variable : HEIGHT

 N          Mean       Std Dev       Minimum       Maximum
----------------------------------------------------------
10    25.4000000    12.6771886     5.0000000    44.0000000
----------------------------------------------------------

                 ~john-c/5421/harmmean.sas 02MAY04 19:14
------------------------------------------------------------------------
        Printout of the harmonic mean of numbers on adataset ...       2
                                               19:14 Sunday, May 2, 2004

OBS    NXINV    SUMXINV    HEIGHT     HHMEAN

  1      10     0.59966      28      16.6762

                 ~john-c/5421/harmmean.sas 02MAY04 19:14

=================================================================================

The following is another example: a macro for which the input is an array
of 5 numbers, and the output is another array in which the 5 numbers are
sorted in ascending order:

=================================================================================

options linesize = 80 ;

*----------------------------------------------------------------------;
%macro sort5(x, y) ;

  do ix = 1 to 5 ;
    &y(ix) = &x(ix) ;
  end ;

  do ix = 2 to 5 ;

    do jx = 1 to ix - 1 ;

      if &y(jx) gt &y(ix) then do ;

        temp = &y(jx) ;
        &y(jx) = &y(ix) ;
        &y(ix) = temp ;

      end ;

    end ;

  end ;

%mend ;
*--------------------------------------------------------------------------------;

data fivenums ;

  array a(5) a1-a5 ;
  array b(5) b1-b5 ;

  a(1) = 12; a(2) = 9; a(3) = 10; a(4) = 1; a(5) = -11;

  %sort5(a, b) ;

run ;

proc print data = fivenums ;
  var a1 a2 a3 a4 a5 b1 b2 b3 b4 b5 ;

=================================================================================

Output from this program:

                                 The SAS System                                1
                                                   21:02 Tuesday, April 27, 2004

OBS    A1    A2    A3    A4     A5     B1    B2    B3    B4    B5

  1    12     9    10     1    -11    -11     1     9    10    12

=================================================================================

Here are some important rules regarding
macros:

   1.  In your SAS program, the macro must precede the call to the macro.
       The macro should not be located inside a DATA step or a SAS
       procedure.

   2.  The first line of the macro must look like:

            %macro mname(... ) ;

   3.  The last line of the macro must be:

            %mend ;

   4.  You have to be careful about naming variables *within* the macro.
       In the example just given above, ix, jx, and temp were used as
       variables.  If the macro is called from inside a data step, and
       any of these happen to be the names of variables also in that
       data step, the macro will modify their values.  This can have
       strange and unexpected effects on your program.

In general, SAS programmers use macros to avoid having to write the same
section of code over and over again.  Some people make extensive use of
very short macros, while others write and use very long macros that carry
out complex statistical computations and analyses.

A good reference for SAS macros is:

     SAS Macro Language Reference, First Edition.  SAS Institute Inc.,
     Cary, NC, 1997.

The best way to learn something like SAS macros is to imitate, copy,
modify, and expand upon examples by others.  The reference cited above
gives both a number of small examples and a catalog of macro commands and
functions.

=================================================================================

PROBLEM 1:

Write a SAS macro which computes a cardiovascular risk score, based on a
person's age, gender, height, weight, smoking status, systolic blood
pressure, and serum cholesterol.

Input to the macro:  age, gender, height, weight, cigarettes per day, SBP,
                     serum cholesterol.

Output:  risk score.

How the risk score is computed:

     A = .02 * age for men
     A = .015 * age for women.

     B = +.01 if BMI < 18.5
     B = -.01 if 18.5 <= BMI < 30
     B = +.02 if 30 <= BMI

         (BMI, the body mass index, is weight in kilograms divided by the
          square of height in meters.)

     C = 0 if nonsmoker
     C = .01 if smokes 1-20 cigs per day
     C = .02 if smokes 21 or more cigs per day

     D = .0010 * SBP for men
     D = .0007 * SBP for women

     E = .0006 * Serum Cholesterol

     Risk score = 1 / (1 + exp(4.0 - A - B - C - D - E)).

Write the macro and test it with some realistic values for the input
parameters (10 examples).

=================================================================================

Last update: April 26, 2005.