A System of SAS Macros for Producing Statistical Reports

 

Greg Grandits, M.S.

Ken Svendsen, M.S.

Division of Biostatistics, University of Minnesota

 

 

Abstract

 

Monitoring clinical trials requires periodic generation of statistical reports for Data and Safety Monitoring Board (DSMB) reviews and other purposes.  These reports display comparisons between treatment groups for specific outcomes and usually consist of summary statistics for each treatment group and an assessment of statistical significance.  The summary statistics can be simple descriptive summaries (counts, means, SDs, etc.) or summaries from more complicated statistical analyses (e.g., hazard ratios and confidence intervals from proportional hazards regression), and reports often include both.  Existing SAS report procedures are adequate for a report with simple summary statistics but cannot produce summaries from more complicated analyses or display significance levels.  However, with procedure output datasets, the Output Delivery System (ODS), and the data step, SAS has the necessary tools to produce such a report.  This paper describes a system of macros that uses these tools to make statistical reports easy to generate.  With these macros, information from several SAS procedures can be placed on a single report page.  These macros have been used extensively by the Division of Biostatistics to produce DSMB reports for dozens of ongoing clinical trials and observational studies.

 

 

Introduction

 

In clinical trials, periodic reports are generated to monitor study progress and to compare treatment groups on relevant variables of interest.  Often these reports are presented to Data and Safety Monitoring Boards (DSMBs).  The reports vary, but they usually contain summary statistics (counts, means, SDs, etc.) and often also contain other statistical information such as p-values, Z-statistics, hazard ratios, and confidence intervals.  It is therefore important for the statistical coordinating center responsible for these reports to produce them with methods that do not require transcription of numbers, typing, or editing of computer output.  SAS reporting procedures, in general, are inadequate to meet these needs.  However, the capabilities of the data step, the ability of SAS procedures to output datasets, and the advent of the Output Delivery System (ODS) provide the tools from which macros can be developed to accomplish this task.  This paper describes a system of macros that produce customized statistical reports that are easy to program and modify, and that give complete flexibility in placing text and data values on the report page.

 

The user first defines columns across the report page.  Text or data values (summary statistics) are then moved to these columns on specified lines using the macros MOVE and NMOVE.  Summary statistics are obtained by calling a macro that runs a procedure (GLM, PHREG, etc.), outputs statistics to a SAS dataset, and then compresses the statistics into a one-observation dataset.  The statistics are stored in array-style variable names (for example, p1-p8), which can be moved to the report page after a SET statement.  Example: %nmove(p1-p8, col=7, line=12L8) moves 8 p-values to the 7th defined column, on lines 12 through 19.

 

The macros for moving text and data values to the report page were described in an earlier SUGI paper (1).  They are briefly outlined here, followed by a description of the statistical macros that make information from several procedures available for placement on the report page with the earlier macros.  This system of macros provides a comprehensive package for generating statistical reports for a variety of research applications.

 

 

STEPS IN WRITING A REPORT

 

Report programs are made up of the following statements:

 

  1. %REPORT statement that indicates a new report is starting.

  2. %COLSET statement that defines columns and column widths across the report page.

  3. %MOVE statements that move text to the report.  Features include centering, underlining, and repeating text.

  4. SET statement(s) that read in statistical information from a SAS dataset.  These one-observation datasets will have been generated by one of the statistic generating macros.

  5. %NMOVE statements that place the statistical information on the report page.

 

 

SYNTAX FOR REPORT MACROS

 

1.      %REPORT is used simply as %REPORT which indicates the start of a new report.

 

2.      %COLSET is used as follows:

 

%COLSET (column1 size  column2 size … )

 

                        Example: %COLSET (25 20 2x 10 2x 10)

 

This statement sets up 4 columns.  The first column is 25 positions wide, the second is 20 positions wide, and the last 2 columns are 10 positions wide.  Two spaces are placed between columns 2 and 3 and between columns 3 and 4.  The spacing is used to set off text from other text.
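
To make the column definition concrete, the following data step is a minimal sketch of how a specification such as the one in the example could be turned into starting print positions.  It is not the source of %COLSET.  It assumes that columns are laid end to end starting at position 1 and that a token ending in x adds blank positions; the variable names (spec, tok, pos, ncol, width) are hypothetical.

```sas
/* Illustration only: NOT the %COLSET source.                      */
/* Assumption: columns abut unless an "nx" spacer token intervenes. */
data _null_;
  length spec $ 60 tok $ 8;
  spec = '25 20 2x 10 2x 10';      /* column spec from the example  */
  pos  = 1;                        /* next free print position      */
  ncol = 0;                        /* number of columns defined     */
  do i = 1 to countw(spec, ' ');
    tok = scan(spec, i, ' ');
    if upcase(substr(tok, length(tok), 1)) = 'X' then
      /* spacer token such as 2x: skip that many positions */
      pos = pos + input(substr(tok, 1, length(tok) - 1), 8.);
    else do;
      /* width token: define a column at the current position */
      ncol  = ncol + 1;
      width = input(tok, 8.);
      put ncol= pos= width=;       /* written to the SAS log */
      pos = pos + width;
    end;
  end;
run;
```

Under these assumptions the four columns would begin at positions 1, 26, 48, and 60.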

 

3.      %MOVE is used as follows:

 

%MOVE ('string 1':'string 2':…, line=, col=, center=, under=)

 

This is best illustrated by examples.

 

'Men':'Women':'Total'      text strings to be placed on report

line = 12 21 33            moves strings to lines 12, 21, and 33

line = 12L3                moves strings to lines 12, 13, and 14

col = 3 4 8                moves strings to defined columns 3, 4, and 8

col = 2-3 4-5 6-7          moves strings to columns formed by combining columns 2-3, 4-5, and 6-7

col = 2.4                  moves strings to columns 2 through 4

 

Example: %MOVE ('Men':'Women':'Total', col=1, line=10L3)

 

 

4.      %NMOVE is used as follows:

 

%NMOVE (var1-var(n), line=, col=, fmt=);

 

The line and col parameters are identical to those in %MOVE.  The fmt parameter formats the values.

 

Example: %NMOVE (m1-m20, col = 2 3, line=12L10, fmt=6.2)

 

This statement would move the values of m1 through m20 to columns 2 and 3, and from lines 12 through 21.
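
The following is a rough sketch of the kind of data-step code that a numeric move like the one above could correspond to.  It is not the %NMOVE source.  It assumes that defined columns 2 and 3 begin at print positions 26 and 48, that values fill column 2 from top to bottom before column 3, and that the dataset name stats and its contents are hypothetical.

```sas
/* Hypothetical one-observation dataset holding m1-m20 */
data stats;
  array m{20} m1-m20;
  do i = 1 to 20;
    m{i} = i * 1.5;                /* made-up values for illustration */
  end;
  drop i;
run;

/* Illustration only: NOT the %NMOVE source */
data _null_;
  file print n=ps;                 /* N=PS makes every line of the page addressable */
  set stats;
  array m{20} m1-m20;
  array start{2} _temporary_ (26 48);   /* assumed start positions of columns 2 and 3 */
  do i = 1 to 20;
    c  = ceil(i / 10);             /* assumed fill order: first 10 values in column 2 */
    ln = 12 + mod(i - 1, 10);      /* lines 12 through 21 */
    cp = start{c};
    put #ln @cp m{i} 6.2;          /* line pointer, column pointer, value, format */
  end;
run;
```

The line pointer (#) and column pointer (@) of the PUT statement are what allow a value to be placed anywhere on the page.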

 

STATISTIC GENERATING MACROS

 

Below is a listing of some of the statistical generating macros with the SAS procedure that is called, the statistics that are available, and a brief description of the macro.

 

MACRO    PROCEDURE  STATISTICS                             DESCRIPTION

Breakdn  Summary    N, mean, SD, etc.                      Summary statistics by level of class variable

Freqdis  Summary    Counts, percents, cumulative percents  Distribution of variable by level of another variable

Glmp     Glm        ANOVA p-values                         Statistics from analysis of variance

Regp     Reg        p-values, betas, t-stat, etc.          Statistics from linear regression

Phregp   Phreg      Betas, SEs, RRs, CIs, p-values, etc.   Statistics from Cox regression

Logistp  Logist     Betas, SEs, ORs, CIs, p-values, etc.   Statistics from logistic regression

Chisqp   Freq       CMH p-values                           Stratified contingency table analyses

Several others are also available, and others can be added as needed.

 

 

For illustration, two of the macros, %BREAKDN and %PHREGP,  are described in more detail.  Other macros have similar syntax.

 

 

%BREAKDN ( data=, class=, var=, out=, sfirst= );

 

This macro reads the SAS dataset specified in DATA using PROC SUMMARY and computes summary statistics for each variable specified in VAR by each level of the variable specified in CLASS.  A one observation dataset containing these statistics is written to the dataset specified in OUT.

 

The statistics calculated are N, MEAN, MEDIAN, SDEV, SE, SUM, MIN, and MAX.  They are contained in the variables N1-N?, M1-M?, MED1-MED?, S1-S?,  SE1-SE?, SUM1-SUM?, MIN1-MIN?, and MAX1-MAX?, where ? depends on the number of variables in VAR and the number of levels in the variables in CLASS.
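
As an illustration of the general idea (not the %BREAKDN source), the sketch below computes means by class level with PROC SUMMARY and then flattens the result into a one-observation dataset with array-style names.  The dataset names demo, bylevel, and onerow, the variables mage and mdbp, and the data values are all hypothetical.

```sas
/* Hypothetical input data: a two-level class variable and two analysis variables */
data demo;
  input sex age dbp;
  datalines;
1 50 80
1 60 90
2 40 70
2 45 85
;
run;

/* Step 1: one row of statistics per class level */
proc summary data=demo nway;
  class sex;
  var age dbp;
  output out=bylevel mean=mage mdbp;
run;

/* Step 2: flatten to one observation; variables vary fastest within level */
data onerow;
  array m{4} m1-m4;
  retain m1-m4;                    /* keep values across the rows of bylevel */
  set bylevel end=last;
  m{(_n_ - 1) * 2 + 1} = mage;     /* mean of age for this level */
  m{(_n_ - 1) * 2 + 2} = mdbp;     /* mean of dbp for this level */
  if last then output;             /* write the single observation */
  keep m1-m4;
run;
```

Here m1-m2 hold the means for sex = 1 and m3-m4 the means for sex = 2, so a later SET onerow makes all four values available at once.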

 

 

The parameters specified are illustrated by examples.

 

Parameter/Value                 Description

 

class = sex 2T                       statistics are stored for both levels of the variable SEX and the total

 

class = sex 2 group 6           statistics are stored for each level of SEX crossed with GROUP.

 

var = age dbp sbp chol         list of variables for which statistics are computed

 

 

DATA is the SAS dataset to be read.  OUT is the one-observation SAS dataset to which the statistics are written.  SFIRST indicates the order in which the statistics are stored.

 

Example:

 

%BREAKDN (class = sex 2T, var = age dbp sbp chol, out = table1, sfirst=VAR)

 

The n's for the 4 variables where sex = 1 are stored in n1-n4.

The n's for the 4 variables where sex = 2 are stored in n5-n8.

The n's for the 4 variables for men and women combined are stored in n9-n12.

 

The variables are similarly stored for the other statistics.
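
The storage order in this example can be summarized by an index rule: with 4 variables, the statistic for variable v (1 to 4) at level k (1 = sex 1, 2 = sex 2, 3 = total) is in position (k - 1) * 4 + v.  The short data step below simply prints that mapping to the log; it is an illustration inferred from the example above, not part of the macros, and the array names vname and lname are hypothetical.

```sas
/* Illustration only: print which n-variable holds which statistic */
data _null_;
  array vname{4} $ 8 _temporary_ ('age' 'dbp' 'sbp' 'chol');
  array lname{3} $ 8 _temporary_ ('sex=1' 'sex=2' 'total');
  length line $ 60;
  do lev = 1 to 3;
    do v = 1 to 4;
      idx  = (lev - 1) * 4 + v;    /* position within n1-n12 */
      line = catx(' ', cats('n', idx), 'holds the N for',
                  vname{v}, 'where', lname{lev});
      put line;
    end;
  end;
run;
```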

 

%PHREGP ( parameters)

 

PHREGP runs PROC PHREG to perform proportional hazards regression and saves results for factors of interest into SAS datasets.

 

Parameter                 Description

 

data =                         SAS dataset to be read

 

dlist =                          Dependent variable list.  An analysis is done for each variable listed.  The variables are event indicators coded as 1 if event, 0 if censored.

 

ilist =                           Independent variable list.  Used for each dependent variable given in dlist.

 

tlist =                           Failure or censoring time list corresponding to events in dlist.

 

factor =                       Independent variable(s) for which statistics are obtained.

 

units =                         Value regression coefficients are multiplied by before relative risks are computed.

 

strata =                       Optional list of strata variables.

 

out =                            SAS dataset(s) to which statistics are written.

 

The statistics and the names of the variables that contain them are as follows:

 

e1-e?                          regression coefficients for factor

se1-se?                      standard errors of coefficients

z1-z?                           z-statistics for factor

p1-p?                          p-values for factor

rr1-rr?                         relative risks for factor

u1-u?                          upper 95% CI for RR

l1-l?                             lower 95% CI for RR
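
For a single outcome, the statistics listed above could be obtained along the following lines.  This is a hedged sketch, not the %PHREGP source: it assumes the parameter estimates are captured with ODS OUTPUT, that the 95% limits use the normal quantile 1.96, and that units is applied as a multiplier of the coefficient before exponentiating.  The simulated dataset and the dataset names pe and onefit are hypothetical.

```sas
/* Hypothetical simulated data so the sketch is self-contained */
data mort;
  call streaminit(1);
  do id = 1 to 200;
    trt    = mod(id, 2);
    clinic = mod(id, 4) + 1;
    t      = rand('exponential') * (1 + 0.3 * trt);   /* follow-up time */
    xcvd   = (rand('uniform') < 0.7);                 /* 1 = event, 0 = censored */
    output;
  end;
run;

/* Capture the parameter estimates table from PROC PHREG */
ods output ParameterEstimates=pe;
proc phreg data=mort;
  model t*xcvd(0) = trt;
  strata clinic;
run;

/* Keep the factor of interest and derive the reported statistics */
data onefit;
  set pe;
  where upcase(Parameter) = 'TRT';
  units = 1;                                  /* assumed multiplier */
  e1  = Estimate;                             /* regression coefficient */
  se1 = StdErr;                               /* standard error */
  z1  = e1 / se1;                             /* z-statistic */
  p1  = ProbChiSq;                            /* Wald p-value */
  rr1 = exp(units * e1);                      /* relative risk */
  l1  = exp(units * (e1 - 1.96 * se1));       /* lower 95% limit */
  u1  = exp(units * (e1 + 1.96 * se1));       /* upper 95% limit */
  keep e1 se1 z1 p1 rr1 l1 u1;
run;
```

Repeating this for each variable in dlist and combining the results side by side would give the e1-e?, se1-se?, and related variables in a single observation.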

 


 

Discussion

 

 

Much effort has been made by SAS and SAS users to make reporting easier.  Although no individual SAS product or procedure is sufficient to provide the ease and flexibility needed to produce statistical reports, the system of macros described here, which takes advantage of the data step, output from procedures, and the ODS, provides a very useful reporting system.  The statistical report macros are simple to use and very flexible.  Programs are easy to write, understand, and modify.  Typical report programs are less than one page.  These macros can also be expanded to include statistics from other procedures through use of ODS.

 

The key to making the numeric moves in the data step is getting the statistics into one-observation datasets.  Then a single SET statement is all that is needed to make the statistics available, without concern for the implied loop in the data step.  Information from several different sources can easily be included on the report with multiple SET/NMOVE statements.  This gives the system its flexibility.  These macros have proved invaluable to the clinical trials and other projects monitored by the Division of Biostatistics at the University of Minnesota, and could be useful for any research organization or pharmaceutical company producing statistical reports for clinical trials.
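
The point about the implied loop can be seen in a minimal sketch with two hypothetical one-observation datasets, a and b.  Because each dataset has exactly one observation, the body of the data step executes once: on the second pass the first SET statement reaches end of file and the step stops.

```sas
/* Two hypothetical one-observation datasets */
data a;
  x1 = 1;
  x2 = 2;
run;

data b;
  p1 = 0.05;
run;

/* Both sets of values are available in a single pass of the data step */
data _null_;
  set a;
  put x1= x2=;     /* values from the first source  */
  set b;
  put p1=;         /* values from the second source */
run;
```

With multi-observation datasets the same step would run once per observation, which is why the statistic generating macros compress their results first.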

 

 

Contact Information

 

Greg Grandits

Division of Biostatistics

2221 University Ave. SE

Suite 200

Minneapolis, MN 55414

Email: grand001@umn.edu

Phone: 612-626-9033

Fax: 612-624-3584

 


Example Program

 

 

* Assume dataset mort contains all needed variables ;

 

%let phlist = xcvd xchd xami xochd xcd xhhd xoh xcv xcvsub xcvint xcvoth xothcvd ;

 

%breakdn(data=mort,class=group 2,var = &phlist,out=m);

 

%phregp(data=mort, dlist=&phlist, ilist=trt,

        tlist = t t t t t t t t t t t t ,

        factor= trt, strata=clinic, out=rr) ;

 

%report;

%colset (32 9 9 2x 9 9 9 2x 9);

 

%move('Cause of Death By Treatment Group in Study X', col=1-0,line=3);

%move('Cause of Death', col=1, center=n, line=7, u=y);

%move('All cardiovascular':'  CHD':'    Acute MI':'    Other CHD':

      '  Cardiac dysrhythmias':

      '  Hypertensive heart disease':'  Other hypertensive':

      '  Cerebrovascular':'    Subarachnoid hemorrhage':

      '    Intracerebral hemorrhage':

      '    Other cerebrovascular':'  Other cardiovascular':

      col=1, center=n, line=9L12 );

%move('Events in Group', col=2-3, line=6);

%move('Hazard',col=4,line=6) ;

%move('A':'B':'Ratio':'95% LB':'95% UB':'P-value',

      col=2.0, line=7, u=y);

 

set m;

%nmove(sum1-sum24, col=2 3, line=9L12, fmt=5.0);

 

set rr ;

%nmove(rr1-rr12,col=4,line=9L12,fmt=5.2) ;

%nmove(l1-l12,col=5,fmt=5.2) ;

%nmove(u1-u12,col=6,fmt=5.2) ;

%nmove(p1-p12,col=7,fmt=5.3) ;

 

run ;

 


Output

 

 

                       Cause of Death By Treatment Group in Study X

 

 

                                 Events in Group     Hazard

Cause of Death                      A        B        Ratio   95% LB   95% UB     P-value

-------------------------------  -------  -------    -------  -------  -------    -------

All cardiovascular                 1114     1160       0.96     0.88     1.04      0.302

  CHD                               767      827       0.93     0.84     1.02      0.121

    Acute MI                        338      397       0.85     0.73     0.98      0.024

    Other CHD                       429      430       1.00     0.87     1.14      0.985

  Cardiac dysrhythmias               35       35       1.00     0.62     1.60      0.995

  Hypertensive heart disease         28       30       0.92     0.55     1.54      0.749

  Other hypertensive                 13       11       1.16     0.52     2.60      0.712

  Cerebrovascular                   107      105       1.02     0.78     1.33      0.887

    Subarachnoid hemorrhage           8       13       0.62     0.26     1.50      0.287

    Intracerebral hemorrhage         22       24       0.92     0.51     1.64      0.772

    Other cerebrovascular            77       68       1.13     0.82     1.57      0.457

  Other cardiovascular              164      152       1.07     0.86     1.34      0.526