The Output Delivery System in SAS n54703.018.5
SAS produces very utilitarian output. It is complete and well labelled,
but it is not concise, and ordinarily it is not something you would
want to include in a data report. Typically you need to extract the essential
parameters and tests by hand and type up the findings into a nicely formatted,
easily readable document. Worse yet, every time you run a program on an
updated file, you have to transcribe the results and do all the typing and
formatting over again. This process can also easily lead to errors.
Can you get SAS to produce nicer tables? Can you write macros that
produce nicely formatted, readable tables that just contain the essentials
of what you want to say?
Here is an example. You have data in the form of a collection of
2 x 2 tables. You want to summarize the data efficiently and produce
a report which includes the Fisher Exact Test for association of the
row variable and the column variable. In other words, you want a little
report which looks like this:
----------------------------------------------------------------------------------
Date : 24APR05 Program: freqoutput.sas
Test for Association of
Exposure to Benzene with
Incidence of Cancer
Cancer cases exposed: 120
Cancer cases not exposed: 1148
Control cases exposed: 52
Control cases not exposed: 1055
Odds ratio of exposure,
cases vs. controls: 120*1055/(52*1148) = 2.120
Fisher exact test (two-sided) p-value: 7.515E-06
---------------------------------------------------------------------------------
You know that you can get all the statistics that you need for this table
from PROC FREQ. You could simply copy down the numbers and type the report.
This is tiresome, especially if you have to do it over and over again for
a variety of variables. It would be better to write a program to do it.
The question is, how might you extract the values computationally and produce
a table like the one shown above?
In Version 8 and later versions of SAS, most procedures support a feature that
was not present in earlier versions, called the 'Output Delivery System' (ODS). Among
other things, ODS can write the tables that a procedure produces (parameter estimates,
descriptive statistics, and results of tests of hypotheses) to SAS datasets. Because
these statistics are stored in SAS datasets, you can read them in later steps and
incorporate the results into tables and graphs.
You can find out what ODS datasets a given procedure produces by referring
to the SAS manual for that procedure or to the SAS help pages. For example, for
PROC FREQ, you start with the SAS OnlineDoc, Version 8, and go through the following
links:
SAS OnlineDoc --> SAS/STAT User's Guide --> The FREQ Procedure --> Details --> ODS Table Names
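If you do not have the documentation at hand, SAS can also report the table
names itself. The 'ods trace on' statement makes SAS write a short record to
the log for every output table that a procedure produces. The 'Name:' field of
each record is the name to use in an 'ods output' statement. The following is
only a minimal sketch; it uses the same 2 x 2 counts as the report above.
---------------------------------------------------------------------------------
data twoxtwo ;
   input case exposed count ;
   cards ;
1 1 120
1 0 1148
0 1 52
0 0 1055
;
run ;

ods trace on ;      /* start writing ODS table records to the log */

proc freq data = twoxtwo ;
   weight count ;
   tables case * exposed / chisq ;
   exact or ;
run ;

ods trace off ;     /* stop tracing */
---------------------------------------------------------------------------------
Table names can differ between SAS releases, so the trace output from your
own installation is the list to trust.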
Here is how the ODS can be used to produce reports:
1. Specify which ODS dataset you want to use within the procedure you are running.
2. After the procedure has run, examine that dataset.
3. Use the information in that dataset to produce a report, probably also
using the PUT statement.
Below is an example of the use of PROC FREQ to produce a report as described above:
=================================================================================
options linesize = 80 ;
footnote "~john-c/5421/freqoutput.sas &sysdate &systime" ;
data twoxtwo ;
input case exposed count ;
cards ;
1 1 120
1 0 1148
0 1 52
0 0 1055
;
run ;
ods trace on ;
proc freq data = twoxtwo ;
weight count ;
tables case * exposed / chisq ;
exact or ;
ods output CrossTabFreqs = freqs ;
ods output OddsRatioCL = odds ;
ods output FishersExact = fisher ;
title 'Example of the use of output statistics from PROC FREQ' ;
run ;
proc print data = freqs ;
title1 'Example of the use of output statistics from PROC FREQ' ;
title2 'Counts in the tables ...' ;
run ;
proc print data = fisher ;
title1 'Example of the use of output statistics from PROC FREQ' ;
title2 'Fisher exact test p-values ...' ;
run ;
proc print data = odds ;
title1 'Example of the use of output statistics from PROC FREQ' ;
title2 'Odds ratio confidence limits...' ;
run ;
options linesize = 160 ;
data fishodds ;
set freqs fisher odds ;
run ;
proc print data = fishodds ;
title1 'Print of combined output dataset ...' ;
run ;
/* Collect the statistics observation by observation, then write the report. */
data fishodds ; set fishodds ;
retain n00 n01 n10 n11 fisherpvalue2 oddsratio ;
file 'freqoutput.out' ;
/* Cell counts come from the CrossTabFreqs rows (observations 1 to 9). */
if case eq 0 and exposed eq 0 then n00 = Frequency ;
if case eq 0 and exposed eq 1 then n01 = Frequency ;
if case eq 1 and exposed eq 0 then n10 = Frequency ;
if case eq 1 and exposed eq 1 then n11 = Frequency ;
/* Observation 15 holds the two-sided Fisher p-value, observation 16 the odds ratio. */
if _n_ eq 15 then fisherpvalue2 = cValue1 ;
if _n_ eq 16 then oddsratio = cValue1 ;
/* Observation 24 is the last one in the combined dataset: write the report. */
if _n_ eq 24 then do ;
put " Date : &sysdate Program: freqoutput.sas" ;
put ' ' ;
put ' Test for Association of' ;
put ' Exposure to Benzene with' ;
put ' Incidence of Cancer' ;
put ' ' ;
put ' Cancer cases exposed : ' n11 4.0 ;
put ' Cancer cases not exposed : ' n10 4.0 ;
put ' Control cases exposed : ' n01 4.0 ;
put ' Control cases not exposed : ' n00 4.0 ;
put ' ' ;
put ' Odds ratio of exposure, ' ;
put ' cases vs. controls: ' n11 4.0 '*' n00 4.0 '/(' n10 4.0 '*' n01 4.0 ') = ' @ ;
put oddsratio 9.3 ;
put ' ' ;
put ' Fisher Exact Test (two-sided) p-value: ' fisherpvalue2 ;
end ;
run ;
endsas ;
=================================================================================
******* freqoutput.lst:
=================================================================================
Example of the use of output statistics from PROC FREQ 1
17:27 Sunday, April 24, 2005
The FREQ Procedure
Table of case by exposed
case exposed
Frequency|
Percent |
Row Pct |
Col Pct | 0| 1| Total
---------+--------+--------+
0 | 1055 | 52 | 1107
| 44.42 | 2.19 | 46.61
| 95.30 | 4.70 |
| 47.89 | 30.23 |
---------+--------+--------+
1 | 1148 | 120 | 1268
| 48.34 | 5.05 | 53.39
| 90.54 | 9.46 |
| 52.11 | 69.77 |
---------+--------+--------+
Total 2203 172 2375
92.76 7.24 100.00
Statistics for Table of case by exposed
Statistic DF Value Prob
------------------------------------------------------
Chi-Square 1 19.9875 <.0001
Likelihood Ratio Chi-Square 1 20.6366 <.0001
Continuity Adj. Chi-Square 1 19.2842 <.0001
Mantel-Haenszel Chi-Square 1 19.9791 <.0001
Phi Coefficient 0.0917
Contingency Coefficient 0.0914
Cramer's V 0.0917
Fisher's Exact Test
----------------------------------
Cell (1,1) Frequency (F) 1055
Left-sided Pr <= F 1.0000
Right-sided Pr >= F 4.164E-06
Table Probability (P) 2.264E-06
Two-sided Pr <= P 7.515E-06
~john-c/5421/freqoutput.sas 24APR05 17:27
Example of the use of output statistics from PROC FREQ 2
17:27 Sunday, April 24, 2005
The FREQ Procedure
Statistics for Table of case by exposed
Estimates of the Relative Risk (Row1/Row2)
Type of Study Value 95% Confidence Limits
-----------------------------------------------------------------
Case-Control (Odds Ratio) 2.1207 1.5156 2.9675
Cohort (Col1 Risk) 1.0526 1.0297 1.0762
Cohort (Col2 Risk) 0.4964 0.3621 0.6803
Odds Ratio (Case-Control Study)
-----------------------------------
Odds Ratio 2.1207
Asymptotic Conf Limits
95% Lower Conf Limit 1.5156
95% Upper Conf Limit 2.9675
Exact Conf Limits
95% Lower Conf Limit 1.5014
95% Upper Conf Limit 3.0281
Sample Size = 2375
~john-c/5421/freqoutput.sas 24APR05 17:27
---------------------------------------------------------------------------------------------------
Example of the use of output statistics from PROC FREQ 3
Counts in the tables ...
17:27 Sunday, April 24, 2005
Obs Table case exposed _TYPE_ _TABLE_ Frequency Percent RowPercent ColPercent Missing
1 case_by_exposed 0 0 11 1 1055 44.421 95.3026 47.8892 .
2 case_by_exposed 0 1 11 1 52 2.189 4.6974 30.2326 .
3 case_by_exposed 0 . 10 1 1107 46.611 . . .
4 case_by_exposed 1 0 11 1 1148 48.337 90.5363 52.1108 .
5 case_by_exposed 1 1 11 1 120 5.053 9.4637 69.7674 .
6 case_by_exposed 1 . 10 1 1268 53.389 . . .
7 case_by_exposed . 0 01 1 2203 92.758 . . .
8 case_by_exposed . 1 01 1 172 7.242 . . .
9 case_by_exposed . . 00 1 2375 100.000 . . 0
~john-c/5421/freqoutput.sas 24APR05 17:27
---------------------------------------------------------------------------------------------------
Example of the use of output statistics from PROC FREQ 4
Fisher exact test p-values ...
17:27 Sunday, April 24, 2005
Obs Table Label1 cValue1 nValue1
1 case_by_exposed Cell (1,1) Frequency (F) 1055 1055.000000
2 case_by_exposed Left-sided Pr <= F 1.0000 0.999998
3 case_by_exposed Right-sided Pr >= F 4.164E-06 0.000004164
4 case_by_exposed .
5 case_by_exposed Table Probability (P) 2.264E-06 0.000002264
6 case_by_exposed Two-sided Pr <= P 7.515E-06 0.000007515
~john-c/5421/freqoutput.sas 24APR05 17:27
---------------------------------------------------------------------------------------------------
Example of the use of output statistics from PROC FREQ 5
Odds ratio confidence limits...
17:27 Sunday, April 24, 2005
Obs Table Label1 cValue1 nValue1
1 case_by_exposed Odds Ratio 2.1207 2.120745
2 case_by_exposed .
3 case_by_exposed Asymptotic Conf Limits .
4 case_by_exposed 95% Lower Conf Limit 1.5156 1.515584
5 case_by_exposed 95% Upper Conf Limit 2.9675 2.967543
6 case_by_exposed .
7 case_by_exposed Exact Conf Limits .
8 case_by_exposed 95% Lower Conf Limit 1.5014 1.501429
9 case_by_exposed 95% Upper Conf Limit 3.0281 3.028071
~john-c/5421/freqoutput.sas 24APR05 17:27
---------------------------------------------------------------------------------------------------
Print of combined output dataset ... 17:27 Sunday, April 24, 2005 6
Obs Table case exposed _TYPE_ _TABLE_ Frequency Percent RowPercent ColPercent Missing Label1 cValue1 nValue1
1 case_by_exposed 0 0 11 1 1055 44.421 95.3026 47.8892 . .
2 case_by_exposed 0 1 11 1 52 2.189 4.6974 30.2326 . .
3 case_by_exposed 0 . 10 1 1107 46.611 . . . .
4 case_by_exposed 1 0 11 1 1148 48.337 90.5363 52.1108 . .
5 case_by_exposed 1 1 11 1 120 5.053 9.4637 69.7674 . .
6 case_by_exposed 1 . 10 1 1268 53.389 . . . .
7 case_by_exposed . 0 01 1 2203 92.758 . . . .
8 case_by_exposed . 1 01 1 172 7.242 . . . .
9 case_by_exposed . . 00 1 2375 100.000 . . 0 .
10 case_by_exposed . . . . . . . . Cell (1,1) Frequency (F) 1055 1055.000000
11 case_by_exposed . . . . . . . . Left-sided Pr <= F 1.0000 0.999998
12 case_by_exposed . . . . . . . . Right-sided Pr >= F 4.164E-06 0.000004164
13 case_by_exposed . . . . . . . . .
14 case_by_exposed . . . . . . . . Table Probability (P) 2.264E-06 0.000002264
15 case_by_exposed . . . . . . . . Two-sided Pr <= P 7.515E-06 0.000007515
16 case_by_exposed . . . . . . . . Odds Ratio 2.1207 2.120745
17 case_by_exposed . . . . . . . . .
18 case_by_exposed . . . . . . . . Asymptotic Conf Limits .
19 case_by_exposed . . . . . . . . 95% Lower Conf Limit 1.5156 1.515584
20 case_by_exposed . . . . . . . . 95% Upper Conf Limit 2.9675 2.967543
21 case_by_exposed . . . . . . . . .
22 case_by_exposed . . . . . . . . Exact Conf Limits .
23 case_by_exposed . . . . . . . . 95% Lower Conf Limit 1.5014 1.501429
24 case_by_exposed . . . . . . . . 95% Upper Conf Limit 3.0281 3.028071
~john-c/5421/freqoutput.sas 24APR05 17:27
=================================================================================
Printout of the file 'freqoutput.out' :
=================================================================================
Date : 24APR05 Program: freqoutput.sas
Test for Association of
Exposure to Benzene with
Incidence of Cancer
Cancer cases exposed : 120
Cancer cases not exposed : 1148
Control cases exposed : 52
Control cases not exposed : 1055
Odds ratio of exposure,
cases vs. controls: 120*1055/(1148* 52) = 2.120
Fisher Exact Test (two-sided) p-value: 7.515E-06
=================================================================================
Notes:
1. Note the three 'ods' statements in proc freq:
---------------------------------------------------------------------------------
proc freq data = twoxtwo ;
weight count ;
tables case * exposed / chisq ;
exact or ;
ods output CrossTabFreqs = freqs ;
ods output OddsRatioCL = odds ;
ods output FishersExact = fisher ;
title 'Example of the use of output statistics from PROC FREQ' ;
run ;
---------------------------------------------------------------------------------
These create datasets from three of the 'Tables' that can be produced by
the ODS for proc freq. They are temporary SAS files which can be accessed
later in the program.
2. Note that I have used proc print for each of these datasets to find out
what variables are in it and what their format is.
3. Note that I concatenated all the ods datasets together into one dataset, and
printed the result:
---------------------------------------------------------------------------------
data fishodds ;
set freqs fisher odds ;
run ;
proc print data = fishodds ;
title1 'Print of combined output dataset ...' ;
run ;
---------------------------------------------------------------------------------
This lets me see which observation of the combined dataset holds each of the
statistics that I want to print.
4. Note that in the data step with all the 'put' statements, I have used the
SAS observation counter, _n_, to identify the lines containing the desired
statistics.
5. I print all the results when I get to the last line in the file. This means
that I must collect the statistics that I want as I read through the previous
lines. These statistics are: n00, n01, n10, n11, fisherpvalue2, and oddsratio.
All of these are included in the 'retain' statement in this data step.
6. I use the 'put' statement extensively here. The output from the 'put'
statements goes onto a file called 'freqoutput.out', as specified in the 'file'
statement early in this data step.
7. Note that the 'put' statements specify formatted output. For example,
put x 6.2 ;
has the effect of printing x with 6 characters (counting the decimal point),
and 2 digits behind the decimal point.
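As a small illustration of Note 7, the following step writes one value with
three different formats. It is a sketch only: the variable x and its value are
made up, and because there is no 'file' statement the 'put' output goes to the
SAS log.
---------------------------------------------------------------------------------
data _null_ ;
   x = 3.14159 ;
   put 'x with 6.2 : ' x 6.2 ;   /* 6 columns, 2 decimals : '  3.14'   */
   put 'x with 8.4 : ' x 8.4 ;   /* 8 columns, 4 decimals : '  3.1416' */
   put 'x with 4.0 : ' x 4.0 ;   /* 4 columns, 0 decimals : '   3'     */
run ;
---------------------------------------------------------------------------------
The value is rounded to the requested number of decimals and right-aligned
within the field width.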
=================================================================================
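A remark on Notes 3 to 5: the observation numbers 15, 16 and 24 are specific to
this run, and they will change if a procedure option adds or removes a row in
one of the ODS datasets. What follows is only a sketch of one alternative, not
the method used above. It assumes the datasets freqs, fisher and odds created by
the program above. It picks out rows by the text in Label1 rather than by _n_,
reads the numeric column nValue1 rather than the character column cValue1, and
uses the 'end=' option of the 'set' statement to detect the last observation.
The names infreq, infish, inodds, last and fisherp, and the file name
'freqoutput2.out', are made up for this illustration.
---------------------------------------------------------------------------------
data _null_ ;
   set freqs  (in = infreq)
       fisher (in = infish)
       odds   (in = inodds)  end = last ;
   retain n00 n01 n10 n11 fisherp oddsratio ;
   file 'freqoutput2.out' ;

   /* cell counts: rows of freqs where both case and exposed are present */
   if infreq and case eq 0 and exposed eq 0 then n00 = Frequency ;
   if infreq and case eq 0 and exposed eq 1 then n01 = Frequency ;
   if infreq and case eq 1 and exposed eq 0 then n10 = Frequency ;
   if infreq and case eq 1 and exposed eq 1 then n11 = Frequency ;

   /* index() returns the position of the substring, or 0 if it is absent */
   if infish and index(Label1, 'Two-sided')  gt 0 then fisherp   = nValue1 ;
   if inodds and index(Label1, 'Odds Ratio') gt 0 then oddsratio = nValue1 ;

   /* last is 1 only on the final observation of the last dataset */
   if last then do ;
      put ' Cancer cases exposed      : ' n11 4.0 ;
      put ' Cancer cases not exposed  : ' n10 4.0 ;
      put ' Control cases exposed     : ' n01 4.0 ;
      put ' Control cases not exposed : ' n00 4.0 ;
      put ' Odds ratio                : ' oddsratio 9.3 ;
      put ' Fisher two-sided p-value  : ' fisherp e10. ;
   end ;
run ;
---------------------------------------------------------------------------------
Matching on label text has its own risk: the labels can change between SAS
releases, so check them with proc print first, as in Note 2.
=================================================================================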
PROBLEM 1:
Use the Output Delivery System and "put" statements to produce a
report. In this case the objective is to summarize data on means,
standard deviations, and other statistics for a comparison of two
groups, with a quantitative outcome [blood pressure, for example].
The procedure that you will want to use for this is proc ttest.
----------------------------------------------------------------------
Comparison of Two Groups:
Outcome variable : Diastolic Blood Pressure
Sample Standard
Group Size Mean Deviation Range
----- ------ ------ --------- ---------------
A 112 85.6 10.0 (58, 104)
B 108 90.2 12.7 (62, 110)
----- ------ ------ --------- ---------------
Total 220 88.3 11.6 (58, 110)
Difference of Means : -4.6
Standard Error : 1.8
T statistic : -2.45
Degrees of freedom : 218
Two-sided p-value : .012
-----------------------------------------------------------------------
Illustrate the use of this program with a small dataset.
=================================================================================
n54703.018.5 Last update: April 26, 2005.