PubH 5470-3  Statistical Analysis Using SAS Procedures                         page 1 of 4

Exam 1 - March 24, 2004                                Name: _____________________________
==========================================================================================
1.  Given the following program, use the space below to show what the output will
    look like:
 ---------------------------------------------------------------------------------
    footnote "~fred/myprog.sas &sysdate &systime" ;

    data dset1 ;
         input id x y ;
    cards ;
    1  2.1  3.4
    2  3.2  4.5
    3  4.8  6.6
    ;
    run ;

    data dset2 ;
         input id y z ;
    cards ;
    1  8.8  9.9
    3  7.7  5.5
    4  1.0  2.0
    ;
    run ;

      data dset3 ;
           set dset1 dset2 ;
      run ;

      data dset4 ;
           merge dset1 dset2 ; by id ;

[12]   proc print data = dset3 ;
       title1 'PROC PRINT: data = dset3' ;

[13]   proc print data = dset4 ;
       title1 'PROC PRINT: data = dset4' ;

 endsas ;
 =================================================================================

                            PROC PRINT: data = dset3                           1
                                                  18:47 Thursday, March 25, 2004

                         OBS    ID     X      Y      Z

                          1      1    2.1    3.4     . 
                          2      2    3.2    4.5     . 
                          3      3    4.8    6.6     . 
                          4      1     .     8.8    9.9
                          5      3     .     7.7    5.5
                          6      4     .     1.0    2.0
 
                         ~fred/myprog.sas 25MAR04 18:47

                             PROC PRINT: data = dset4                           2
                                                  18:47 Thursday, March 25, 2004

                         OBS    ID     X      Y      Z

                          1      1    2.1    8.8    9.9
                          2      2    3.2    4.5     . 
                          3      3    4.8    7.7    5.5
                          4      4     .     1.0    2.0
 
 
                         ~fred/myprog.sas 25MAR04 18:47

 =================================================================================


2.  An experiment was conducted in which depressed people were each given one of 2
    drugs: A or B.  After one month of taking the drugs, each person was asked
    whether they felt less depressed than before they started taking the drugs.
    They could answer either Yes or No.

    The results were as follows:

      Drug A: 10 out of 50 people said 'Yes'.

      Drug B: 20 out of 50 people said 'Yes'.

    a)  Write a complete SAS program which uses PROC FREQ to analyze this study.

 ---------------------------------------------------------------------------------
        data ab ;
             input drug $ outcome $ count ;
        cards ;
        A    Yes    10
[8]     A    No     40
        B    Yes    20
        B    No     30
        ;
        run ;

        proc freq data = ab ;
             weight count ;
             tables outcome * drug /chisq measures ;
        run ;
 ---------------------------------------------------------------------------------

    b)  What is the null hypothesis?  How is it tested by PROC FREQ?

        Null hypothesis is that the proportion who say 'Yes' is the
[4]     same in both groups.  Or, that the odds ratio is 1.
        PROC FREQ tests it with the chi-square test requested by
        the 'chisq' option.


    c)  What option causes PROC FREQ to print an odds ratio ?

        'measures'  or 'CMH'
[2]


    d)  What will the odds ratio be in this case?


[5]     2.67 (for B vs A) or .375 (for A vs B)


    e)  What is the meaning of the odds ratio?

        Literally, it means that the odds of a Yes outcome in
[6]     the B group are 2.67 times as large as the odds of a Yes outcome
        in the A group.  An odds ratio greater than 1 means that
        the outcome of interest is more likely in the first group
        than in the second group.  Odds is defined as prob/(1 - prob).
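
        A minimal sketch of the arithmetic behind the odds ratio in d),
        using a DATA step that reads no data.  The variable names are
        illustrative only and are not part of the exam program.

 ---------------------------------------------------------------------------------
        data _null_ ;
             p_a = 10/50 ;                      /* proportion Yes, drug A */
             p_b = 20/50 ;                      /* proportion Yes, drug B */
             odds_a = p_a / (1 - p_a) ;         /* 0.20/0.80 = 0.25       */
             odds_b = p_b / (1 - p_b) ;         /* 0.40/0.60 = 0.667      */
             ratio_ba = odds_b / odds_a ;       /* 2.67  (B vs A)         */
             ratio_ab = odds_a / odds_b ;       /* 0.375 (A vs B)         */
             put odds_a= odds_b= ratio_ba= ratio_ab= ;
        run ;
 ---------------------------------------------------------------------------------

        The PUT statement writes the four values to the SAS log, where
        they can be checked against the odds ratio printed by PROC FREQ.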


==========================================================================================
3.  A datafile has the following structure:

              Diastolic                                    Coronary
    ID     Blood Pressure       Age        Diabetes        Artery Disease
    ----   --------------      -----      ----------      ---------------
     1          90               44           No                Yes
     2          68               71           Yes               Yes
     3          96               62           Yes               No
     4         100               80           No                Yes
     5         100               80           No                No

     etc       etc              etc          etc               etc

------------------------------------------------------------------------------------------
    The objective is to see whether diabetes predicts coronary artery disease.

    Age and blood pressure are known predictors of coronary artery disease.
    There may be interactions of diabetes with blood pressure and age.

    a)  Write SAS code (below) which uses PROC LOGISTIC to analyze this dataset.

------------------------------------------------------------------------------------------
        data cad ;
             infile 'cad.data' ;
             input id  dbp   age   diabetes $  cad $ ;

        ndiab = . ;  
          if diabetes = 'Yes' then ndiab = 1 ;
          if diabetes = 'No' then ndiab = 0 ;
        ncad = . ; 
          if cad eq 'Yes' then ncad = 1 ;
          if cad eq 'No' then ncad = 0 ;

        run ;

        proc logistic descending data = cad ;
[10]         model ncad = dbp age ndiab ;
            title1 'Model 1: dbp age diabetes' ;
        run ;
------------------------------------------------------------------------------------------

    b)  Suppose the coefficient of diabetes is 0.50, with a 95% confidence interval
        from 0.15 to 0.85.  How do you interpret this?

        It means that the odds of coronary artery disease for a person
[6]     who has diabetes are exp(.50) = 1.65 times the odds for a person who
        does not have diabetes, and that a 95% confidence interval for
        the true odds ratio is (1.16, 2.34).
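
        The conversion from the coefficient scale to the odds ratio
        scale can be sketched as below.  This is only a check of the
        arithmetic; the variable names are made up for illustration.

------------------------------------------------------------------------------------------
        data _null_ ;
             beta  = 0.50 ;                 /* coefficient of diabetes   */
             lower = 0.15 ;                 /* lower 95% limit for beta  */
             upper = 0.85 ;                 /* upper 95% limit for beta  */
             oddsratio = exp(beta) ;        /* 1.65 */
             or_lower  = exp(lower) ;       /* 1.16 */
             or_upper  = exp(upper) ;       /* 2.34 */
             put oddsratio= 6.2 or_lower= 6.2 or_upper= 6.2 ;
        run ;
------------------------------------------------------------------------------------------

        Because the interval for the odds ratio does not include 1,
        the diabetes effect is significant at the 5% level.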

    c)  How would you test for an interaction of diabetes with age?

        1.  Add a new variable to the dataset:

            agediab = age*ndiab ;
[9]
        2.  Add another proc logistic analysis:

            proc logistic descending data = cad ;
                 model ncad = dbp age ndiab agediab ;
            title1 'Model 2: dbp age diabetes and age * diab intxn' ;
            run ;

        3.  Compare the difference in -2 Log L between the two models
            with a chi-square distribution with 1 degree of freedom.
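
        The comparison in step 3 could be carried out as sketched
        below.  The two -2 Log L values are hypothetical placeholders,
        not results from this dataset; in practice they would be copied
        from the Model 1 and Model 2 printouts.

------------------------------------------------------------------------------------------
        data _null_ ;
             m2ll_1 = 250.0 ;      /* hypothetical -2 Log L, Model 1 */
             m2ll_2 = 245.2 ;      /* hypothetical -2 Log L, Model 2 */
             chisq  = m2ll_1 - m2ll_2 ;           /* likelihood ratio statistic */
             df     = 1 ;                         /* one added term: agediab    */
             pvalue = 1 - probchi(chisq, df) ;    /* upper-tail probability     */
             put chisq= df= pvalue= ;
        run ;
------------------------------------------------------------------------------------------

        A small p-value would suggest that the effect of diabetes
        depends on age.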


==========================================================================================
4.  The printout below was generated by PROC REG.

    a)  Write the PROC REG code which could have produced this.

        proc reg data = dataset ;
             model y = x ;
[6]     run ;


    b)  Sketch a graph of the regression line.

      14.0|                *
          |              *
          |            *
[7]       |          *
          |        *   <----   Line has intercept 9.08, slope 1.4.
          |      *
          |    *
          |  *
          |*
       9.0----------------------
          0                3.6

    c)  Fill in the blanks below.  How many observations were there?

          Observations: 12.
[12]


------------------------------------------------------------------------------------------

Model: MODEL1  
Dependent Variable: Y                                                  

                              Analysis of Variance

                                 Sum of         Mean
        Source          DF      Squares       Square      F Value       Prob>F

        Model            1    ##  45.00     45.00000      ## 2.8125     0.1245
        Error           10    ## 160.00     16.00000
        C Total         11    ## 205.00

            Root MSE         ##  4.00     R-square  ## 45/205 = .2195

                              Parameter Estimates

                       Parameter      Standard    T for H0:               
      Variable  DF      Estimate         Error   Parameter=0    Prob > |T|

      INTERCEP   1      9.080000    1.25000000      ##  7.264       0.0001
      X          1      1.400000    0.25000000      ##  5.600       0.1245
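
------------------------------------------------------------------------------------------
        The filled-in blanks can be checked with the sketch below.  It
        assumes the blanks were the entries marked ## and that the DF,
        Mean Square, Estimate and Standard Error columns were given.
        The variable names are illustrative only.

        data _null_ ;
             df_model = 1 ;   df_error = 10 ;
             ms_model = 45 ;  ms_error = 16 ;
             ss_model = ms_model * df_model ;       /* 45     */
             ss_error = ms_error * df_error ;       /* 160    */
             ss_total = ss_model + ss_error ;       /* 205    */
             f        = ms_model / ms_error ;       /* 2.8125 */
             rootmse  = sqrt(ms_error) ;            /* 4      */
             rsquare  = ss_model / ss_total ;       /* .2195  */
             t_int    = 9.08 / 1.25 ;               /* 7.264  */
             t_x      = 1.40 / 0.25 ;               /* 5.600  */
             nobs     = df_model + df_error + 1 ;   /* 12     */
             put ss_model= ss_error= ss_total= f= rootmse= ;
             put rsquare= t_int= t_x= nobs= ;
        run ;
------------------------------------------------------------------------------------------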