December 17, 2003                                           page 1 of 6

 SPH 5421 Final Examination         Name: ________________________________________
 =================================================================================

 1.  The random variable X has the distribution specified by:

         prob(X = n) = 1 / 2^n,

     where n = 1, 2, 3, ....

     Write a SAS program (not using any SAS procedures) which

     (1) Generates 100 independent observations
         from this distribution.

     (2) Computes the mean of the observations.

     (3) Computes the variance of the observations.


[25]

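options linesize = 80 ;

    data geom ;

    n = 100 ;      /* number of observations to generate    */
    sum = 0 ;      /* running sum of the observations       */
    sumsq = 0 ;    /* running sum of squared observations   */

    do i = 1 to n ;

       r = ranuni(-1) ;   /* uniform(0, 1) random number          */
       sumprob = 0 ;      /* cumulative probability prob(X <= m)  */
       m = 1 ;

       /* inverse-CDF method: X is the smallest m with r < prob(X <= m) */
       do j = 1 to 100 ;

          sumprob = sumprob + 1 / 2**m ;
          if r < sumprob then goto jump1 ;
          m = m + 1 ;

       end ;

    jump1:

       sum = sum + m ;
       sumsq = sumsq + m*m ;
       output ;

    end ;

    /* sample mean and sample variance (denominator n - 1) */
    mean = sum / n ;
    var  = (sumsq - n*mean*mean)/(n - 1) ;
    output ;

 run ;

 proc print ;
 run ;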
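
 As a cross-check outside SAS, the same inverse-CDF idea can be sketched
 in Python. This is only an illustration: the function names draw_x and
 simulate and the seed value are assumptions, not part of the exam. For
 this distribution the theoretical mean and variance are both 2, so the
 sample values should land near 2.

```python
import random


def draw_x(r):
    """Map a uniform(0, 1) value r to X, where prob(X = n) = 1 / 2**n."""
    cum_prob = 0.0
    m = 1
    for _ in range(100):          # same cap of 100 terms as the SAS loop
        cum_prob += 1.0 / 2 ** m  # cum_prob is now prob(X <= m)
        if r < cum_prob:
            break
        m += 1
    return m


def simulate(n=100, seed=5421):   # the seed is an arbitrary, hypothetical choice
    rng = random.Random(seed)
    xs = [draw_x(rng.random()) for _ in range(n)]
    mean = sum(xs) / n
    var = (sum(x * x for x in xs) - n * mean * mean) / (n - 1)
    return xs, mean, var


xs, mean, var = simulate()
print(mean, var)
```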



 2.  Let X1, X2, ..., Xn be a sample of observations of the random
     variable X.

     Define the LOWEST TERTILE to be the [n/3] smallest values in the
     sample, where [n/3] is the largest integer less than or equal
     to n/3.

     Define the HIGHEST TERTILE to be the [n/3] largest values.

     Define the "1/3 trimmed mean" to be the mean of the sample after
     the lowest tertile and highest tertile are thrown out.

     Write a SAS macro to compute the 1/3 trimmed mean of a sample.
     The call to the macro should look like the following:

     %trim3 (dataset, n, x, tmean),   where

     dataset = a data set that includes the values for x
     n       = number of observations in the dataset (you can assume
               none are missing)
     x       = the variable of interest
     tmean   = output trimmed mean.


[25]


%macro trim3 (dataset, n, x, tmean) ;

 proc sort data = &dataset ; by &x ;

 data xsort ;
      retain xobs 0 ;
      set &dataset ;

      xobs = xobs + 1 ;

 run ;

 proc means data = xsort n mean std ;
      /* keep ranks [n/3] + 1 through n - [n/3], so both tertiles are dropped */
      where xobs gt int(&n / 3) and xobs le &n - int(&n / 3) ;
      var &x ;
      output out = xmean
             mean = &tmean ;
 run ;

 %mend ;
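
 The trimming rule itself can be checked with a few lines of Python.
 This is a sketch under the definitions given in the question; the
 function name trim3 is borrowed from the macro, and the sample values
 are made up for illustration.

```python
def trim3(values):
    """Mean after dropping the [n/3] smallest and [n/3] largest values."""
    n = len(values)
    k = n // 3                        # [n/3], size of each tertile
    kept = sorted(values)[k:n - k]    # ranks k + 1 through n - k
    return sum(kept) / len(kept)


# Hypothetical sample of 10 values: ranks 4 through 7 are kept.
print(trim3([10, 3, 7, 1, 9, 2, 8, 4, 6, 5]))
```

 With n = 10 the upper cutoff is n - [n/3] = 7, not [2n/3] = 6, which is
 why the two expressions differ when n is not a multiple of 3.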



 3.  A datafile has the following structure:

     OBS       ID       X
    -----     ----     ---
      1         1       16
      2         1       15
      3         1       18
      4         2        4
      5         2        7
      6         2        2
      7         3       X7
      8         3       X8
      9         3       X9
     10         4       X10
     11         4       X11
     12         4       X12
     13         5       X13
     14         5       X14
     15         5       X15

              etc.

    That is, there are 3 consecutive observations for each ID.

    Write a SAS program which reads in this datafile and writes
    out another datafile which has the following structure:

    OBS        ID       R     S     T
   -----      ----     ---   ---   ---
     1          1       X1    X2    X3
     2          2       X4    X5    X6
     3          3       X7    X8    X9

                  etc.

[25]

  data xobs ;
       retain casecount 0 r s ;      /* carry r and s across input records */
       keep ID r s t ;
       input ID x ;
       casecount = casecount + 1 ;
       if casecount = 1 then r = x ;
       if casecount = 2 then s = x ;
       if casecount = 3 then do ;

          t = x ;
          output ;                   /* one output record per ID */
          casecount = 0 ;

       end ;

  run ;

  proc print ;

  endsas ;
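
 The same long-to-wide regrouping can be sketched in Python to check the
 logic on the six rows whose values are given in the question. The
 function name long_to_wide is an assumption made for this illustration.

```python
def long_to_wide(rows):
    """rows: list of (id, x) pairs with three consecutive rows per id."""
    wide = []
    for i in range(0, len(rows) - 2, 3):
        (case_id, r), (_, s), (_, t) = rows[i:i + 3]
        wide.append((case_id, r, s, t))
    return wide


rows = [(1, 16), (1, 15), (1, 18), (2, 4), (2, 7), (2, 2)]
for record in long_to_wide(rows):
    print(record)
```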


 4.  A program produces maximum likelihood estimates  s  and t  of
     two parameters S and T.  It also produces a covariance matrix
     A for s and t:

               |  .02    -.01 |
           A = |              |
               | -.01     .03 |.

     Find an estimated standard error of  r = s^2 + 3 * s * t.


[25]

    var(r) = (approx) (dr/ds)^2 * var(s) + 2*(dr/ds) * (dr/dt) * cov(s, t)
              + (dr/dt)^2 * var(t)

            = (2*s + 3*t)^2*(.02) + 2*(2*s + 3*t)*(3*s)*(-.01) + (3*s)^2 * (.03)

    se(r)  = sqrt(var(r)), evaluated at the estimates s and t.
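
 A short Python sketch of this delta-method calculation follows. The
 question gives no numerical values for s and t, so the values s = 1 and
 t = 2 used here are purely hypothetical, as is the function name se_r.

```python
import math

# Covariance matrix A for (s, t), as given in the question.
A = [[0.02, -0.01],
     [-0.01, 0.03]]


def se_r(s, t):
    """Delta-method standard error of r = s^2 + 3*s*t."""
    dr_ds = 2 * s + 3 * t
    dr_dt = 3 * s
    var_r = (dr_ds ** 2 * A[0][0]
             + 2 * dr_ds * dr_dt * A[0][1]
             + dr_dt ** 2 * A[1][1])
    return math.sqrt(var_r)


# Hypothetical estimates, chosen only to show the arithmetic.
print(se_r(1.0, 2.0))
```

 For these made-up estimates the variance works out to
 64*(.02) + 2*8*3*(-.01) + 9*(.03) = 1.07.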



 5.  Levels of cortisol in a person's blood tend to vary according to
     the time of day that the blood is drawn.  Here is a graph of
     cortisol levels for individuals, plotted against time of day
     on a 24-hour clock:

     |
 .20 |                             xxx xxx
     |                        xxxxx   x   xxx
     |                       x               xx x
 .15 |                     x                   x xx
     |                    x                       xxx  x
     |x              xxxxx                           xx xxxx
 .10 | x x         xx
     |    x       x
     |     xx  xxx
 .05 |_______xx___________________________________________
     0        4        8       12       16       20       24  time t


     A reasonable model for the expected cortisol level might be:

       E(C(t)) = a + b * cos(c + d*pi*t),

     where time t is in hours.


     a)  Describe what the parameters are in terms of the graph.

     b)  Specify further assumptions which are needed to justify using
         a least-squares procedure to obtain estimates of the
         parameters a, b, c, and d.

     c)  What would good initial guesses be for parameters a, b,
         c, and d ?

     d)  Write a PROC NLIN program which produces least-squares
         estimates of the parameters.



[25]

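     a):   a = overall mean level (the midline of the curve)
           b = amplitude (half the distance from trough to peak)
           c = phase offset (shifts the curve along the time axis)
           d = frequency (the period is 2/d hours)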

     b):   C(t) = E(C(t)) + error, where the errors are independent
           and error ~ N(0, sig^2).

     c):  a = .13, b = .08, c = 3.8, d = .08 (about 1/12, giving a 24-hour period) ;

     d):

         proc nlin method = marquardt ;
              parms a = .13
                    b = .08
                    c = 3.8
                    d = .08 ;

              pi = 4 * atan(1) ;     /* pi is not predefined in SAS */

              der.a = 1 ;
              der.b = cos(c + d * pi * t) ;
              der.c = -b * sin(c + d * pi * t) ;
              der.d = - b*pi*t*sin(c + d * pi * t) ;

              f = a + b * cos(c + d * pi * t) ;

              model y = f ;

    run ;

    endsas ;
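
 The analytic derivatives supplied to PROC NLIN can be sanity-checked
 against central finite differences. The Python sketch below does this
 at the initial guesses; the function names and the evaluation time
 t = 10 are assumptions made for the illustration.

```python
import math


def f(a, b, c, d, t):
    """Model for the expected cortisol level at time t (hours)."""
    return a + b * math.cos(c + d * math.pi * t)


def analytic_grad(a, b, c, d, t):
    """Derivatives of f with respect to a, b, c, d (the der. statements)."""
    u = c + d * math.pi * t
    return [1.0,
            math.cos(u),
            -b * math.sin(u),
            -b * math.pi * t * math.sin(u)]


def numeric_grad(a, b, c, d, t, h=1e-6):
    """Central-difference approximation to the same four derivatives."""
    p = [a, b, c, d]
    grad = []
    for i in range(4):
        up, down = p[:], p[:]
        up[i] += h
        down[i] -= h
        grad.append((f(*up, t) - f(*down, t)) / (2 * h))
    return grad


start = (0.13, 0.08, 3.8, 0.08)   # initial guesses from part c)
print(analytic_grad(*start, 10.0))
print(numeric_grad(*start, 10.0))
```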




 6.  Short answers:

     1)  What is an eigenvector?

         Given an n x n  matrix A, an eigenvector v is a nonzero n x 1 column
         vector such that A * v = a * v for some constant a (the eigenvalue).



     2)  What is an advantage of the simplex method of computing
         a minimum of a function ?

         Usually converges, and does not need expressions for derivatives.




     3)  What is a disadvantage of the simplex method?

         Slow to converge, does not automatically give an estimate of variance.





     4)  Suppose f(x) = 5*x - exp(x).  You can find a solution
         to f(x) = 0 by the use of Newton's method.  The key
         equation is

                 x(n + 1) = x(n)  -   ??? / ???.

                 x(n + 1) = x(n) - f(x(n)) / f'(x(n))
                          = x(n) - (5*x(n) - exp(x(n))) / (5 - exp(x(n))) ;
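
         The iteration can be tried out with a short Python sketch. The
         function name newton, the tolerance, and the starting values are
         assumptions chosen for illustration. f(x) = 0 has two solutions,
         one near 0.26 and one near 2.54, and the starting value decides
         which one the iteration finds.

```python
import math


def newton(x0, tol=1e-12, max_iter=50):
    """Newton's method for f(x) = 5*x - exp(x), starting from x0."""
    x = x0
    for _ in range(max_iter):
        step = (5 * x - math.exp(x)) / (5 - math.exp(x))  # f(x) / f'(x)
        x -= step
        if abs(step) < tol:
            break
    return x


root_low = newton(0.0)    # converges to the smaller root
root_high = newton(3.0)   # converges to the larger root
print(root_low, root_high)
```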

     5)  If X has a Poisson distribution with parameter h = 1,
         give the probabilities that:

         X = 0 : h^0 * exp(-h) / 0! = 1/e

         X = 1 : h^1 * exp(-h) / 1! = 1/e

         X = 2 : h^2 * exp(-h)/2! = 1 / (2 * e) ;

[25]