December 17, 2003                                           page 1 of 6

 SPH 5421 Final Examination         Name: ________________________________________

 1.  The random variable X has the distribution specified by:

         prob(X = n) = 1 / 2^n,

     where n = 1, 2, 3, ....

     Write a SAS program (not using any SAS procedures) which

     (1) Generates 100 independent observations
         from this distribution.

     (2) Computes the mean of the observations.

     (3) Computes the variance of the observations.


options linesize = 80 ;

    data geom ;

    n = 100 ;
    sum = 0 ;
    sumsq = 0 ;

    do i = 1 to n ;

       r = ranuni(-1) ;
       sumprob = 0 ;
       m = 1 ;

       do j = 1 to 100 ;

          sumprob = sumprob + 1 / 2**m ;
          if r < sumprob then goto jump1 ;
          m = m + 1 ;

       end ;

      jump1:
      sum = sum + m ;
      sumsq = sumsq + m*m ;
      output ;

  end ;

     mean = sum / n ;
     var  = (sumsq - n*mean*mean)/(n - 1) ;
     output ;
 run ;

 proc print ;
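
 Since prob(X <= m) = 1 - 1/2^m, the search over m can also be done in
 closed form: the smallest m with r < 1 - 1/2^m is floor(-log2(1 - r)) + 1.
 The following is a minimal sketch of that variant, not part of the
 original answer; the data set name geom2 is made up for illustration.
 The distribution has mean 2 and variance 2, so the printed values should
 be near 2.

```sas
data geom2 ;

    n = 100 ;
    sum = 0 ;
    sumsq = 0 ;

    do i = 1 to n ;

       r = ranuni(-1) ;

       /* invert the CDF: smallest m with r < 1 - 1/2**m */
       m = floor(-log2(1 - r)) + 1 ;

       sum = sum + m ;
       sumsq = sumsq + m*m ;

    end ;

    mean = sum / n ;
    var  = (sumsq - n*mean*mean)/(n - 1) ;
    output ;

    keep n mean var ;

run ;

proc print data = geom2 ;
run ;
```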


 2.  Let X1, X2, ..., Xn be a sample of observations of the random
     variable X.

     Define the LOWEST TERTILE to be the [n/3] smallest values in the
     sample, where [n/3] is the largest integer less than or equal
     to n/3.

     Define the HIGHEST TERTILE to be the [n/3] largest values.

     Define the "1/3 trimmed mean" to be the mean of the sample after
     the lowest tertile and highest tertile are thrown out.

     Write a SAS macro to compute the 1/3 trimmed mean of a sample.
     The call to the macro should look like the following:

     %trim3 (dataset, n, x, tmean),   where

     dataset = a data set that includes the values for x
     n       = number of observations in the dataset (you can assume
               none are missing)
     x       = the variable of interest
     tmean   = output trimmed mean.


%macro trim3 (dataset, n, x, tmean) ;

 proc sort data = &dataset ; by &x ;

 data xsort ;
      retain xobs 0 ;
      set &dataset ;

      xobs = xobs + 1 ;

 run ;

 /* keep sorted positions [n/3] + 1 through n - [n/3] */
 proc means data = xsort n mean std ;
      where xobs gt int(&n / 3) and xobs le &n - int(&n / 3) ;
      var &x ;
      output out = xmean
             mean = &tmean ;
 run ;

 %mend ;
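
 A usage sketch for the macro call described in the problem. The data set
 name test and its ten values are invented for illustration. With n = 10,
 [n/3] = 3, so by the definition above the three smallest and three
 largest values are dropped, leaving 4, 5, 6, 7, whose mean is 5.5.

```sas
data test ;
     input x @@ ;
     datalines ;
7 100 3 9 1 5 2 8 4 6
;
run ;

%trim3 (test, 10, x, tmean) ;

/* the macro writes its result to the data set xmean */
proc print data = xmean ;
run ;
```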


 3.  A datafile has the following structure:

     OBS       ID       X
    -----     ----     ---
      1         1       16
      2         1       15
      3         1       18
      4         2        4
      5         2        7
      6         2        2
      7         3       X7
      8         3       X8
      9         3       X9
     10         4       X10
     11         4       X11
     12         4       X12
     13         5       X13
     14         5       X14
     15         5       X15


    That is, there are 3 consecutive observations for each ID.

    Write a SAS program which reads in this datafile and writes
    out another datafile which has the following structure:

    OBS        ID       R     S     T
   -----      ----     ---   ---   ---
     1          1       X1    X2    X3
     2          2       X4    X5    X6
     3          3       X7    X8    X9



  data xobs ;
       retain casecount 0 r s ;
       /* an INFILE statement naming the input datafile belongs here */
       input ID x ;
       casecount = casecount + 1 ;
       if casecount = 1 then r = x ;
       if casecount = 2 then s = x ;
       if casecount = 3 then do ;

          t = x ;
          output ;
          casecount = 0 ;

       end ;

       keep ID r s t ;

  run ;

  proc print ;

  endsas ;
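
  The same reshaping can be sketched with BY-group processing, which
  relies on the data being grouped by ID rather than on a fixed count of
  three. This is an illustrative variant, not the original answer; the
  data set names long and wide are made up, and the six rows are the
  first two IDs from the table in the problem.

```sas
data long ;
     input ID x ;
     datalines ;
1 16
1 15
1 18
2 4
2 7
2 2
;
run ;

data wide ;
     set long ;
     by ID ;
     retain r s ;

     if first.ID then count = 0 ;
     count + 1 ;                    /* position within the ID group */

     if count = 1 then r = x ;
     if count = 2 then s = x ;
     if last.ID then do ;
        t = x ;
        output ;                    /* one output row per ID */
     end ;

     keep ID r s t ;
run ;

proc print data = wide ;
run ;
```

  For these rows the output has ID 1 with r, s, t = 16, 15, 18 and ID 2
  with r, s, t = 4, 7, 2.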


 4.  A program produces maximum likelihood estimates  s  and t  of
     two parameters S and T.  It also produces a covariance matrix
     A for s and t:

               |  .02    -.01 |
           A = |              |
               | -.01     .03 |.

     Find an estimated standard error of  r = s^2 + 3 * s * t.


    var(r) = (approx) (dr/ds)^2 * var(s) + 2*(dr/ds) * (dr/dt) * cov(s, t)
              + (dr/dt)^2 * var(t)

            = (2*s + 3*t)^2*(.02) + 2*(2*s + 3*t)*(3*s)*(-.01) + (3*s)^2 * (.03)

            = .23*s^2 + .06*s*t + .18*t^2

    se(r)  = sqrt(var(r)), evaluated at the estimates s and t.
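
    The delta-method arithmetic can be sketched in a short data step.
    The problem does not give numerical values for s and t, so the
    values s = 1 and t = 2 below are purely hypothetical; all variable
    and data set names are also made up for illustration.

```sas
data delta ;

     s = 1 ;                      /* hypothetical estimate, not from the exam */
     t = 2 ;                      /* hypothetical estimate, not from the exam */

     vs  =  .02 ;                 /* var(s)    from matrix A */
     vt  =  .03 ;                 /* var(t)    from matrix A */
     cst = -.01 ;                 /* cov(s, t) from matrix A */

     drds = 2*s + 3*t ;           /* dr/ds */
     drdt = 3*s ;                 /* dr/dt */

     varr = drds**2 * vs + 2*drds*drdt*cst + drdt**2 * vt ;
     ser  = sqrt(varr) ;

run ;

proc print data = delta ;
run ;
```

    With these hypothetical inputs, var(r) = 1.28 - .48 + .27 = 1.07, so
    the estimated standard error is sqrt(1.07), about 1.03.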

 5.  Levels of cortisol in a person's blood tend to vary according to
     the time of day that the blood is drawn.  Here is a graph of
     cortisol levels for individuals, plotted against time of day
     on a 24-hour clock:

 .20 |                             xxx xxx
     |                        xxxxx   x   xxx
     |                       x               xx x
 .15 |                     x                   x xx
     |                    x                       xxx  x
     |x              xxxxx                           xx xxxx
 .10 | x x         xx
     |    x       x
     |     xx  xxx
 .05 |_______xx___________________________________________
     0        4        8       12       16       20       24  time t

     A reasonable model for the expected cortisol level might be:

       E(C(t)) = a + b * cos(c + d*pi*t),

     where time t is in hours.

     a)  Describe what the parameters are in terms of the graph.

     b)  Specify further assumptions which are needed to justify using
         a least-squares procedure to obtain estimates of the
         parameters a, b, c, and d.

     c)  What would good initial guesses be for parameters a, b,
         c, and d ?

     d)  Write a PROC NLIN program which produces least-squares
         estimates of the parameters.


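     a):   a = overall mean level (the midline of the curve)
           b = amplitude (half the peak-to-trough difference)
           c = phase offset (sets the time of day of the peak)
           d = frequency (the period is 2/d hours)

     b):   C(t) = E(C(t)) + error, with independent errors having
           mean 0 and constant variance; e.g., error ~ N(0, sig^2).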

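     c):  a = .13, b = .08, c = -3.8, d = .08 = 1/12 (approx)
          (period of 24 hours, so d*pi*24 = 2*pi; peak near t = 14.5,
           where c + d*pi*t = 0)

     d):
         proc nlin method = marquardt ;
              parms a = .13
                    b = .08
                    c = -3.8
                    d = .08 ;

              /* y = cortisol level, t = time of day in hours */
              pi = 3.14159265 ;

              der.a = 1 ;
              der.b = cos(c + d * pi * t) ;
              der.c = -b * sin(c + d * pi * t) ;
              der.d = - b*pi*t*sin(c + d * pi * t) ;

              f = a + b * cos(c + d * pi * t) ;

              model y = f ;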

    run ;

    endsas ;
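
    One way to arrive at starting values is to read four numbers off the
    graph and convert them. The sketch below assumes a peak level of
    about .20, a trough level of about .05, a peak near t = 14.5, and a
    24-hour period; these are rough visual readings, and the variable
    names are made up for illustration. The phase is chosen so that the
    cosine argument c + d*pi*t equals 0 at the peak (adding 2*pi to c
    gives an equivalent value).

```sas
data guess ;

     pi = constant('pi') ;

     cmax   = .20 ;               /* approximate peak level from the graph   */
     cmin   = .05 ;               /* approximate trough level from the graph */
     tpeak  = 14.5 ;              /* approximate time of the peak, in hours  */
     period = 24 ;                /* assumed daily cycle, in hours           */

     a = (cmax + cmin) / 2 ;      /* midline                                 */
     b = (cmax - cmin) / 2 ;      /* amplitude                               */
     d = 2 / period ;             /* from d * pi * period = 2 * pi           */
     c = - d * pi * tpeak ;       /* cosine argument is 0 at the peak        */

run ;

proc print data = guess ;
run ;
```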



 6.  Short answers:

     1)  What is an eigenvector?

         Given an n x n matrix A, an eigenvector v is a nonzero n x 1 column
         vector such that A * v = a * v for some constant a (the eigenvalue).

     2)  What is an advantage of the simplex method of computing
         a minimum of a function ?

         Usually converges, and does not need expressions for derivatives.

     3)  What is a disadvantage of the simplex method?

         Slow to converge, does not automatically give an estimate of variance.

     4)  Suppose f(x) = 5*x - exp(x).  You can find a solution
         to f(x) = 0 by the use of Newton's method.  The key
         equation is

                 x(n + 1) = x(n)  -   ??? / ???.

                 x(n + 1) = x(n) - f(x(n)) / f'(x(n))
                          = x(n) - (5*x(n) - exp(x(n))) / (5 - exp(x(n))) ;

     5)  If X has a Poisson distribution with parameter h = 1,
         give the probabilities that:

         X = 0 : h^0 * exp(-h) / 0! = 1/e

         X = 1 : h^1 * exp(-h) / 1! = 1/e

         X = 2 : h^2 * exp(-h)/2! = 1 / (2 * e) ;
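
     For item 4), the Newton iteration can be sketched as a data step.
     The starting value x = 0, the tolerance, and the iteration cap are
     assumptions chosen for illustration. From x = 0 the iterates move
     toward the smaller root, near 0.259; f also has a second root, and
     f'(x) = 0 at x = log(5), so a starting value near log(5) should be
     avoided.

```sas
data newton ;

     x = 0 ;                          /* starting value (an assumption) */

     do iter = 1 to 50 ;

        f    = 5*x - exp(x) ;
        fp   = 5 - exp(x) ;           /* derivative f'(x) */
        xnew = x - f / fp ;           /* Newton update    */
        diff = abs(xnew - x) ;
        x    = xnew ;
        output ;                      /* one row per iteration */

        if diff < 1e-10 then leave ;

     end ;

run ;

proc print data = newton ;
     var iter x diff ;
run ;
```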
