December 22, 2004 page 1 of 8
SPH 5421 Final Examination Name: ________________________________________
=================================================================================
1. A datafile includes data on a random sample of 250 people, and three
variables are recorded for each person:
Variable 1 is ID, an identifying number
Variable 2 is gender: Gender = 'M' or 'F' (male and female)
Variable 3 is serum cholesterol
There are 100 males and 150 females in the sample represented on the file.
Let S be the set of all possible pairs of males and females; that is,
S = { (IDi, IDj) }, where IDi is the ID for a male and
IDj is the ID for a female.
(a) How many possible pairs are there in the set S ?
100 x 150 = 15000
[2]
Let M1 = the number of pairs in S for which the male person in the pair has
a lower serum cholesterol than the female person. Let M2 = the number of
pairs in S for which the female person has a lower serum cholesterol than
the male person.
(b) If there is no difference between males and females in the distribution
of serum cholesterol in the population from which the sample is drawn,
what is the expected value of M1 / (M1 + M2) ?
[2] 1/2. By symmetry, each pair is equally likely to count toward M1 or toward M2.
(c) Write a program using proc iml that will compute M1 and M2
for a given datafile, and will compute the fraction M1 / (M1 + M2).
[16]
data cholm cholf ;
   infile 'cholmf.data' ;
   input id gender $ chol ;   /* the file holds ID, gender (character), cholesterol */
   if gender eq 'M' then output cholm ;
   if gender eq 'F' then output cholf ;
run ;
proc iml ;
   use cholm ;
   read all var {chol} into males ;
   use cholf ;
   read all var {chol} into females ;
   rm = nrow(males) ;     /* number of males */
   rf = nrow(females) ;   /* number of females */
   m1 = 0 ; m2 = 0 ;
   do i = 1 to rm ;
      do j = 1 to rf ;
         if males[i] < females[j] then m1 = m1 + 1 ;
         if males[i] > females[j] then m2 = m2 + 1 ;   /* ties count toward neither */
      end ;
   end ;
   estprob = m1 / (m1 + m2) ;
   file 'estprob.out' ;
   put ' m1 = ' m1 ' m2 = ' m2 ' estprob = ' estprob ;
quit ;
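The pair counts can also be obtained without explicit loops. The following is a minimal sketch, not part of the original answer: it uses three made-up male values and two made-up female values typed in directly so that it runs on its own, and the names delta, nm, and nf are illustrative.

```sas
proc iml ;
   /* made-up cholesterol values, for illustration only */
   males   = {180, 200, 220} ;    /* 3 x 1 column vector */
   females = {190, 210} ;         /* 2 x 1 column vector */
   nm = nrow(males) ;
   nf = nrow(females) ;
   /* delta[i, j] = males[i] - females[j], one entry per (male, female) pair */
   delta = repeat(males, 1, nf) - repeat(t(females), nm, 1) ;
   m1 = sum(delta < 0) ;   /* pairs where the male value is lower */
   m2 = sum(delta > 0) ;   /* pairs where the female value is lower */
   estprob = m1 / (m1 + m2) ;
   print m1 m2 estprob ;
quit ;
```

Counting by hand, three of these six pairs have the lower male value and three have the lower female value, so the fraction is 0.5.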
2. Check digits are often used for numerical IDs. Assume that the main part
of the ID is a 4-digit number, like N = 7629. To compute the check digit:
Multiply the rightmost digit by: 2,
Multiply the next digit to the left by: 1,
Multiply the next digit to the left by: 2,
Multiply the next digit to the left by: 1, etc.
For 7629, the process is:
   Digits of N:    7       6       2       9
   Multipliers:    7 x 1   6 x 2   2 x 1   9 x 2
   Products:       7       12      2       18
Add the resulting digits: 7 + 1 + 2 + 2 + 1 + 8 = 21
Subtract this from the next higher multiple of 10: 30 - 21 = 9
The check digit is 9.
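(a) Compute the check digit for N = 8536.
[2]
   Digits of N:    8       5       3       6
   Multipliers:    8 x 1   5 x 2   3 x 1   6 x 2
   Products:       8       10      3       12
Add the resulting digits: 8 + 1 + 0 + 3 + 1 + 2 = 15
Subtract this from the next higher multiple of 10: 20 - 15 = 5
Therefore the check digit is 5.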
(b) Write a macro which computes the check digit for any 3-digit number, N.
The call to the macro should look like:
%check(n, checkdig) ;
where n is the input and checkdig is the output.
[18]
%macro check(n, checkdig) ;
   /* Generates DATA step statements, so call it inside a DATA step. */
   d3 = int(&n / 100) ;                      /* hundreds digit */
   n2 = &n - 100 * d3 ;
   d2 = int(n2 / 10) ;                       /* tens digit */
   d1 = n2 - 10 * d2 ;                       /* ones digit */
   d12 = d1 * 2 ;                            /* ones digit x 2 */
   d122 = int(d12 / 10) ;                    /* tens digit of d12 */
   d121 = d12 - 10 * d122 ;                  /* ones digit of d12 */
   d21 = d2 ;                                /* tens digit x 1 */
   d211 = d21 ;                              /* same */
   d32 = d3 * 2 ;                            /* hundreds digit x 2 */
   d322 = int(d32 / 10) ;                    /* tens digit of d32 */
   d321 = d32 - 10 * d322 ;                  /* ones digit of d32 */
   sum = d121 + d122 + d211 + d321 + d322 ;  /* sum of digits */
   summod10 = sum - 10 * int(sum/10) ;       /* sum of digits mod 10 */
   &checkdig = mod(10 - summod10, 10) ;      /* check digit; 0 if the sum is a multiple of 10 */
%mend check ;
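The same rule extends to an ID of any length by peeling digits off from the right. The DATA step below is a hedged sketch of that idea rather than the macro the question asks for; the data set name checkdemo and the variables rest, mult, prod, and digsum are illustrative, and the two input values are the IDs worked by hand in this problem.

```sas
data checkdemo ;
   input n ;
   rest = n ;      /* digits not yet processed */
   mult = 2 ;      /* the rightmost digit is multiplied by 2 */
   digsum = 0 ;    /* running sum of the digits of the products */
   do while (rest > 0) ;
      prod = mod(rest, 10) * mult ;                        /* current digit x multiplier */
      digsum = digsum + int(prod / 10) + mod(prod, 10) ;   /* add both digits of the product */
      rest = int(rest / 10) ;                              /* drop the processed digit */
      mult = 3 - mult ;                                    /* alternate 2, 1, 2, 1, ... */
   end ;
   checkdig = mod(10 - mod(digsum, 10), 10) ;
   keep n checkdig ;
   datalines ;
7629
8536
;
run ;
proc print data = checkdemo ;
run ;
```

Following the hand calculations, the digit sums are 21 and 15, so the listing should show check digits 9 and 5.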
3. A linear transformation T from R^2 to R^2 has eigenvalues a1 = 3 and
a2 = 2, and the corresponding eigenvectors are
        | 1 |              | 1 |
   X1 = |   |   and   X2 = |   | .
        | 1 |              |-1 |
(a) Find the matrix of the linear transformation T.
[12]
   | A  B |   | 1 |   | 3 |           | A  B |   | 1 |   |  2 |
   |      | * |   | = |   |    and    |      | * |   | = |    |
   | C  D |   | 1 |   | 3 |           | C  D |   |-1 |   | -2 |
so that
   A + B = 3      A - B =  2
   C + D = 3      C - D = -2 .
Thus A = 2.5, B = 0.5, C = 0.5, D = 2.5, and the matrix is
   | 2.5  0.5 |
   |          |
   | 0.5  2.5 | .
(b) Let S be the unit square - that is, S has vertices (0, 0), (1, 0),
(1, 1), and (0, 1). Draw a picture of T(S), specifying all of its
vertices. What is the area of T(S) ?
[5]
T maps the vertices of S as follows:
   (0, 0) ---> (0, 0)
   (1, 0) ---> (2.5, 0.5)
   (1, 1) ---> (3.0, 3.0)
   (0, 1) ---> (0.5, 2.5)
[Picture: T(S) is the parallelogram with vertices (0, 0), (2.5, 0.5),
(3, 3), and (0.5, 2.5).]
Area: |det(T)| = 2.5 x 2.5 - 0.5 x 0.5 = 6.25 - 0.25 = 6.
(c) What are the eigenvalues of the inverse of T ?
1/3 and 1/2.
[3]
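The three answers can be cross-checked numerically. The PROC IML sketch below is not part of the original solution; it rebuilds the matrix from its eigenvectors and eigenvalues as P * D * inv(P), and the names P, D, Tmat, S, TS, area, and inveig are illustrative.

```sas
proc iml ;
   P = {1  1,
        1 -1} ;                  /* columns are the eigenvectors X1 and X2 */
   D = diag({3 2}) ;             /* eigenvalues a1 = 3 and a2 = 2 on the diagonal */
   Tmat = P * D * inv(P) ;       /* matrix of T rebuilt from its eigen-decomposition */
   S = {0 1 1 0,
        0 0 1 1} ;               /* columns are the vertices of the unit square */
   TS = Tmat * S ;               /* columns are the vertices of T(S) */
   area = abs(det(Tmat)) ;       /* a linear map scales area by |det| */
   inveig = eigval(inv(Tmat)) ;  /* eigenvalues of the inverse of T */
   print Tmat, TS, area, inveig ;
quit ;
```

The printed matrix should agree with part (a), the columns of TS with the vertices in part (b), the area with 6 (which is also the product of the eigenvalues, 3 x 2), and the eigenvalues of the inverse with 1/2 and 1/3.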
4. An experiment is conducted in which water is allowed to evaporate from a
number of 1-cup containers where the containers are exposed to different
temperatures. Each cup contains 8 ounces of water at the beginning. After
being exposed to a temperature Ti for Mi minutes, the amount of water
Wi remaining in the cup is weighed.
Assume that the amount of water that has evaporated is the following
function of temperature Ti and minutes exposed Mi:
Ei = a * Mi * (Ti - b) + e,
where e is a normally distributed error term, e ~ N(0, v), and where v is
an unknown constant variance. The constants a and b are also unknown and
must be estimated from the data.
Given a data file which includes Ti, Mi, and Wi, write a PROC NLIN program
which will estimate a and b. Explain how this program will also
estimate v.
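[20]
data water ;
   infile 'water.data' ;   /* hypothetical file name; the exam does not give one */
   input ti mi wi ;
   ei = 8 - wi ;           /* amount evaporated = initial 8 ounces - amount remaining */
run ;
proc nlin method = marquardt data = water ;
   parms a = 1
         b = 1 ;                 /* starting values */
   expd = a * mi * (ti - b) ;    /* expected evaporation; the error term e is not coded */
   der.a = mi * (ti - b) ;       /* derivative of expd with respect to a */
   der.b = -a * mi ;             /* derivative of expd with respect to b */
   model ei = expd ;
run ;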
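The program prints the residual (error) sum of squares. The estimate of v
is the residual mean square, which is the residual sum of squares divided
by n - p, where n is the number of observations and p = 2 is the number of
parameters. The printout includes this mean square also.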
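Because a * Mi * (Ti - b) = a * (Mi * Ti) - (a * b) * Mi, the model is linear in the coefficients c1 = a and c2 = -a * b, so a least-squares fit without an intercept gives a cross-check on the nonlinear fit. The sketch below is not part of the original answer and uses simulated data so that it runs on its own; the true values a = 0.0005 and b = 40, the error standard deviation 0.05, the seed, the data set name watersim, and the variable mt are all assumptions made for illustration.

```sas
data watersim ;
   a = 0.0005 ; b = 40 ;               /* assumed true values, for illustration only */
   do i = 1 to 50 ;
      ti = 60 + 40 * ranuni(5421) ;    /* temperature between 60 and 100 */
      mi = 10 + 50 * ranuni(5421) ;    /* minutes between 10 and 60 */
      ei = a * mi * (ti - b) + 0.05 * rannor(5421) ;   /* evaporated amount plus N(0, 0.0025) error */
      mt = mi * ti ;                   /* regressor whose coefficient is c1 = a */
      output ;
   end ;
   keep ti mi ei mt ;
run ;
proc reg data = watersim ;
   model ei = mt mi / noint ;   /* ei = c1 * mt + c2 * mi, with c1 = a and c2 = -a * b */
run ;
quit ;
```

The coefficient of mt estimates a, and minus the coefficient of mi divided by the coefficient of mt estimates b. The mean square error in the analysis of variance table has n - 2 degrees of freedom and estimates v, which is the same quantity PROC NLIN reports as its residual mean square.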
5. Assume X and Y are dichotomous random variables, both of which take on only
values of 0 or 1, and that each has a Bernoulli distribution,
X ~ Ber(.5)
Y ~ Ber(.8),
and X and Y are correlated: corr(X, Y) = .3.
{Recall that corr(X, Y) = cov(X, Y) / [sqrt(var(X)*var(Y))].}
(a) What are the expected cell counts and margins in the following
table, where 100 observations (X, Y) are made ?
             X = 0     X = 1
          ---------------------
   Y = 0  |  E(a)   |  E(b)   |  E(n1)
          ---------------------
   Y = 1  |  E(c)   |  E(d)   |  E(n2)
          ---------------------
             E(m1)     E(m2)  |  100
[8] Since corr(X, Y) = cov(X, Y)/(sqrt(Vx)*sqrt(Vy)), with Vx = .5 * .5 = .25
and Vy = .8 * .2 = .16,
.3 = cov(X, Y) / (.5 * .4), or cov(X, Y) = .3 * .2 = .06.
Noting that cov(X, Y) = E(XY) - E(X)E(Y), we have
.06 = E(XY) - .5 * .8 = E(XY) - .4, or
E(XY) = .46.
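Since XY = 1 only when X = 1 and Y = 1, P(X = 1, Y = 1) = E(XY) = .46, and
the other cells follow from the margins. Hence probabilities:
             X = 0     X = 1
          ---------------------
   Y = 0  |   .16   |   .04   |   .20
          ---------------------
   Y = 1  |   .34   |   .46   |   .80
          ---------------------
              .50       .50   |  1.00
So the cell expectations are 16, 4, 34, and 46, with margins
E(n1) = 20, E(n2) = 80, E(m1) = 50, and E(m2) = 50.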
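The arithmetic in part (a) can be reproduced in a short DATA step. This is a minimal sketch, not part of the original answer; the variable names (px, py, rho, covxy, px1y1, and so on) are illustrative.

```sas
data _null_ ;
   px = .5 ; py = .8 ; rho = .3 ; n = 100 ;
   covxy = rho * sqrt(px * (1 - px) * py * (1 - py)) ;   /* .3 x .5 x .4 */
   px1y1 = covxy + px * py ;            /* P(X = 1, Y = 1) = E(XY) */
   px1y0 = px - px1y1 ;                 /* P(X = 1, Y = 0) */
   px0y1 = py - px1y1 ;                 /* P(X = 0, Y = 1) */
   px0y0 = 1 - px1y1 - px1y0 - px0y1 ;  /* P(X = 0, Y = 0) */
   ea = n * px0y0 ;   /* expected count, cell a */
   eb = n * px1y0 ;   /* expected count, cell b */
   ec = n * px0y1 ;   /* expected count, cell c */
   ed = n * px1y1 ;   /* expected count, cell d */
   put ea= eb= ec= ed= ;   /* written to the SAS log */
run ;
```

Up to floating-point rounding, the log should show the same expected counts as the table: 16, 4, 34, and 46.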
(b) Write a program which simulates 100 observations from (X, Y).
[12]
data xysim ;
   n = 100 ;
   /* cell probabilities: first index is Y, second index is X */
   p00 = .16 ;   /* Y = 0, X = 0 */
   p01 = .04 ;   /* Y = 0, X = 1 */
   p10 = .34 ;   /* Y = 1, X = 0 */
   p11 = .46 ;   /* Y = 1, X = 1 */
   psum00 = p00 ;
   psum01 = p00 + p01 ;
   psum10 = p00 + p01 + p10 ;
   do i = 1 to n ;
      r = ranuni(-1) ;   /* uniform on (0, 1); a seed <= 0 is taken from the clock */
      if r < psum00 then do ; X = 0 ; Y = 0 ; end ;
      else if r < psum01 then do ; X = 1 ; Y = 0 ; end ;
      else if r < psum10 then do ; X = 0 ; Y = 1 ; end ;
      else do ; X = 1 ; Y = 1 ; end ;
      output ;
   end ;
run ;
proc corr data = xysim ;
   var x y ;
   title1 'Correlation of simulated X and Y ...' ;
run ;
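An alternative to splitting one uniform draw across the four cells is to simulate X first and then Y given X, using P(Y = 1 | X = 1) = .46 / .50 = .92 and P(Y = 1 | X = 0) = .34 / .50 = .68 from the table in part (a). The sketch below is not the original answer; the data set name xycond, the variable py1, the fixed seed, and the larger sample size of 10000 (chosen so the sample correlation is stable) are assumptions made for illustration.

```sas
data xycond ;
   do i = 1 to 10000 ;
      x = (ranuni(5421) < .5) ;           /* X ~ Ber(.5) */
      if x = 1 then py1 = .46 / .50 ;     /* P(Y = 1 | X = 1) = .92 */
      else          py1 = .34 / .50 ;     /* P(Y = 1 | X = 0) = .68 */
      y = (ranuni(5421) < py1) ;          /* Y given X */
      output ;
   end ;
   keep x y ;
run ;
proc freq data = xycond ;
   tables y * x ;    /* compare cell proportions with .16, .04, .34, .46 */
run ;
proc corr data = xycond ;
   var x y ;         /* the sample correlation should be near .3 */
run ;
```

Both approaches draw from the same joint distribution; the conditional form is convenient when the joint table is specified through a margin and conditional probabilities.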