SPH 7460 Final Exam              December 16, 2008              page 1 of 6

Five Problems - 6 pages          Name: _________________________________________
=====================================================================================

1. X and Y have a bivariate distribution with mean

            | 1 |
       mu = |   |
            | 3 |

   and covariance matrix

            | 5  2 |
       S  = |      |
            | 2  4 |.

   The transformation T: R^2 --> R^2 is defined by

         | X |   | U |   |  X -  Y |
       T |   | = |   | = |         |
         | Y |   | V |   | 2X + 3Y |.

a) Find explicitly the covariance matrix of U and V.

   Let the matrix A correspond to the transformation T. Then

           | 1 -1 |
       A = |      |.
           | 2  3 |

[5]
       Cov((U, V)`) = Cov(T(X, Y)`) = A * S * A`

                        | 1 -1 |   | 5  2 |   | 1  2 |
                      = |      | * |      | * |      |
                        | 2  3 |   | 2  4 |   |-1  3 |

                        |  3 -2 |   | 1  2 |   | 5  0 |
                      = |       | * |      | = |      |
                        | 16 16 |   |-1  3 |   | 0 80 |

   Note you cannot assume that X and Y are independent; in fact, cov(X, Y) = 2.

b) How would you use SAS PROC IML to compute the covariance matrix of U and V?

[5]
       proc iml ;
          S = {5 2, 2 4} ;           /* covariance matrix of (X, Y)      */
          A = {1 -1, 2 3} ;          /* matrix of the transformation T   */
          ASAT = A * S * A` ;        /* covariance matrix of (U, V)      */
          print S A ASAT ;
       quit ;

c) Find the two eigenvalues and eigenvectors of the covariance matrix S.

          | 5 - e     2   |
       det|               | = (5 - e) * (4 - e) - 4 = e^2 - 9e + 16,
          |   2     4 - e |

   so e = (9 +/- sqrt(81 - 64))/2 = (9 +/- sqrt(17))/2.

   Let e1 = the first eigenvalue = (9 + sqrt(17))/2 and e2 = the second
   eigenvalue = (9 - sqrt(17))/2. An eigenvector (X, Y)` for e1 satisfies

[5]
           | X |   | 5  2 |   | X |        | X |   | e1*X |
       S * |   | = |      | * |   | = e1 * |   | = |      |
           | Y |   | 2  4 |   | Y |        | Y |   | e1*Y |

   which implies 5*X + 2*Y = e1*X and 2*X + 4*Y = e1*Y. It is sufficient to find
   a solution to (5 - e1)*X + 2*Y = 0, since the two equations are linearly
   dependent. Let X = 1; then Y = (e1 - 5)/2. Thus

       |     1      |
       |            |   is a first eigenvector.
       | (e1 - 5)/2 |

   A second eigenvector is found similarly:

       |     1      |
       |            |.
       | (e2 - 5)/2 |

d) Explain how you find the eigenvalues and eigenvectors using PROC IML.

[5]
       proc iml ;
          S = {5 2, 2 4} ;
          call eigen(D, P, S) ;
          /* D = vector of eigenvalues                      */
          /* P = matrix whose columns are the eigenvectors  */
          print S D P ;
       quit ;

2. A physician prescribes a medication which is known to have the side effect of
   causing ringing in the ears. Patients take one pill per day of the drug. The
   probability that a patient will have ringing in the ears after taking the
   medication is 0.03. The physician wants to know how many times, on average, a
   patient might take the drug until he/she has 3 episodes of ringing in the ears.

   Write a simulation program which will provide an estimate of this average and
   will also estimate the standard deviation of the number of times the patient
   takes the drug until 3 episodes of ringing in the ears have occurred.

[20]
       data threetimes ;
          p = .03 ;
          nsim = 1000 ;
          sum = 0 ;
          sumsq = 0 ;
          do i = 1 to nsim ;
             count = 0 ;
             j = 0 ;
             do while (count lt 3) ;
                j = j + 1 ;       * Here you are counting the number of trials ... ;
                r = ranuni(-1) ;
                if r < p then count = count + 1 ;
             end ;
             sum = sum + j ;
             sumsq = sumsq + j * j ;
             output ;             * one observation per simulated patient ;
          end ;
          mean = sum / nsim ;
          var = (sumsq - sum * sum / nsim) / (nsim - 1) ;
          sdev = sqrt(var) ;
          put mean= sdev= ;       * write the hand-computed estimates to the log ;
       run ;

       proc means data = threetimes n mean stddev min max ;
          var j ;
       run ;

3. An investigator plans to take a random sample of people living in nursing
   homes. He will measure the height of each person. He expects that 30% of the
   sample will be men and 70% will be women. He expects that the men will have an
   average height of 70 inches (std dev = 5), and the women will average 65 inches
   (std dev = 4). You can assume that the heights within each gender are normally
   distributed.

a) If he samples 500 people, what will the mean height be and what will the
   standard deviation be?

----------------------------------------------------------------------------------------------

       E(height) = .3 * E(men's heights) + .7 * E(women's heights)
                 = .3 * 70 + .7 * 65 = 66.5

[10]   Var(height):

       E(height^2) = .3 * E(men^2) + .7 * E(women^2)

       E(men^2)   = Var(men)   + (E(men))^2   = 25 + 70^2 = 4925
       E(women^2) = Var(women) + (E(women))^2 = 16 + 65^2 = 4241

       Therefore E(height^2) = .3 * 4925 + .7 * 4241 = 4446.2

       Var(height) = E(height^2) - (E(height))^2 = 4446.2 - 66.5^2 = 23.95

       StdDev(height) = sqrt(23.95) = 4.89.

----------------------------------------------------------------------------------------------

b) Write a simulation program which will produce a table like the following:

       Height (inches)     Number of People
       ---------------     ----------------
            < 60                  21
           61-65                 126
           66-70                 238
           71-75                 104
           76-80                  10
            > 80                   1

   (Note that your program should round off the height to the nearest inch)

----------------------------------------------------------------------------------------------

[10]
       data heights ;
          probman = .3 ;
          meanman = 70 ;
          stddevman = 5 ;
          meanwoman = 65 ;
          stddevwoman = 4 ;
          heightlt60 = 0 ; height6165 = 0 ; height6670 = 0 ;
          height7175 = 0 ; height7680 = 0 ; heightgt80 = 0 ;
          do i = 1 to 500 ;
             rgender = ranuni(-1) ;
             * round() rounds the simulated height to the nearest inch ;
             if rgender < probman then
                height = round(meanman + stddevman * rannor(-1)) ;
             else
                height = round(meanwoman + stddevwoman * rannor(-1)) ;
             if height le 60 then heightlt60 = heightlt60 + 1 ;
             if height ge 61 and height le 65 then height6165 = height6165 + 1 ;
             if height ge 66 and height le 70 then height6670 = height6670 + 1 ;
             if height ge 71 and height le 75 then height7175 = height7175 + 1 ;
             if height ge 76 and height le 80 then height7680 = height7680 + 1 ;
             if height gt 80 then heightgt80 = heightgt80 + 1 ;
          end ;
          file 'heights.out' ;
          put 'No. of Heights le 60 inches: ' heightlt60 ;
          put 'No. of Heights 61-65 inches: ' height6165 ;
          put 'No. of Heights 66-70 inches: ' height6670 ;
          put 'No. of Heights 71-75 inches: ' height7175 ;
          put 'No. of Heights 76-80 inches: ' height7680 ;
          put 'No. of Heights gt 80 inches: ' heightgt80 ;
       run ;

4. Alice and Bob are having an argument. Bob says that if X ~ Binomial(.3, 100)
   and Y ~ Binomial(.6, 100), and X and Y are independent, then
   X + Y ~ Binomial(.45, 200). Alice says this is wrong. She has a proof but Bob
   doesn't understand it. Bob says, "I am not going to be convinced until you
   write a simulation program that shows I am wrong." Write a program which will
   indicate that Alice is right.

--------------------------------------------------------------------------------------------

   Note that the mean of X + Y is 30 + 60 = 90 = 0.45 * 200, which agrees with
   the binomial mean. The key here is that the variance of the sum X + Y will not
   agree with the variance of the Binomial(.45, 200): by independence,
   Var(X + Y) = 100 * .3 * .7 + 100 * .6 * .4 = 45, whereas the variance of
   Binomial(.45, 200) is 200 * .45 * .55 = 49.5. So Alice needs to write a
   program that estimates the variance of X + Y and she needs to show it is not
   the same as the variance of Binomial(.45, 200).

[20]
       data xplusy ;
          n = 100 ;
          nsim = 1000 ;
          sum = 0 ;
          sumsq = 0 ;
          do i = 1 to nsim ;
             x = ranbin(-1, n, .3) ;      * arguments are ranbin(seed, n, p) ;
             y = ranbin(-1, n, .6) ;
             xpy = x + y ;
             sum = sum + xpy ;
             sumsq = sumsq + xpy * xpy ;
          end ;
          varxpy = (sumsq - sum * sum / nsim) / (nsim - 1) ;
          varbinomial = 200 * .45 * .55 ;
          output ;
       run ;

       proc print data = xplusy ;
          var varxpy varbinomial ;
       run ;

5. Assume a data file with the following structure:

       Obs     X     Y
       ---    ---   ---
        1     12     3
        2     16     2
        .      .     .
        .      .     .
        .      .     .
        N     29     7

   Write a PROC IML program which computes regression coefficients for the model
   E(Y) = b0 + b1 * X + b2 * X^2.
   Include a computation of R-square.

[20]
       data xy ;
          infile xydata ;
          input obs x y ;
          x2 = x * x ;
          one = 1 ;
          if x eq . or y eq . then delete ;
       run ;

       proc iml ;
          use xy ;
          read all var {one x x2} into x ;
          read all var {y} into y ;
          n = nrow(x) ;
          beta = inv(x` * x) * x` * y ;
          cf = (sum(y) ## 2) / n ;      /* correction for the mean: (sum of y)^2 / n */
          ssreg = beta` * x` * y - cf ;
          sstot = y` * y - cf ;
          rsquare = ssreg / sstot ;
          print beta ssreg sstot rsquare ;
       quit ;
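
   As a possible cross-check of the PROC IML answer, the same quadratic model
   could be fitted with PROC REG. This is only a sketch: it assumes the data set
   xy built in the data step above (including the computed variable x2) is still
   available in the session.

```sas
proc reg data = xy ;
   model y = x x2 ;    /* the intercept b0 is included by default */
run ;
quit ;
```

   The parameter estimates and the R-Square value in the PROC REG output can be
   compared with the beta and rsquare values printed by PROC IML.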