BINOMIAL OUTCOMES                                   SPH 7460 notes.005

Many clinical trials are conducted in which the endpoint for each participant is dichotomous (for example, death or kidney transplant failure). Assume you assign NA people to drug A and NB people to drug B. You assume a probability PA of death in the A group, and PB in the B group. At the end of the trial you count how many have died in each group. Let MA and MB be these counts. These should be binomial random variables,

     MA ~ binom(NA, PA)   and   MB ~ binom(NB, PB).

To test the null hypothesis that PA = PB, you can carry out an ordinary 2 x 2 chi-square test.

The formulas for power and sample size for a study of this kind are a bit more complicated than they are for a quantitative outcome trial. The reason is that the standard error of the proportion of failures in each group is a function of the failure rate itself.

The derivation of the power formula for a two-sided test of the hypothesis that two proportions, PA and PB, are equal is as follows:

     power = prob(abs(W) > Za | alternative hypothesis H1),

where Za is such that the probability that an N(0, 1) random variable is less than Za is 1 - alpha/2 (for example, if alpha = .05, Za = 1.96), and

     W = (PAhat - PBhat) / serrH0(PAhat - PBhat)
       = (PAhat - PBhat) / sqrt(Pbar * Qbar * (1/NA + 1/NB)).

Thus

     prob(abs(W) > Za | H1) = prob(W > Za | H1) + prob(W < -Za | H1).

Consider the first of these two probabilities. The inequality inside is equivalent to

     [PAhat - PBhat - (PA - PB)] / serrH1(PAhat - PBhat)
          > [Za * sqrt(Pbar * Qbar * (1/NA + 1/NB)) - (PA - PB)]
            / sqrt[PA*QA / NA + PB*QB / NB]

The left side of this inequality has approximately an N(0, 1) distribution under the alternative hypothesis. Similarly, the second inequality is equivalent to

     [PAhat - PBhat - (PA - PB)] / serrH1(PAhat - PBhat)
          < [-Za * sqrt(Pbar * Qbar * (1/NA + 1/NB)) - (PA - PB)]
            / sqrt[PA*QA / NA + PB*QB / NB]

where again the left side is approximately N(0, 1). So if U and L denote the right-hand sides of the first and second inequalities, and Phi is the N(0, 1) cumulative distribution function, then

     power = [1 - Phi(U)] + Phi(L).

Note that Pbar = average of PA and PB, Qbar = 1 - Pbar, QA = 1 - PA, QB = 1 - PB, and serrH1(PAhat - PBhat) = sqrt[PA*QA / NA + PB*QB / NB]. (When NA and NB differ, the corresponding quantity for Pbar is the sample-size-weighted average, (NA*PA + NB*PB) / (NA + NB).)

PROBLEM 7

1.  Write a program for the power formula for a dichotomous-outcome clinical trial. Assume a two-sided test will be used. Input parameters for the program will include:

       1. The desired significance level
       2. PA and PB
       3. NA and NB

2.  Use your program to compute power for the following configurations:

       NA = NB = 150
       PA = .6, PB = .5
       alpha = .01 and alpha = .05

3.  Graph the power curve assuming NA = NB = 150, alpha = .05, PA = .5, and PB ranges from .3 to .8.

SIMULATION STUDIES

As with trials which have a quantitative outcome, you can do simulations to check whether the power formulas are about right. For this purpose you can use the fact that SAS (or Splus) has a binomial random number generator. You don't need to simulate each participant individually; you just use the binomial random number generator once for each group in each simulated trial.

PROBLEM 8

Carry out a simulation study of the power for a clinical trial with the parameters specified in Problem 7, part 2. Compute the simulated power and a 95% confidence interval for the true power.

SOME MORE QUESTIONS

Some statisticians think the Fisher exact test should be used whenever possible in analyzing 2 x 2 tables, instead of the chi-square statistic. The Fisher exact test tends to be more conservative than the (uncorrected) chi-square test; that is, it is less likely to give a significant result. What implications does that have for power estimates?

[The Fisher exact test is based on the computation of the probability of the observed data within a 2 x 2 table, given the observed margins (row and column totals). The computation involves the hypergeometric distribution. In practice, the row and column totals are usually not fixed, and the hypergeometric distribution does not reflect the distribution of the cell counts. The uncorrected chi-square statistic seems to have better operating characteristics for this situation than does the Fisher exact test.]
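
The power formula derived earlier in these notes, and the simulation check described under SIMULATION STUDIES, can be sketched roughly as follows. This is a minimal illustration in Python, not the SAS or Splus program the notes have in mind. The function names (power_two_proportions, draw_binomial, simulated_power) and the defaults for n_trials and seed are hypothetical. Two assumptions are made here: Pbar is taken as the sample-size-weighted average of PA and PB (which equals the simple average when NA = NB), and each binomial count is drawn by summing Bernoulli draws as a stand-in for a binomial random number generator.

```python
import math
import random
from statistics import NormalDist


def power_two_proportions(alpha, pa, pb, na, nb):
    """Approximate power of the two-sided test of PA = PB (normal approximation)."""
    za = NormalDist().inv_cdf(1 - alpha / 2)
    # Assumption: weighted average; identical to the simple average when na == nb.
    pbar = (na * pa + nb * pb) / (na + nb)
    se_h0 = math.sqrt(pbar * (1 - pbar) * (1 / na + 1 / nb))
    se_h1 = math.sqrt(pa * (1 - pa) / na + pb * (1 - pb) / nb)
    upper = (za * se_h0 - (pa - pb)) / se_h1   # right side of the first inequality
    lower = (-za * se_h0 - (pa - pb)) / se_h1  # right side of the second inequality
    phi = NormalDist().cdf
    return (1 - phi(upper)) + phi(lower)


def draw_binomial(rng, n, p):
    """Stand-in binomial generator: count successes in n Bernoulli(p) draws."""
    return sum(rng.random() < p for _ in range(n))


def simulated_power(alpha, pa, pb, na, nb, n_trials=2000, seed=12345):
    """Simulated power and a 95% confidence interval for the true power."""
    rng = random.Random(seed)
    za = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(n_trials):
        ma = draw_binomial(rng, na, pa)
        mb = draw_binomial(rng, nb, pb)
        pooled = (ma + mb) / (na + nb)
        se = math.sqrt(pooled * (1 - pooled) * (1 / na + 1 / nb))
        if se == 0:
            continue  # everyone or no one failed: no evidence of a difference
        # w squared equals the uncorrected 2 x 2 chi-square statistic.
        w = (ma / na - mb / nb) / se
        if abs(w) > za:
            rejections += 1
    est = rejections / n_trials
    half_width = 1.96 * math.sqrt(est * (1 - est) / n_trials)
    return est, (est - half_width, est + half_width)


if __name__ == "__main__":
    for alpha in (0.01, 0.05):
        formula = power_two_proportions(alpha, 0.6, 0.5, 150, 150)
        sim, ci = simulated_power(alpha, 0.6, 0.5, 150, 150)
        print(alpha, round(formula, 3), round(sim, 3), ci)
```

A power curve like the one in Problem 7, part 3, could be obtained by calling power_two_proportions over a grid of PB values and plotting the results. When PA = PB the formula returns alpha, which is a quick sanity check on any implementation.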
MORE DIFFICULT QUESTION

Recall that back in notes.003, there was a formula for risk of heart disease as a function of diastolic blood pressure (DBP). Suppose you wanted to carry out a clinical trial of a drug D versus placebo. You think the drug will lower blood pressure by an average of 3 mm Hg. You have a population of 1000 men aged 40-59 who have blood pressures between 80 and 110 mm Hg. In fact, the blood pressures among these men are approximately uniformly distributed between 80 and 110 mm Hg. You want to randomize half the men to drug D and half to placebo. You study them for 6 years. At the end of 6 years, you count how many in each group have new heart disease. You perform a chi-square test to compare the groups.

How can you compute the power for this study?

Here are some of the complicating factors:

1.  The men have differing levels of risk at baseline, depending on their blood pressure. The number of men who have an event (heart disease) can be considered to be a binomial random variable in either group, but it is not necessarily very easy to compute what the binomial probability is. You would have to compute the risk for each man using the logistic formula and then average.

2.  The treatment effect of the drug is also not constant. There are two problems. The first is that the 3 mm Hg effect is only an average; it is not the same for all the men in drug group D, and will vary from man to man. It may be reasonable to assume the drug effect has an approximately normal distribution with a mean of 3 mm Hg and a standard deviation of 10 mm Hg. The second is that the risk for a given man is a function of both his starting DBP and his DBP after he starts taking the drug, and the men do not all have the same starting DBP.

How might you use the available information to compute the expected event rate in the drug group?

The bottom line here is that the computation of power for a trial like this using the standard formulas is actually rather crude. Formulas for expected event rates in the two groups may be extremely complicated. You can apply the standard formulas, making some crude guesses about treatment effects; in fact, that is what people usually do. However, you may want to check them by carrying out simulation studies, which can take into account the variability described above and other complicating factors.

~john-c/5421/notes.005                    Last update: July 8, 2000.
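
The averaging over baseline blood pressure and drug effect described under MORE DIFFICULT QUESTION can be sketched as a small Monte Carlo calculation. This is only an illustration under stated assumptions, in Python rather than SAS or Splus. The logistic risk formula from notes.003 is not reproduced in these notes, so placeholder_risk and its coefficients are HYPOTHETICAL stand-ins and should be replaced with the real formula. The sketch also assumes that a treated man's risk depends only on his DBP while on the drug, which is one simple reading of complicating factor 2. The function names and the defaults for n_draws and seed are likewise hypothetical.

```python
import math
import random


def placeholder_risk(dbp):
    """HYPOTHETICAL 6-year risk of heart disease at a given DBP (mm Hg).

    The intercept and slope are made-up placeholders, not the notes.003 values.
    """
    intercept, slope_per_mmhg = -6.0, 0.04
    return 1 / (1 + math.exp(-(intercept + slope_per_mmhg * dbp)))


def expected_event_rates(risk_fn, n_draws=100_000, mean_drop=3.0,
                         sd_drop=10.0, seed=2000):
    """Average risk in the placebo and drug groups by Monte Carlo.

    Baseline DBP is uniform on 80-110 mm Hg; the drug lowers DBP by a
    normally distributed amount that varies from man to man.
    """
    rng = random.Random(seed)
    placebo_total = 0.0
    drug_total = 0.0
    for _ in range(n_draws):
        dbp = rng.uniform(80, 110)            # baseline DBP
        drop = rng.gauss(mean_drop, sd_drop)  # this man's drug effect
        placebo_total += risk_fn(dbp)
        # Assumption: risk on drug is the risk at the lowered DBP.
        drug_total += risk_fn(dbp - drop)
    return placebo_total / n_draws, drug_total / n_draws


if __name__ == "__main__":
    p_placebo, p_drug = expected_event_rates(placeholder_risk)
    print(round(p_placebo, 4), round(p_drug, 4))
```

The two averages play the roles of PA and PB, so they could be used in the standard power formula with 500 men per group. A fuller simulation would go one step further and draw the event counts themselves in each simulated trial.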