HSEM 3010                      notes 003                      January 30, 2007

When to Use Which Statistical Test?  A Minimal Introduction.
-----------------------------------------------------------

A number of statistical tests are used to evaluate the results of clinical
trials.  How do you tell which one you should use?  It depends primarily on
two things: the design of the trial, and the endpoint (also known as the
criterion outcome).

=================================================================================

1.  If the outcome is dichotomous (e.g., yes or no, died or stayed alive) and
there are two randomization groups, the appropriate analysis is the one for a
2 x 2 table: either the Yates-corrected chi-square test or the Fisher Exact
Test.  Of these two, the Fisher Exact Test (two-sided) is preferred, though as
a rule they give very similar results.  See notes.001.  You can use the website
that is referenced in notes.001 to carry out the computations.

The correct number of degrees of freedom for a chi-square test for a 2 x 2
table is:  DF = 1.

2.  If the outcome has 3 or more categories (e.g., got better, stayed the same,
got worse) you can use a chi-square test for an R x C table.  Here R denotes
the number of rows in the table (see the example below, where R = 3), and C
denotes the number of columns.  The degrees of freedom for this test are
DF = (R - 1)*(C - 1).  For the table below this is 2 * 2 = 4.

                 Treatment A   Treatment B   Treatment C
               -------------------------------------------
  Got Better   |      a      |      b      |      c      |   n1
               -------------------------------------------
  Stayed Same  |      d      |      e      |      f      |   n2
               -------------------------------------------
  Got Worse    |      g      |      h      |      i      |   n3
               -------------------------------------------
       N              m1            m2            m3

Here n1, n2, and n3 are the ROW MARGINS, m1, m2, and m3 are the COLUMN MARGINS,
and N is the total number in the table.

To compute the chi-square statistic, you first compute the EXPECTED VALUE for
each cell in the table.
The expected value is the product of the row margin and the column margin,
divided by the total.  For example, for the 'f' cell in the table, the expected
value is:

    F = n2 * m3 / N.

You then compute the difference between the observed value and the expected
value.  For that same cell, this is f - F.  Then you compute (f - F)^2 / F,
that is, the square of the observed value minus the expected value, all divided
by the expected value.

You do the same thing for each cell in the table, obtaining expected values
A, B, C, ..., H, I.  Then you add up all of the quotients of the form
(obs - exp)^2 / exp for all of the cells in the table.  The result is X2, the
chi-square statistic:

    X2 = (a - A)^2/A + (b - B)^2/B + (c - C)^2/C
       + (d - D)^2/D + (e - E)^2/E + (f - F)^2/F
       + (g - G)^2/G + (h - H)^2/H + (i - I)^2/I.

Another way of writing this is:

    X2 = sum over all cells of:  (observed - expected)^2 / expected.

You then compare this to the cutoff values for a chi-square statistic with
(R - 1)*(C - 1) degrees of freedom in the chi-square distribution tables.

There are statistical packages which do this computation.  However, there is no
provision for doing it (except for 2 x 2 tables) in the stats web site that was
referenced in notes.001, nor is it available in the usual version of Excel.
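The steps just described are short enough to sketch in a few lines of code.
The following is a minimal illustration in Python (a choice made here, not a
tool used in these notes); the function name chi_square and the variable names
are hypothetical.  It is applied to the counts of the 2 x 3 example in these
notes.  The p-value line uses the fact that, for DF = 2 only, the upper tail
probability of the chi-square distribution is exp(-X2/2); for other degrees
of freedom you still need tables or a statistical package.

```python
import math


def chi_square(table):
    """Return (X2, DF, expected) for an R x C table of observed counts.

    `table` is a list of rows, each row a list of counts.
    """
    row_margins = [sum(row) for row in table]
    col_margins = [sum(col) for col in zip(*table)]
    total = sum(row_margins)

    # Expected value = row margin * column margin / total.
    expected = [[r * c / total for c in col_margins] for r in row_margins]

    # Sum of (observed - expected)^2 / expected over all cells.
    x2 = sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(table[0]) - 1)
    return x2, df, expected


# Counts from the 2 x 3 example (rows: Success, Failure; columns: Drug A, B, C).
observed = [[18, 33, 27],
            [32, 17, 23]]

x2, df, expected = chi_square(observed)
print("X2 =", round(x2, 4), " DF =", df)

# Valid for DF = 2 only: upper tail probability is exp(-X2 / 2).
p_value = math.exp(-x2 / 2)
print("p  =", round(p_value, 4))
```

The printed values can be checked against the SAS printout in the example.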
Example: 2 x 3 table:
---------------------

               Drug A     Drug B     Drug C
            ----------------------------------
  Success   |    18    |    33    |    27    |
            ----------------------------------
  Failure   |    32    |    17    |    23    |
            ----------------------------------

Check that you get the following (this is printout from the SAS statistical
package):

---------------------------------------------------------------------------------

Table of row by column

row       column

Frequency|
Expected |Drug A  |Drug B  |Drug C  |  Total
---------+--------+--------+--------+
Success  |     18 |     33 |     27 |     78
         |     26 |     26 |     26 |
---------+--------+--------+--------+
Failure  |     32 |     17 |     23 |     72
         |     24 |     24 |     24 |
---------+--------+--------+--------+
Total          50       50       50      150

Statistics for Table of row by column

Statistic                     DF       Value      Prob
------------------------------------------------------
Chi-Square                     2      9.1346    0.0104

---------------------------------------------------------------------------------

Note here that the Chi-Square statistic has value 9.1346 with 2 degrees of
freedom, and the corresponding p-value is .0104.

There is no 'Yates-corrected' version of the chi-square statistic for tables
larger than 2 x 2.  For 2 x 2 tables, you should use either the Fisher Exact
Test (preferably) or the Yates-corrected chi-square test (easier to compute).

=================================================================================

3.  The previous two points dealt with counted, or categorical, data.  This
point deals with quantitative, or measured, data (e.g., height, weight, blood
pressure).

For a design with 2 groups and a quantitative endpoint, the usual test is the
t-test.

Example:  Patients are randomized to drug A or drug B.  The objective is to see
which drug results in lower blood pressure after one month of using the drug.
The summary data are the following:

Drug A:
-------
N                              :  102
Mean Diastolic BP at 6 months  :  88.12
Standard deviation             :  12.44
Standard error of the mean     :  1.23

Drug B:
-------
N                              :  96
Mean Diastolic BP at 6 months  :  85.09
Standard deviation             :  10.21
Standard error of the mean     :  1.04

T-statistic (see notes.002):

           mean1 - mean2                  88.12 - 85.09
    t = -----------------------  =  ---------------------------
         sqrt(SEM1^2 + SEM2^2)      sqrt(1.23*1.23 + 1.04*1.04)

    t = 3.03 / sqrt(2.595) = 3.03 / 1.611 = 1.88

The degrees of freedom are 102 + 96 - 2 = 196.  A t-test that uses the pooled
standard deviation of the two groups gives the slightly smaller value t = 1.87,
for which the two-sided p-value is:  p = 0.063

This computation can be done using the website,

    http://www.graphpad.com/quickcalcs/ttest2.cfm

=================================================================================

4.  Clinical trials involving long-term followup of a cohort of people very
often have as their primary outcome the time to an event.  If, for example, the
objective is to show that people in Drug Group A live longer than those in Drug
Group B, the outcome variable would be the person's survival time.

In many cases, the study ends before you know how long many of the people in
the trial are going to survive.  So the time to the event (death) for these
people is not observed.  The time to the event is said to be CENSORED for those
people who did not die during the operational period of the trial.

There are statistical methods for comparing the survival patterns between the
randomization groups in a clinical trial.  The simplest of these is the
Kaplan-Meier method of constructing the survival curve.  This produces a graph
of the survival times.  The Kaplan-Meier curve is usually accompanied by
several test statistics.  These may include the log-rank test, the Mann-Whitney
test, and the "minus 2 log likelihood" test.  All of these require extensive
computation.  In general people use statistical computing packages (like SAS or
SPSSX or Stata) to do the tedious computations.  In fact the computations are
too complicated to describe here.
You will see examples of Kaplan-Meier survival curves and the associated test statistics later in this course.
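
Although the test statistics need a statistical package, the survival curve
itself can be illustrated briefly.  The following is a minimal sketch in Python
(a choice made here, not a tool used in these notes) of the standard
Kaplan-Meier product-limit estimate: at each time where deaths occur, the
running survival probability is multiplied by (1 - deaths / number at risk).
The function name kaplan_meier and the six follow-up times are hypothetical,
made up purely for illustration; they are not data from any trial.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier (product-limit) estimate of the survival curve.

    times  : follow-up time for each person
    events : 1 if the death was observed at that time, 0 if censored
    Returns a list of (time, estimated survival probability) pairs,
    one for each time at which at least one death was observed.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)      # people still being followed
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0          # deaths plus censored at time t
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Hypothetical follow-up times (e.g., in months) for six people.
times = [2, 3, 3, 5, 7, 8]
events = [1, 1, 0, 1, 0, 1]   # 0 = censored

curve = kaplan_meier(times, events)
for t, s in curve:
    print(t, round(s, 4))
```

Censored people count as "at risk" up to the time they leave the study, but
their departure does not lower the curve; that is how the method uses the
partial information they provide.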