
BOOK REVIEW

Bayesian Data Analysis.
Andrew Gelman, John B. Carlin, Hal S. Stern, and Donald B. Rubin,
Chapman and Hall, 1995.

Bayesian and Empirical Bayes Methods of Data Analysis.
Bradley P. Carlin and Thomas A. Louis,
Chapman and Hall, 1996.

Reviewer: M. Elizabeth Halloran, Emory University
Wed Jan 29 12:02:53 CST 1997

I enjoyed the opportunity to review these two books. Both are well-written, informative, and welcome additions to the literature on practical Bayesian data analysis. There has been a growing need for books such as these. Every serious statistician, whether a Bayesian or not, needs to understand Bayesian inference to participate in any fundamental discussion of statistical inference. As advances in computation have made Bayesian methods more accessible and Bayesian models and procedures are often shown to have good frequentist properties, statisticians need to be able to judge for themselves how useful these methods might be for them. Combining information from different sources is becoming increasingly important in statistical analysis. Bayesian and empirical Bayes modeling are good methods for doing this and should be in the toolbox of every statistician. Although other excellent books on Bayesian statistics are available, these two are about the first that present the fundamentals of Bayesian inference together with modern computational methods for the purpose of practical data analysis. Thus, they are important and timely additions to applied statistics.

The two books cover much important common ground, including a brief history, priors, likelihoods, posterior and predictive inference, hierarchical models, the relation to frequentist and likelihood inference, model checking, Bayesian computation, and specific models. Once past the basics of Bayesian statistics, though, each book has a distinct character in its philosophical underpinnings, style of writing, and choice of material. Carlin and Louis cover empirical Bayes methods, decision theory, and frequentist evaluation of Bayesian procedures. In contrast, Gelman, Carlin, Stern, and Rubin present more detail on simple Bayesian model building and inference, the role of study design in Bayesian inference, and Bayesian rather than empirical Bayes hierarchical modeling.

The guiding principle of Gelman et al is that Bayesian data analysis is composed of three steps. The first step is setting up the full probability model, the second step is posterior inference based on the observed data, and the third step is model checking. The book is, in general, built upon these steps, with emphasis on the first two. The 18 chapters are grouped into four main parts, followed by two appendices. Part I covers Fundamentals of Bayesian Inference. Chapter 1 has useful background on Bayesian inference, some results from probability theory, and how to summarize inferences by simulation. Chapters 2 and 3 cover single-parameter models and an introduction to multiparameter models, respectively. Chapter 4 demonstrates large-sample inference and connections to standard statistical methods. Part II is called Fundamentals of Bayesian Data Analysis with chapters on (5) hierarchical models, (6) model checking and sensitivity analysis, (7) study design in Bayesian analysis, and (8) an introduction to regression models. Part III is on Advanced Computation with three chapters covering (9) approximations based on posterior modes, (10) posterior simulation and integration, and (11) Markov chain simulation. Part IV has chapters on specific models including (12) models for robust inference and sensitivity analysis, (13) hierarchical linear models, (14) generalized linear models, (15) multivariate models, (16) mixture models, and (17) models for missing data. The book finishes in Chapter 18 with some concluding advice on how to do Bayesian data analysis. Appendix A has the usual catalogue of probability distributions needed in a Bayesian book, and Appendix B contains outlines of the asymptotic theorems.

The stated guiding philosophy of Carlin and Louis is the evaluation of procedures with mean squared error and the bias-variance trade-off. Specifically, the idea is that Bayesian procedures can have good frequentist properties. Carlin and Louis has eight chapters and three appendices. Chapter 1 covers procedures and their properties, including the general decision problem and procedure evaluation, mean squared error, and the bias-variance trade-off. Chapters 2 and 3 cover the Bayes approach and the empirical Bayes approach, respectively. Chapter 4 is about performance of Bayes procedures from the Bayesian, frequentist, and empirical Bayes points of view. Chapter 5 is an excellent survey of Bayesian computation. Chapter 6 covers model criticism and selection. Chapter 7 presents special methods and models including ensemble estimates, nonlinear, longitudinal, time series, survival analysis and spatial and spatio-temporal models. Chapter 8 contains three advanced case studies from recent papers of the authors. Appendix A contains the catalogue of distributions, Appendix B has a guide to software currently available for Bayesian analysis, and Appendix C has answers to selected exercises.

Gelman et al's book is larger than that of Carlin and Louis. Gelman et al has 526 pages, while Carlin and Louis has 399 pages. Having measured the margins of print on a page, we calculated that Gelman et al has 9.701 m² of printed material before the appendices, while Carlin and Louis has just 5.543 m². Since they do not cover empirical Bayes methods, Gelman et al have more room to discuss Bayesian data analysis in greater detail and to use more examples. The writing is excellent in Carlin and Louis, however, and the authors are able to present an amazing amount of material cogently in the smaller book.

For the novice in Bayesian inference, Chapters 1 through 3 of Gelman et al are pedagogically invaluable and reflect the emphasis of the book as a textbook. Chapters 1 and 2 are a detailed walk through model construction, posterior inference, and prediction using simple examples with data for univariate models including the binomial, normal, and Poisson models with conjugate and nonconjugate informative priors and noninformative priors. Chapter 3 demonstrates model construction for simple multivariate models, including the multivariate normal. This provides the reader ample opportunity to learn how to work with Bayesian inference in statistical analysis before jumping into advanced computational methods.

Chapters 1 through 3 of Gelman et al cover at a leisurely pace the material covered in just part of Chapter 2 of Carlin and Louis. Carlin and Louis sparingly present the normal-normal and Poisson-gamma models after introducing the basics of Bayesian inference. Carlin and Louis take the rather quirky approach of introducing elicited priors, including multivariate hierarchical priors motivated by a recent research paper, before introducing conjugate priors and noninformative priors. Recent computational developments and the use of the Dirichlet prior in nonparametric Bayesian models are also presented, whereas nonparametric Bayesian models are not discussed in Gelman et al.

I like that Gelman et al start out differently than many books on Bayesian statistics by explaining what Bayesian inference is before trying to say why it might be better than likelihood or frequentist inference. The comparison with likelihood and frequentist inference does not come until Chapter 4. Authors of future Bayesian books might follow their example. Carlin and Louis claim to take a pragmatic rather than dogmatic stance on the use of Bayesian methods, but allow themselves in the first chapter to present some of the standard examples that are used to show the breakdown of the sampling inference framework. The examples will fall flat, however, for the reader who is unacquainted with Bayesian inference.

The presentation of hierarchical models differs radically between the two books. Gelman et al present a purely Bayesian approach, while most discussion of hierarchical models in Carlin and Louis is in the empirical Bayes context. In the empirical Bayes approach, the parameters of the highest level prior are estimated from the data. Gelman et al view the empirical Bayes approach as an approximation to the complete hierarchical Bayesian analysis. They prefer to avoid the term because it suggests that the Bayesian approach is not empirical. Their chapter on Bayesian hierarchical models is useful both pedagogically and as a reference. The worked examples are templates for analysis of other data sets that a reader might have. Gelman et al discuss in detail the meaning of exchangeability (de Finetti, 1974), an important concept in Bayesian inference and in the use of hierarchical models. Exchangeability is not discussed by Carlin and Louis.

The chapter on empirical Bayes hierarchical methods in Carlin and Louis has a clipped pace, yet covers the material remarkably well, including means and variance problems with worked examples. The simple nonparametric empirical Bayes method of Robbins, the nonparametric likelihood empirical Bayes method, and the parametric empirical Bayes approach of Morris are presented, then compared in a simple example. Bayesians generally object to empirical Bayes methods because one uses the data twice and interval estimates tend to be liberal. Louis has made several contributions on how to adjust for this, however, and this is covered in the sections on constrained empirical Bayes. For anyone wanting to do hierarchical modeling in the empirical Bayes framework, this chapter is a must.
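A minimal sketch, not taken from either book, may help make the empirical Bayes idea concrete. It assumes the simplest normal-normal setting: each observation y_i is normal with mean theta_i and a known variance sigma2, and the theta_i share a normal prior whose mean and variance are estimated from the data by simple moment matching. The function name and the moment estimator are assumptions chosen for illustration; this is not the Morris estimator and it includes none of the interval corrections discussed above.

```python
import statistics

def empirical_bayes_means(y, sigma2):
    """Shrink each y_i toward the grand mean under a normal-normal model.

    Assumes y_i ~ N(theta_i, sigma2) with sigma2 known and
    theta_i ~ N(mu, tau2). Marginally y_i ~ N(mu, sigma2 + tau2), so
    mu and tau2 are estimated from the data by matching moments.
    """
    mu_hat = statistics.fmean(y)
    # Marginal variance estimates sigma2 + tau2; truncate tau2 at zero.
    tau2_hat = max(0.0, statistics.variance(y) - sigma2)
    shrink = sigma2 / (sigma2 + tau2_hat)  # weight placed on the grand mean
    estimates = [shrink * mu_hat + (1.0 - shrink) * yi for yi in y]
    return estimates, shrink

estimates, shrink = empirical_bayes_means([1.0, 2.0, 3.0, 4.0, 5.0], sigma2=1.0)
print(shrink)
print(estimates)
```

The data enter twice here, once to estimate the prior and once to form the posterior means, which is the feature of empirical Bayes that the objection above refers to.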

Once the books move toward advanced computation, I generally prefer Chapter 5 on Bayesian computation in Carlin and Louis to Part III on Advanced Computation in Gelman et al. Both books cover the standard topics of the normal distribution for symmetric approximations, Laplace's method to allow for asymmetric posterior distributions, the EM algorithm for finding the posterior mode, rejection and importance sampling, and Markov chain Monte Carlo (MCMC) methods. Although certain sections of the computational part are well-written, Part III in Gelman et al lacks direction. The emphasis on mode-finding methods is distracting. The authors even suggest in the Introduction that a course that was short on time might cover the entire Part III in one lecture.

In contrast, Chapter 5 in Carlin and Louis has clear direction from front to back (with the EM algorithm relegated to the chapter on empirical Bayes methods). Carlin and Louis nicely delineate those methods where the results depend on the number of data points compared to those where the results depend on the number of simulations. The one drawback to the book is the underlying attitude that any realistic problem is going to be too high-dimensional and too complicated for anything but advanced MCMC methods to solve it. This may be true, but there is considerable pedagogical value in understanding approaches for simulating posteriors in low-dimensional problems before learning methods like the Metropolis-Hastings algorithm. Methods for simulating posteriors in low-dimensional problems are presented in the early chapters of Gelman et al. Gelman et al also put the normal approximation method up front in Chapter 4 in the discussion of the relation of Bayesian to likelihood and sampling-based inference.

Rejection, importance sampling, and sampling-importance resampling (weighted bootstrap) are covered about equally well in both books. Carlin and Louis present Laplace's method including the derivation, the application, and advantages and limitations of the method. Laplace's method improves on the normal approximation by allowing asymmetries. It also reduces computation considerably while still allowing second-order accuracy. They put the method into perspective, however, by pointing out that the method has essentially been supplanted by MCMC methods since numerical computation becomes prohibitive at high dimensions. Both the detail in presentation and the perspective on the method are lacking in the Gelman et al book.

Especially in the section on MCMC methods in Carlin and Louis, the reader reaps the benefits of being in the hands of a true master (or is it a master of the universe à la Tom Wolfe?). Carlin and Louis present substitution sampling and data augmentation, then Gibbs sampling, then the Metropolis-Hastings algorithm. This ordering is pedagogically more useful than that in Gelman et al, where it is reversed. Gelman et al apparently chose to go from the general to the particular, but the result is confusing. My main criticism of the handling of Bayesian computation in Carlin and Louis is that there are no examples of data augmentation and no simple examples of Gibbs sampling or the Metropolis-Hastings algorithm. The example of Gibbs sampling is a hierarchical model for a longitudinal data set requiring samples from a Wishart distribution (luckily the sampling from which is described in the appendix of Gelman et al). This is not the first Gibbs sampler an uninitiated reader would want to set up, but it is an excellent pedagogical example once a person can do simple Gibbs. The worked example of the Metropolis-Hastings algorithm is also excellent for understanding some of the complexities of the method, but requires a multivariate normal proposal density. It would, however, be better to begin with an example that requires a univariate normal proposal density in the Metropolis algorithm. This allows demonstration of the dependence of the probability of accepting a candidate value and the variability in the sequence of estimates on the magnitude of the variance in the proposal density.
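The kind of introductory example meant here might look like the following sketch, which is not taken from either book. It assumes a standard normal target density and a random-walk Metropolis sampler with a univariate normal proposal; the function names, the three proposal standard deviations, and the seed are arbitrary choices for illustration.

```python
import math
import random

def log_target(x):
    """Log density of a standard normal target, up to an additive constant."""
    return -0.5 * x * x

def metropolis(n_iter, proposal_sd, x0=0.0, seed=0):
    """Random-walk Metropolis with a univariate normal proposal.

    Returns the simulated chain and the fraction of candidates accepted.
    """
    rng = random.Random(seed)
    x = x0
    chain = []
    accepted = 0
    for _ in range(n_iter):
        candidate = x + rng.gauss(0.0, proposal_sd)
        log_ratio = log_target(candidate) - log_target(x)
        # Accept with probability min(1, target(candidate) / target(x)).
        if log_ratio >= 0.0 or rng.random() < math.exp(log_ratio):
            x = candidate
            accepted += 1
        chain.append(x)
    return chain, accepted / n_iter

# Acceptance rate as a function of the proposal standard deviation.
for sd in (0.1, 2.5, 25.0):
    chain, rate = metropolis(20000, sd)
    print(sd, round(rate, 3))
```

A very small proposal variance accepts nearly every candidate but moves slowly, while a very large one rejects most candidates and leaves the chain stuck for long stretches. Plotting the three chains makes the trade-off visible.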

Both Gelman et al and Carlin and Louis describe well the diagnostic method for assessing convergence of MCMC chains based on the scale reduction developed by Gelman and Rubin (1992). The scale reduction factor of Gelman and Rubin measures the between-chain differences for independently initiated chains and should be close to 1 if the sampler is close to the target distribution. While acknowledging the popularity of the Gelman and Rubin approach, Carlin and Louis also present a general framework for thinking about convergence diagnostics, other methods that can be used, and a practical suggestion for a diagnostic strategy that uses a variety of diagnostic tools. The suggested strategy is first to run a few parallel chains started fairly far apart relative to the stationary distribution. Visual inspection of the chains on a common graph then might give evidence of nonstationarity of the chains. Each graph can be annotated with both the Gelman and Rubin scale reduction factor and lag 1 autocorrelations from within the chains. The autocorrelations help to interpret the scale reduction factors and are easily calculated. Finally, the diagnostic strategy should include examination of the crosscorrelations among parameters suspected of being nearly confounded.
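The two summaries suggested for annotating each graph are easy to compute. The sketch below, not taken from either book, assumes equal-length chains for a single scalar parameter and uses the simple form of the scale reduction factor, the square root of the pooled variance estimate over the within-chain variance, without the degrees-of-freedom correction of the original paper. The AR(1) chains only stand in for real sampler output; their length, correlation, and starting values are arbitrary.

```python
import math
import random
import statistics

def scale_reduction(chains):
    """Simple scale reduction factor for m equal-length chains of one scalar."""
    n = len(chains[0])
    means = [statistics.fmean(c) for c in chains]
    within = statistics.fmean([statistics.variance(c) for c in chains])  # W
    between = n * statistics.variance(means)                             # B
    pooled = (n - 1) / n * within + between / n
    return math.sqrt(pooled / within)

def lag1_autocorrelation(chain):
    """Lag-1 sample autocorrelation within a single chain."""
    mean = statistics.fmean(chain)
    num = sum((a - mean) * (b - mean) for a, b in zip(chain, chain[1:]))
    den = sum((a - mean) ** 2 for a in chain)
    return num / den

def ar1_chain(n, rho, start, rng):
    """Stand-in for sampler output: AR(1) series with N(0, 1) stationary law."""
    x = start
    out = []
    for _ in range(n):
        x = rho * x + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
        out.append(x)
    return out

rng = random.Random(42)
# Three chains started far apart; the first half of each is discarded.
chains = [ar1_chain(2000, 0.9, s, rng)[1000:] for s in (-10.0, 0.0, 10.0)]
print(round(scale_reduction(chains), 3))
print([round(lag1_autocorrelation(c), 2) for c in chains])
```

High lag-1 autocorrelation means each chain carries less information than its length suggests, so a scale reduction factor near 1 from short, strongly correlated chains deserves less trust.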

The only similarity between the chapters on model checking and sensitivity analysis in the two books is the title. Model selection and sensitivity analysis are crucial steps in Bayesian analysis. Everyone agrees on that. The differences between the chapters on model checking and sensitivity analysis in the two books show, however, that no one agrees on just how to go about it. Gelman et al present three methods of model checking based on comparing (1) the posterior distribution of parameters with substantive knowledge, (2) the predictive distribution of future observations to substantive knowledge, and (3) the posterior predictive distribution of future observations to the data that have actually occurred. They introduce the Bayesian p-value, the probability that replicated data drawn from the posterior predictive distribution could be more extreme than the observed data.
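As a rough illustration of the Bayesian p-value, consider the sketch below. It assumes a hypothetical sequence of 20 binary outcomes modeled as independent Bernoulli trials with a uniform prior on the success probability, and it uses the number of switches between 0 and 1 as the test statistic. The data, the statistic, and the function names are assumptions for illustration and are not quoted from either book.

```python
import random

def count_switches(seq):
    """Test statistic: number of adjacent positions where the outcome changes."""
    return sum(1 for a, b in zip(seq, seq[1:]) if a != b)

def posterior_predictive_pvalue(y, n_rep=10000, seed=1):
    """Estimate Pr(T(y_rep) >= T(y) | y) by simulation.

    Model (an assumption for this sketch): y_i are iid Bernoulli(theta)
    with a Uniform(0, 1) prior, so theta | y ~ Beta(1 + s, 1 + n - s).
    """
    rng = random.Random(seed)
    n, s = len(y), sum(y)
    t_obs = count_switches(y)
    at_least_as_large = 0
    for _ in range(n_rep):
        theta = rng.betavariate(1 + s, 1 + n - s)                      # posterior draw
        y_rep = [1 if rng.random() < theta else 0 for _ in range(n)]  # replicate
        if count_switches(y_rep) >= t_obs:
            at_least_as_large += 1
    return at_least_as_large / n_rep

# Hypothetical data with strong clustering, which an iid model cannot explain.
y = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
print(posterior_predictive_pvalue(y))
```

A value near 0 or near 1 signals that the observed statistic sits in a tail of its posterior predictive distribution. Here nearly every replicate has more switches than the data, which points to dependence that the independence model ignores.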

Carlin and Louis also cover the Bayesian p-value, but they emphasize the controversy around the notion of a Bayesian p-value and present the arguments for and against it. Carlin and Louis build their discussion in Chapter 6 and part of Chapter 2 more around the idea of model selection than model checking and consider Bayes factors, prior partitioning, and predictive model selection. The Bayes factor is the ratio of the marginal likelihood under one model to the marginal likelihood under another model and is used to help choose which of the competing models best describes the data. In Carlin and Louis' world, Bayes factors are "fundamental" for model selection. In Gelman et al's world, Bayes factors are "rarely relevant", since Gelman et al emphasize continuous families of models rather than discrete choices between two possible models. In prior partitioning, the posterior that produces a particular conclusion is fixed, and then it is determined which prior inputs are consistent with the desired result, given the observed data. This is actually a tool for looking at robustness, and may be included here merely as preparation for one of the case studies in Chapter 8. Our only comment is not to make a decision for either book based on its treatment of model checking.
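For readers who have not met a Bayes factor, a small worked case may help. The sketch below, not taken from either book, assumes y successes in n binomial trials and compares a model that fixes the success probability at 0.5 with a model that places a uniform prior on it. Under the uniform prior the marginal likelihood has the closed form 1/(n + 1). The function names are arbitrary.

```python
import math

def marginal_fixed_theta(y, n, theta0=0.5):
    """Marginal likelihood when theta is fixed at theta0 (no free parameter)."""
    return math.comb(n, y) * theta0 ** y * (1.0 - theta0) ** (n - y)

def marginal_uniform_prior(y, n):
    """Marginal likelihood under theta ~ Uniform(0, 1).

    Integrating the binomial likelihood over theta gives
    C(n, y) * Beta(y + 1, n - y + 1) = 1 / (n + 1) for every y.
    """
    return 1.0 / (n + 1)

def bayes_factor(y, n):
    """Bayes factor of the fixed-theta model against the uniform-prior model."""
    return marginal_fixed_theta(y, n) / marginal_uniform_prior(y, n)

print(bayes_factor(5, 10))    # balanced data favor theta = 0.5
print(bayes_factor(10, 10))   # lopsided data favor the uniform-prior model
```

The comparison is discrete by construction, which is the setting in which Carlin and Louis find Bayes factors natural and Gelman et al find them rarely relevant.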

The distributional appendix in Gelman et al has both a table and a discussion of the different distributions, and is preferable to that in Carlin and Louis, where the expressions for the distributions are combined with the prose. Especially useful in the Appendix on distributions in Gelman et al are instructions on how to simulate draws from non-standard distributions such as the multivariate normal, the Wishart (and inverse Wishart), the scaled inverse-chi-square, and the Dirichlet. For the novice in Bayesian computation, these few sentences save a tremendous amount of time and aggravation in trying to find references for these tasks.
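Two of these tasks can be sketched with nothing beyond a gamma generator. The constructions below are the standard ones and are not quoted from the appendix: a scaled inverse-chi-square draw is nu * s2 divided by a chi-square draw with nu degrees of freedom, and a Dirichlet draw is a vector of independent gamma draws divided by their sum. Function names and parameter values are illustrative assumptions.

```python
import random

def scaled_inv_chi2(nu, s2, rng):
    """Draw from the scaled inverse-chi-square with nu degrees of freedom and scale s2."""
    chi2 = rng.gammavariate(nu / 2.0, 2.0)  # chi-square(nu) is Gamma(shape nu/2, scale 2)
    return nu * s2 / chi2

def dirichlet(alphas, rng):
    """Draw from a Dirichlet distribution by normalizing independent gamma draws."""
    gammas = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(gammas)
    return [g / total for g in gammas]

rng = random.Random(7)
variance_draws = [scaled_inv_chi2(10, 2.0, rng) for _ in range(20000)]
print(sum(variance_draws) / len(variance_draws))  # theoretical mean is nu * s2 / (nu - 2)
print(dirichlet([1.0, 2.0, 3.0], rng))
```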

The remainder of the two books contains mostly nonoverlapping material. In Chapter 4, Carlin and Louis present the important topic of how to evaluate the frequentist performance of point estimates and confidence intervals produced by procedures developed using a Bayesian approach. It is a little surprising that Gelman et al has nothing on this topic, since Rubin has been a proponent of this approach for some time. Perhaps it did not fit in the purely Bayesian approach of their book.

Gelman et al devote a chapter to robust inference and sensitivity of inference to outliers, including the use of overdispersed versions of standard probability models and robust regression. Each of the separate chapters on regression, hierarchical linear models, generalized linear models, multivariate models, and mixture models has a good discussion of the Bayesian approach to the specific model with at least one example worked in detail.

Discussion of specific models in Carlin and Louis is condensed into Chapter 7 along with the problem of ensemble estimates. In hierarchical modeling, the scientific question of interest sometimes revolves around the ensemble histogram of the estimates for the individual units or possibly a ranking of the units rather than inference on the individual units. For example, in public health, it may be of more interest to identify the 10 states with the highest infant mortality rates than to estimate the infant mortality rates in each state. The discussion of ensemble estimates is based on using loss functions to achieve the desired goal. It is an extension of the chapter on empirical Bayes methods and picks up the threads of decision theory from the first chapter. The section makes use of the constrained empirical Bayes approach presented in the earlier chapter. The ideas related to ensemble estimates are not presented in Gelman et al.

The three case studies in Chapter 8 of Carlin and Louis are virtuoso, state-of-the-art applications of advanced Bayesian methods. The first is an analysis of longitudinal AIDS data. The second is a robust analysis of a clinical trial that illustrates the role of the prior elicitation process. The third illustrates spatio-temporal mapping of lung cancer rates. Since the data are not included, the examples cannot be replicated as pedagogical exercises. The appendix in Carlin and Louis on currently available Bayesian software could be, as they admit, outdated very rapidly. Hopefully the current discussions about integrating more Bayesian methods into Splus will come to fruition.

Very special in the Gelman et al book is the presentation of the potential outcomes approach to causal inference, the role of study design in Bayesian inference, and the missing data chapters. These ideas are crucial to the interpretation of analyses that we do every day. Statisticians can easily get caught up in estimation and computation without stopping to think about the meaning of what they are estimating. It is admirable that Gelman et al chose to include the material in the book. In addition to presenting an overview of the generality of the observed- and missing-data paradigm, Chapter 7 challenges the standard Bayesian notion that the method of data collection is irrelevant to Bayesian analysis and discusses the role of randomization in Bayesian analysis. The chapter also summarizes the distinction between finite population and superpopulation inference. Chapter 17 has a synopsis of concepts of missing and observed at random and the nuts and bolts of multiple imputation that is good as a concise reference for missing data problems.

Since the books are written by active researchers with some overlapping interests, not surprisingly there are subtle differences of attribution of who did what in the two books. These will be obvious to people reading the books who are in the know, and they may produce an occasional smile. Luckily, the differences are not obvious enough to disturb readers unfamiliar with the personalities involved. In general, Carlin and Louis do a better job of presenting a balanced view where there are controversies, such as in model checking and assessing convergence. Despite the differences in the material, both books have extensive bibliographies. Thus even when one book or the other does not cover a topic, the appropriate references are generally included.

The books differ in their characters as textbooks and references as well as in the background level they assume in the reader. Both books aspire in their introductions to be textbooks as well as reference volumes. Historically, Gelman et al evolved as a textbook and was used as teaching material at several universities. It is, however, also a rich reference for Bayesian analysis. Carlin and Louis more nearly resembles two monographs put into one book. One part is on empirical Bayes methods and guiding principles of Bayesian procedures and frequentist evaluation; the other is on Bayesian methods and computation, with a grand finale of case studies at the end. Carlin and Louis requires more background of the reader than does Gelman et al. Carlin and Louis admit in the preface that many of the details need to be filled in by going to the library. This may stand in the way of Carlin and Louis being a user-friendly introductory text, but the book does present the state of the art in Bayesian and empirical Bayes methods. It is an excellent reference book and can be used as a more advanced textbook or together with Gelman et al.

In teaching a half-semester course, I used the first six chapters and half of Chapter 7 of Gelman et al. I then turned to Chapter 5 of Carlin and Louis for computational methods. The more leisurely pace of the opening chapters in Gelman et al is easier on the teacher as well as the students than is Carlin and Louis. It is a useful exercise to reproduce the example analyses presented in the text in both books. The examples presented in Gelman et al were done in Splus, which makes it convenient as a teaching tool. The data sets used in Gelman et al are also now available from Gelman's web site at Columbia. I preceded the leap to the advanced level of the MCMC methods in Chapter 5 of Carlin and Louis with Casella and George's (1992) "Gibbs for Kids" and some excellent introductory material on the Metropolis algorithm from the Kass and Wasserman (1995) short course manuscript. For teaching data augmentation and the EM algorithm, including Louis' calculation of the missing information, I use Tanner (1993). Bernardo and Smith (1994) has the most complete appendix on conjugate distributions. In the future, I would look for an alternative to Chapter 6 in Gelman et al for discussing model checking. In a course that covered decision theory, empirical Bayes methods, or evaluation of procedures, Chapters 1, 3, and 4 of Carlin and Louis would be quite valuable.

The books by Gelman et al and Carlin and Louis are important contributions to practical Bayesian and empirical Bayes data analysis. Because of the differences in level and choice of material, the two books complement each other quite well both as references and as textbooks. They will likely wind up together on many statisticians' shelves.








Brad Carlin
Wed Jan 29 12:02:41 CST 1997