SEMINAR

An Approach to Diagnostics for Multiple Error-term Linear Models

Jim Hodges
Division of Biostatistics/School of Dentistry
University of Minnesota
*Candidate for the Assistant/Associate Professor Position

Tuesday, February 28th
10:00am
Moos 2-620
Minneapolis Campus

Abstract:
In any statistical analysis, an inferential summary is a function mapping the data to some value of the summary. In linear regression, for example, coefficient estimates are a function of the outcome vector y and design matrix X. Regression diagnostics allow essentially complete understanding of such functions for models with a mean structure linear in its unknowns, with each outcome measure contaminated by a single independent, normally distributed error. These diagnostics, developed mostly in the 1970s, allow users to stop taking inferential summaries on faith and instead to fit linear models with reasonable confidence that the summaries are not distorted by a few anomalous cases or by inappropriate mathematical assumptions. The power of these methods has two deep sources: linear-model fits treated as orthogonal projections, and algebraic results permitting, for example, rapid computing for case deletions.
Linear models with more than one error term have been used for well over 50 years, but their use grew with computer speed in the 1980s and exploded in the 1990s with the advent of Markov chain Monte Carlo (MCMC). However, people using these models are in roughly the same unhappy position as regression users before 1970: their fitting methods, the functions that turn data into summaries, are ill-understood, and anyone with a deadline has little choice but to take their computer output on faith. Even worse, pitfalls lie not only in the equations that turn data into summaries, but in the MCMC routines that compute the summaries.
For the past 13 or so years, I have worked to adapt the geometric and algebraic insights of regression diagnostics to provide similar tools for hierarchical models, conditionally autoregressive (CAR) smoothers, and many others that can be expressed as multiple-error-term linear models. This talk surveys work by me, my students, and my collaborators, and indicates unsolved problems, including implementation in easy-to-use software.
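
Illustration (not part of the speaker's abstract): the "rapid computing for case deletions" mentioned above can be sketched for the ordinary single-error-term case. The sketch assumes a one-predictor regression with an intercept, made-up data, and hypothetical function names. It compares the classical shortcut, in which the deleted residual equals the ordinary residual divided by one minus the leverage, against refitting the model with each case removed in turn.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    b = sxy / sxx
    return ybar - b * xbar, b


def deleted_residuals_shortcut(xs, ys):
    """Deleted residuals from a single fit: e_i / (1 - h_ii)."""
    n = len(xs)
    xbar = sum(xs) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    a, b = fit_line(xs, ys)
    out = []
    for x, y in zip(xs, ys):
        leverage = 1.0 / n + (x - xbar) ** 2 / sxx  # diagonal of the projection ("hat") matrix
        residual = y - (a + b * x)
        out.append(residual / (1.0 - leverage))
    return out


def deleted_residuals_refit(xs, ys):
    """Deleted residuals by brute force: refit without case i, then predict case i."""
    out = []
    for i in range(len(xs)):
        a, b = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        out.append(ys[i] - (a + b * xs[i]))
    return out


# Made-up data; the last case has high leverage.
xs = [1.0, 2.0, 3.0, 4.0, 10.0]
ys = [1.1, 1.9, 3.2, 3.8, 6.0]

shortcut = deleted_residuals_shortcut(xs, ys)
refit = deleted_residuals_refit(xs, ys)
for s, r in zip(shortcut, refit):
    print(f"{s: .6f}  {r: .6f}")
```

The two columns agree up to floating-point rounding, so one fit is enough to learn what every single-case deletion would do. The abstract's point is that comparable shortcuts are not yet standard for models with more than one error term.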

A social tea will be held at 9:30am in A434 Mayo. All are welcome.
For more details contact 612-624-4655 or see http://www.biostat.umn.edu/seminar_academic.html