The CCDOR Methodology Group will present a series of 5 sessions led by Jim Hodges, "Everything is a mixed linear model". Topics, times, and locations (all rooms at the Minneapolis VA Medical Center) are given below; all presentations are on Thursdays at noon or 3:30. Each Topic has a reading. Presentations will be fairly informal but mostly aimed at people with at least a linear models course in grad school. However, Topic #3 should be of interest to a broader audience and will be less technical.

Each topic's readings are listed below under the respective topics. Two of the readings are from the excellent book "Semiparametric Regression" (the book's website contains lots of useful things), by David Ruppert, Matt P. Wand, and Ray J. Carroll (2003, Cambridge U Press), which I recommend strongly if you want to learn more about penalized splines. (I got my soft-cover copy for about $35 on Amazon.com -- cheap!)

Topic #1: Penalized splines as mixed linear models (MLMs)

10 July, noon, 3B-137

17 July, 3:30, 3E-136

Penalized splines are a way to fit smooth curves to data. They were developed in their own theoretical universe but can be expressed as MLMs and thus combined with other effects (fixed and random) and estimated using software like PROC MIXED. These two sessions develop the idea of penalized splines and fit them into the MLM framework.
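To make the connection concrete, here is a small numpy sketch of my own (not from the reading): a penalized spline with a truncated-line basis, fit by penalized least squares. The penalty shrinks only the knot coefficients, which is exactly the ridge shrinkage a mixed model imposes when those coefficients are treated as random effects. The number of knots and the smoothing parameter `lam` are arbitrary choices for illustration, not anything prescribed in the sessions.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
x = np.sort(rng.uniform(0, 1, n))
y = np.sin(2 * np.pi * x) + rng.normal(0, 0.3, n)

# Truncated-line basis: fixed part X = [1, x], "random" part Z_k = (x - knot_k)_+
knots = np.linspace(0.05, 0.95, 20)
X = np.column_stack([np.ones(n), x])
Z = np.maximum(x[:, None] - knots[None, :], 0.0)

# Penalized least squares: the penalty lam acts only on the knot coefficients,
# i.e., ridge shrinkage of u -- the same estimate a mixed model gives with
# lam = (error variance) / (random-effect variance).
lam = 1e-3
C = np.hstack([X, Z])
D = np.diag(np.r_[np.zeros(2), np.full(len(knots), lam)])
coef = np.linalg.solve(C.T @ C + D, C.T @ y)
yhat = C @ coef
```

In PROC MIXED terms, the columns of Z go in the RANDOM statement with a common variance, and estimating that variance picks the smoothing parameter for you.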

Reading: Ruppert, Wand, and Carroll, Chapter 3, Section 4.9, and Chapter 6, Sections 1-4. On a first reading, you can skip Sections 3.4, 3.7, 3.8, 3.11, 3.15-3.18, and 6.3.

Transparencies: here

Topic #2: Spatial smoothing using mixed linear models

14 August, noon, 3E-136

This class of analyses also grew up in its own universe but can be expressed as MLMs. First, I'll discuss conditional autoregressive models for areal data (where the dependent variable is the total or average over an area, e.g., county or VISN), and then present 2-dimensional penalized splines for point-referenced data or, in some cases, areal data.
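As a preview, here is a toy numpy sketch of my own (not from the readings) of the intrinsic CAR precision matrix and the areal smoother it implies in mixed-linear-model form. The 1-D chain of areas, the unit CAR precision, and the smoothing weight `lam` are all illustrative assumptions; a real map supplies a general adjacency matrix.

```python
import numpy as np

# 10 "areas" in a chain: each area's neighbors are the adjacent indices
# (an assumption for illustration; real maps give a general adjacency W).
m = 10
W = np.zeros((m, m))
for i in range(m - 1):
    W[i, i + 1] = W[i + 1, i] = 1.0
Q = np.diag(W.sum(1)) - W      # intrinsic CAR precision, tau*(D - W) with tau = 1

# Smoothed areal effects: solve (I + lam*Q) s = y, the BLUP-style estimate
# when the CAR effect is treated as a random effect in an MLM.
rng = np.random.default_rng(1)
y = np.linspace(0, 1, m) + rng.normal(0, 0.2, m)
lam = 2.0
s = np.linalg.solve(np.eye(m) + lam * Q, y)
```

The quadratic form s'Qs is just the sum of squared neighbor differences, which is how the CAR prior penalizes roughness; the smoother necessarily makes that sum no larger.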

Readings: The first reading is some transparencies by me about fitting the conditional autoregressive (CAR) model into the MLM framework; the second reading is Ruppert et al.'s Chapter 13, Sections 13.1 to 13.4. On a first reading, you can skip Section 13.3.

Transparencies, in three parts: CAR models, 2-D splines, and example.

Topic #3: Random effects can confound the fixed effects you care about

28 August, 3:30, 3E-136

Adding spatially correlated errors or a clustering effect to a model doesn't just inflate standard errors; in effect, it also adds new implicit predictors that may be collinear with the predictor you care about. This will be obvious as soon as I write down the models, but the spatial people I know find it bizarre and unsettling, and nobody seems to know that simple clustering can have the same effect.
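Here is a quick numerical sketch of the clustering version of this point (my own toy example, not from the paper below): when the predictor varies almost entirely between clusters, cluster random intercepts are nearly collinear with it, and the mixed-model (GLS) standard error for its coefficient is inflated well beyond the OLS standard error. The cluster sizes and variance components are assumptions chosen to make the collinearity stark.

```python
import numpy as np

rng = np.random.default_rng(2)
# 20 clusters of 5; x varies almost entirely *between* clusters, so cluster
# random intercepts compete with x for the same between-cluster variation.
g = np.repeat(np.arange(20), 5)
x = np.repeat(rng.normal(0, 1, 20), 5) + rng.normal(0, 0.05, 100)
y = 1.0 + 2.0 * x + rng.normal(0, 1.0, 100)

X = np.column_stack([np.ones(100), x])
Z = (g[:, None] == np.arange(20)[None, :]).astype(float)

def slope_se(tau2):
    # GLS standard error of the slope under V = I + tau2 * Z Z'
    # (variance components taken as known, for simplicity).
    V = np.eye(100) + tau2 * Z @ Z.T
    cov = np.linalg.inv(X.T @ np.linalg.solve(V, X))
    return float(np.sqrt(cov[1, 1]))

se_ols = slope_se(0.0)   # no cluster effect
se_mix = slope_se(1.0)   # cluster random intercepts added
```

With the cluster effects in the model, `se_mix` is much larger than `se_ols`: the implicit cluster-indicator predictors have absorbed most of the information about x's coefficient.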

Reading: The reading is a paper by Reich, Hodges, and Zadnik (Biometrics 2006) about how adding CAR-distributed errors to an analysis confounds a fixed effect of interest. You can skip section 4 on a first reading.

Transparencies: here.

Topic #4: Random effects are not necessarily random.

11 September, noon, 3E-136

A huge range of models can be expressed and analyzed as MLMs. However, the models in Topics 1-3 would not have been recognized as random-effect models by, say, Scheffé. This conceptual quibble has real practical consequences, which this session will explore.

Reading: The reading is a little polemic Murray Clayton and I wrote. Like all polemics, it is too strong, but I hope it has some entertainment value and gets you thinking about our headlong rush to compute things we don't understand.

Transparencies: here.

Further materials

You might also be interested in the materials from a course I taught recently at the Division of Biostat, from which this series is mostly drawn. Those materials are just above this VA series on this web page, below the addresses, under the heading "Materials for PubH8400/02 Spring 2008 'Richly Parameterized Models'". That heading has links to a detailed syllabus for the course, all the transparencies I used as overheads during the lectures, five datasets used as examples, and so on.