Jim Hodges

Associate Professor, Division of Biostatistics, University of Minnesota

I'm the one on the right. The one on the left is Li Chi-ping, who became my bride on 13 February 2007. (Photo taken March 2006, Taipei.)

My location:

  • Division of Biostatistics
  • School of Public Health
  • University of Minnesota
  • 2221 University Ave SE, Suite 200
  • Minneapolis, Minnesota 55414
  • Phone (612) 626-9626, Fax (612) 626-9054
  • e-mail: hodges@ccbr.umn.edu, hodge003@umn.edu (they go to the same inbox)

    Curriculum Vitae (or whatever "CV" stands for)

    Having trouble sleeping? Take a look at Jim's vita, current as of 27 April 2012.

    Regarding the Diversity Visa Lottery 2012

    A link on this web page is being used as an argument for a particular position regarding the Diversity Visa Lottery 2012. I have received two e-mail messages from people in the Netherlands on this matter. The two messages and my responses to them can be viewed here.

    I know nothing about the Diversity Visa Lottery apart from what was contained in the two messages that were sent to me. However, it is clear that my work has no relevance to this issue. If you disagree with me about this (e.g., the second e-mail message at the link above), my response can be seen at the link above.

    I will not respond to any more messages about this matter. Please do not send any more such messages.

    31 May 2011


    Manual of Operations, OPT Study, version 1, free for you to download and crib! Use it at your own risk.

    Materials for PubH8492 Spring 2012 "Richly Parameterized Models".

  • This is a mixture of materials from previous offerings of this course and the current offering. Updated materials are indicated by "Updated [date]", though some of the 2008 materials did not require updating.

    Official syllabus, updated November 2011 and current for the 2012 offering.

    Suggestions for class projects updated 1/18/10.

    Papers assigned as reading, updated for 2012 offering

  • Cui, Hodges, Kong, Carlin Technometrics 2010, degrees of freedom in generality
  • Cui & Hodges submitted to Statistica Sinica, smoothed ANOVA general case
  • Hodges and Reich, The American Statistician 2010, more on spatial confounding; supplement to the TAS article.
  • Hodges JRSSB (1998), diagnostics for hierarchical models
  • Hodges and Clayton (2010) manuscript "Random Effects Old and New", submitted to J. Royal Stat. Sci. Series A, and used in Part II, Section C.
  • Hodges, Cui, Sargent, Carlin Technometrics 2007, smoothed ANOVA for balanced, single-error-term ANOVAs
  • Peterson et al J. Structural Bio 2001, paper reporting viral-structure data
  • Reich and Hodges JSPI 2008, laying bare the deep structure of hierarchical models, or at least that's what we thought until the drugs wore off
  • Reich et al JASA 2007, modeling periodontal data with CAR models having two classes of neighbor pairs; research report version, rr2004-004
  • Reich, Hodges, Zadnik Biometrics 2006, on spatial confounding (the Slovenia paper); students in previous classes helpfully pointed out these known typos. Feel free to tell me about other typos that you find.
  • Reich & Hodges Biometrics 2008, Spatially-adaptive CAR
  • Zhang, Hodges, Banerjee, Annals of Applied Stat 2009, smoothed ANOVA with spatial smoothing for one factor, as a competitor to MCAR models

    Transparencies used in lectures; labels for files refer to the detailed syllabus.

  • Part I, Section A, 1, 2a (through conventional analyses) here; error on page IA1/13
  • Part I, Section A, 2b, 3, 4 (through the end of IA) here, updated 1/22/10
  • Part I, Section B (constraint-case formulation; measures of complexity) here, updated 1/24/12; correction to page IB3/5 here, posted 2/1/12.
  • Part I, Section B thesis topics, Section C here, updated 1/28/10; here are some recap slides used in the 2/7/12 lecture.
  • Part I, Section D except discrete-by-discrete interactions here, updated 1/25/12.
  • Part I, Section D, discrete-by-discrete interactions (smoothed ANOVA) here, updated 2/1/10
  • Part I, Section E, spatial smoothing 1 (CAR smoothing on a lattice) here, updated 2/3/10
  • Part I, Section E, spatial smoothing 2 (2D penalized splines) here, updated 2/3/10
  • Part I, Section F, time series (dynamic linear models, Kalman filter-style models), Part I, Section G, two alternative syntaxes (Rue & Held; Lee, Nelder, & Pawitan) here, updated 2/1/12.
  • Part II, Section A, Simple extensions of linear-model diagnostics here, updated 2/2/12.
  • Part II, Section B, Collinearity/Confounding and Smoothing/Shrinkage (beginning) here, updated 2/6/12.
  • Part II, Section B, Collinearity & smoothing: Adding a random effect can zap a fixed effect or another random effect here, updated 2/9/12. Spurred by student questions, I've added some more slides about the kids'n'crowns puzzle here, 4/3/12.
  • Part II, Section C, Old- vs. new-style random effects: The difference has practical implications. These are the lecture transparencies, updated 3/31/10. There's also a draft paper dated 3/19/10.
  • Part II, Section D, Teaser for identifying variance parameters and other mysteries, here, updated 2/15/12. Sorry about the lousy quality of the pictures; good copies are all in the research-report version of Reich, Hodges, & Carlin JASA 2007.
  • Part II, Section D, Identifying variances and other mysteries: CAR model on perio data here, updated 2/17/12.
  • Part II, Section D, Identifying variances and other mysteries: penalized spline on the GMST data here, revised 2/20/12.
  • Part II, Section D, Identifying variances and other mysteries: why CAR on the GMST data gives such a lousy fit here, revised 2/23/12.
  • Part II, Section D, Identifying variances: more general models (what little we can say about them) here, revised 2/28/12. Sorry about the lousy picture quality; I'll give out better ones in class.
  • Part II, Section E, two last oddities from real datasets: bimodal posteriors, and a case in which increasing the sample size makes the standard error of the average increase (yes, increase); here, revised 3/5/12.

    Homework assignments

  • #1, updated 1/22/10
  • #2, updated 2/1/12
  • #3, updated 2/9/12
  • #4, updated 2/17/12
  • #5, posted 3/1/12

    Datasets, updated for the 2012 offering

  • Molecular structure of a virus Excel file.
  • Vocal folds Excel file.
  • Global mean surface temperature deviations (in 0.01 degrees C) Text file, Excel file.
  • Physical properties of pig jawbone .csv file
  • Soft material polishability data -- as in Appendix B of Hodges et al 2007 -- columns are not scaled! Excel file
  • HMO premium dataset text file

    Materials from Summer 2008 VA Methodology Group series: "Everything is a Mixed Linear Model".

    The CCDOR Methodology Group will present a series of 5 sessions led by Jim Hodges, "Everything is a mixed linear model". Topics, times, and locations (all rooms at the Minneapolis VA Medical Center) are given below; all presentations are on Thursdays at noon or 3:30. Each Topic has a reading. Presentations will be fairly informal but mostly aimed at people with at least a linear models course in grad school. However, Topic #3 should be of interest to a broader audience and will be less technical.

    The readings are listed below under their respective topics. Two of the readings are from the excellent book "Semiparametric Regression" (the book's website contains lots of useful things), by David Ruppert, Matt P. Wand, and Ray J. Carroll (2003, Cambridge U Press), which I recommend strongly if you want to learn more about penalized splines. (I got my soft-cover copy for about $35 on Amazon.com -- cheap!)

    Topic #1: Penalized splines as mixed linear models (MLMs)

    10 July, noon, 3B-137

    17 July, 3:30, 3E-136

    Penalized splines are a way to fit smooth curves to data. They were developed in their own theoretical universe but can be expressed as MLMs and thus combined with other effects (fixed and random) and estimated using software like PROC MIXED. These two sessions develop the idea of penalized splines and fit them into the MLM framework.
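
    To make the idea concrete, here is a minimal sketch in Python (NumPy only) of a penalized spline written in mixed-model form. It assumes a truncated-line basis, which is one common choice, with the intercept and slope unpenalized (the "fixed effects") and the knot coefficients shrunk toward zero (the "random effects"). The function names, the simulated data, the knot placement, and the smoothing parameter lam are all hypothetical. In the mixed-model view, lam plays the role of the ratio of error variance to random-effect variance, which software like PROC MIXED would estimate; here it is simply fixed by hand.

```python
import numpy as np

def truncated_line_basis(x, knots):
    """Fixed-effect design X = [1, x]; random-effect design Z = [(x - knot)_+]."""
    x = np.asarray(x, dtype=float)
    knots = np.asarray(knots, dtype=float)
    X = np.column_stack([np.ones_like(x), x])
    Z = np.maximum(x[:, None] - knots[None, :], 0.0)
    return X, Z

def fit_penalized_spline(x, y, knots, lam):
    """Minimize ||y - X b - Z u||^2 + lam * ||u||^2 (hypothetical helper).

    Only the knot coefficients u are penalized, which is what makes this
    the same criterion as a mixed linear model with u treated as random.
    """
    X, Z = truncated_line_basis(x, knots)
    C = np.hstack([X, Z])
    # Penalty matrix: zeros for intercept and slope, ones for knot coefficients.
    D = np.diag([0.0] * X.shape[1] + [1.0] * Z.shape[1])
    coef = np.linalg.solve(C.T @ C + lam * D, C.T @ np.asarray(y, dtype=float))
    return coef, C @ coef

# Simulated data (an assumption for illustration, not one of the course datasets).
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 200)
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.3, size=x.size)
knots = np.linspace(0.05, 0.95, 20)

# Moderate smoothing: the fit bends to follow the curve.
coef_smooth, fit_smooth = fit_penalized_spline(x, y, knots, lam=1.0)
# Very heavy smoothing: knot coefficients shrink to ~0, leaving a straight line.
coef_line, fit_line = fit_penalized_spline(x, y, knots, lam=1e8)
```

    As lam grows, the fit approaches the ordinary least-squares line; as it shrinks, the fit follows the data more closely.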

    Reading: Ruppert, Wand, and Carroll Chapter 3, Section 4.9, Chapter 6 sections 1-4. On a first reading, you can skip sections 3.4, 3.7, 3.8, 3.11, 3.15-3.18, and 6.3.

    Transparencies: here

    Topic #2: Spatial smoothing using mixed linear models.

    14 August, noon, 3E-136

    This class of analyses also grew up in its own universe but can be expressed as MLMs. First, I'll discuss conditional autoregressive models for areal data (where the dependent variable is the total or average over an area, e.g., county or VISN), and then present 2-dimensional penalized splines for point-referenced data or, in some cases, areal data.

    Readings: The first reading is some transparencies by me about fitting the conditional autoregressive (CAR) model into the MLM framework; the second reading is Ruppert et al's Chapter 13, sections 13.1 to 13.4. On a first reading, you can skip section 13.3.

    Transparencies, in three parts: CAR models, 2-D splines, and example.

    Topic #3: Random effects can confound the fixed effects you care about

    28 August, 3:30, 3E-136

    Adding spatially correlated errors or a clustering effect to a model doesn't just inflate standard errors, it also in effect adds new implicit predictors that may be collinear with the predictor you care about. This will be obvious as soon as I write down the models, but the spatial people I know find this bizarre and unsettling, and nobody seems to know that simple clustering can have this effect.
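
    One way to see the clustering version of this numerically is sketched below in Python (NumPy only). Everything in it is an assumption made for illustration: the simulated data, the cluster sizes, and the choice to fix the variance ratio by hand rather than estimate it. The cluster indicators in Z are the "implicit predictors"; when the predictor x varies mostly between clusters, those indicators are nearly collinear with it, and the slope estimate moves from the ordinary least-squares value toward the within-cluster value as the random effect is penalized less.

```python
import numpy as np

rng = np.random.default_rng(1)
G, n = 30, 10                                  # hypothetical: 30 clusters of 10
cluster = np.repeat(np.arange(G), n)

# Predictor that varies mostly BETWEEN clusters, only a little within them.
x = rng.normal(size=G)[cluster] + 0.1 * rng.normal(size=G * n)
# Response with a true slope of 1, a cluster effect, and noise.
y = 1.0 * x + rng.normal(size=G)[cluster] + rng.normal(size=G * n)

X = np.column_stack([np.ones(G * n), x])       # fixed effects: intercept, slope
Z = (cluster[:, None] == np.arange(G)[None, :]).astype(float)  # cluster indicators

def slope(lam):
    """Slope from minimizing ||y - X b - Z u||^2 + lam * ||u||^2.

    lam stands in for (error variance) / (random-effect variance).
    """
    C = np.hstack([X, Z])
    D = np.diag([0.0, 0.0] + [1.0] * G)        # penalize only the cluster effects
    coef = np.linalg.solve(C.T @ C + lam * D, C.T @ y)
    return coef[1]

slope_ols = slope(1e10)     # cluster effects shrunk to ~0: ordinary least squares
slope_heavy = slope(1e-6)   # cluster effects almost unpenalized

# Within-cluster estimator: regress cluster-centered y on cluster-centered x.
xw = x - (Z.T @ x / n)[cluster]
yw = y - (Z.T @ y / n)[cluster]
slope_within = (xw @ yw) / (xw @ xw)

# Share of the variation in x that the cluster indicators account for.
r2_x_on_clusters = 1.0 - (xw @ xw) / np.sum((x - x.mean()) ** 2)
```

    Here the cluster indicators account for nearly all of the variation in x, so the lightly penalized fit has very little independent information left for estimating the slope.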

    Reading: The reading is a paper by Reich, Hodges, and Zadnik (Biometrics 2006) about how adding CAR-distributed errors to an analysis confounds a fixed effect of interest. You can skip section 4 on a first reading.

    Transparencies, here.

    Topic #4: Random effects are not necessarily random.

    11 September, noon, 3E-136

    A huge range of models can be expressed and analyzed as MLMs. However, the models in Topics 1-3 would not have been recognized as random-effect models by, say, Scheffe. This conceptual quibble has real practical consequences, which this session will explore.

    Reading: This reading is a little polemic I wrote which, like all polemics, is too strong, but I hope it has some entertainment value and gets you thinking about our headlong rush to compute things we don't understand.

    Transparencies here.

    Further materials

    You might also be interested in the materials from a course I taught recently at the Division of Biostat, from which this series is mostly drawn. The stuff is just up above on this web page, above this VA series and below the addresses, under the heading "Materials for PubH8400/02 Spring 2008 'Richly Parameterized Models'". This has links to a detailed syllabus for the course, all the transparencies I used as overheads during the lectures, five datasets used as examples, and so on.


    Papers you can download

  • A postscript version of Sargent DJ, Hodges JS, Smoothed ANOVA with application to subgroup analysis, submitted to JASA some years ago and rejected with enthusiasm. A completely reworked and much better version has been accepted by Technometrics; the Research Report version is rr2005-018 on the U of MN Biostat web site.

  • A postscript version of Hodges JS, Sargent DJ, "Counting degrees of freedom in hierarchical and other richly parameterized models". The original version has some interesting stuff that's not in the Biometrika version (Hodges JS, Sargent DJ. Counting degrees of freedom in hierarchical and other richly-parameterised models. Biometrika, 88:367-379, 2001).

  • Nick Salkowski's class project, applying the SemiPar package to various things including the Slovenia stomach-cancer data.

    Items associated with "Some algebra and geometry for hierarchical models, applied to diagnostics" (JRSSB 1998)

    Dataset: This dataset is in ASCII format. The file containing the dataset has two matrices -- one for plan-level data and one for state-level data -- and some introductory material.

    S+ functions: Peiming Ma has written S+ functions to execute the analyses in this paper. You can get separate postscript files for: documentation 1 and documentation 2, and ASCII files containing the functions for: Gibbs sampler, trace plots, added-variable plot, collinearity check, case influence, residuals, and transformations. Although we have tested these functions and they work as far as we know, USE THEM AT YOUR OWN RISK! Also, we make no claims to efficiency or exemplary programming style, but they do appear to work.


    Last updated: April 2012.


    Return to U of M Biostat home page.


    Official Disclaimer: The views and opinions expressed in this page are strictly those of the page author. The contents of this page have not been approved by the University of Minnesota.

    Unofficial Disclaimer: This is all my fault. The U is blameless. They're such nice people, how could you even think of blaming them! Shame on you!