Non-parametric and Semiparametric Models for Missing Covariates in Parametric Regression

Hua Yun Chen
Department of Epidemiology and Biostatistics
University of Illinois at Chicago

Wednesday, February 5, 2003
3:30 PM
Moos 2-620
Minneapolis Campus

Abstract:

Robustness of the covariate modeling approach to missing covariate problems in parametric regression is studied under the missing at random (MAR) assumption. For a simple missing covariate pattern, non-parametric likelihood is proposed and shown to yield a consistent and semiparametrically efficient estimator for the regression parameter. Total robustness is achieved in this situation. For more general missing covariate patterns, novel semiparametric models are proposed for modeling missing covariates. In this modeling approach, the covariate distribution is first decomposed into the product of a series of conditional distributions according to the overall missing data patterns, and the conditional distributions are then represented in the general odds ratio form. The general odds ratios are modeled parametrically and the other components of the covariate distribution are modeled non-parametrically. Maximum semiparametric likelihood is proposed to find the parameter estimates. The proposed method yields a consistent estimator for the regression parameter when the odds ratios are modeled correctly. In general, the semiparametric covariate modeling approach increases the robustness against covariate model misspecification when compared with the parametric modeling approach of Lipsitz and Ibrahim. The new covariate modeling approach can also be incorporated into the doubly robust procedure of Robins et al. to increase the level of protection against the misspecification of missing data mechanisms. Furthermore, the proposed modeling strategy avoids the usually intractable integrations that are involved in the maximization of the incomplete data likelihood with parametric covariate models. The proposed method can be applied to many frequently used regression models.
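
As a brief illustration of the "general odds ratio form" mentioned in the abstract, the following sketch shows one standard way a conditional density can be written in terms of an odds ratio function and a baseline density. The notation here ($f$, $\eta$, and the reference points $x_0$, $y_0$) is assumed for illustration only and may differ from the speaker's.

```latex
% Odds ratio function of a conditional density f(y | x),
% relative to fixed reference points (x_0, y_0) (assumed notation):
\[
  \eta(y; y_0, x; x_0)
    = \frac{f(y \mid x)\, f(y_0 \mid x_0)}{f(y \mid x_0)\, f(y_0 \mid x)} .
\]
% The conditional density is recovered from the odds ratio function
% and the baseline density f(y | x_0):
\[
  f(y \mid x)
    = \frac{\eta(y; y_0, x; x_0)\, f(y \mid x_0)}
           {\int \eta(u; y_0, x; x_0)\, f(u \mid x_0)\, du} .
\]
```

In a representation of this kind, $\eta$ would be the component given a parametric model, while the baseline density $f(y \mid x_0)$ would be left unspecified.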