[GS16a] Variable selection for longitudinal biomarkers constrained by a detection limit
International peer-reviewed conference:
Compstat 2016,
August 2016,
pp.23-24,
Oviedo,
Spain,
Keywords: detection limit, missing values, GEE
Abstract:
Repeated measures over time are common in the biomedical field and are widely used to analyze the link between covariates and a clinical criterion.
In a longitudinal context, a large number of variables combined with missing data raises complex issues. We deal
with several types of covariates: some are missing haphazardly, and others are subject to detection thresholds. For the latter, Tobit
regression combined with the bootstrap is an unbiased approach, but it requires complete predictors in the mean model. We propose an adaptation of the well-known
multivariate imputation by chained equations. The Tobit model serves as the imputation method for covariates below the detection limit, with predictive mean matching and logistic regression used for the others. Variable selection is performed with MI-PGEE, which has two
ingredients: a) a group LASSO penalty is imposed on the group of estimated regression coefficients of the same variable across the multiply imputed
datasets, leading to a consistent selection, with the optimal shrinkage parameter chosen by minimizing a BIC-like criterion; b) GEE accounts for
the correlations induced by the longitudinal design. The usefulness of the new method is illustrated by an application to the FNIH project of
the Osteoarthritis Initiative.
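
The grouping idea in ingredient a) can be illustrated with a minimal sketch. It is not the authors' MI-PGEE procedure: it only shows the group soft-thresholding step that a group LASSO penalty induces in the simplified orthonormal case, applied to the coefficients of one variable collected across M imputed datasets. The function name, the variable names, the coefficient values, and the value of `lam` are all hypothetical.

```python
import math

def group_soft_threshold(coefs, lam):
    """Shrink the M estimates of one variable's coefficient as a single group.

    coefs: list of M estimates, one per imputed dataset (hypothetical input).
    lam:   shrinkage parameter (hypothetical name and value).
    """
    norm = math.sqrt(sum(b * b for b in coefs))
    if norm <= lam:
        # The whole group is set to zero: the variable is dropped
        # from every imputed dataset at once.
        return [0.0] * len(coefs)
    scale = 1.0 - lam / norm
    return [scale * b for b in coefs]

# Made-up estimates for two variables over M = 3 imputed datasets.
estimates = {
    "strong_marker": [0.9, 1.1, 1.0],
    "weak_marker": [0.05, -0.02, 0.03],
}
shrunk = {name: group_soft_threshold(b, lam=0.5) for name, b in estimates.items()}

# A variable is selected if its group of coefficients was not zeroed out.
selected = [name for name, b in shrunk.items() if any(v != 0.0 for v in b)]
print(selected)
```

Because the threshold acts on the Euclidean norm of the whole group, a variable is either kept in all imputed datasets or removed from all of them, which is what makes the selection consistent across imputations.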
Comments:
22nd International Conference on Computational Statistics
23-26 August