
[Sap13a] From Sparse Regression to Sparse Multiple Correspondence Analysis

Invited talks: European Conference on Data Analysis, July 2013, p. 25, Luxembourg, Luxembourg

Author: G. Saporta

Keywords: LASSO, SPARSE REGRESSION, SPARSE MCA

Abstract: High-dimensional data means that the number of variables p is far larger than the number of observations n. This talk starts from a survey of various solutions in linear regression. When p > n the OLS estimator does not exist. Since this is a case of forced multicollinearity, one may use regularized techniques such as ridge regression, principal component regression, or PLS regression, which keep all the predictors. However, if p >> n, combinations of all variables cannot be interpreted. Sparse solutions, i.e., with a large number of zero coefficients, are preferred. Lasso, elastic net, and sparse PLS perform regularization and variable selection simultaneously thanks to non-quadratic penalties: L1, SCAD, etc. In PCA, the singular value decomposition shows that if we regress principal components onto the input variables, the vector of regression coefficients is equal to the factor loadings. It suffices to adapt sparse regression techniques to get sparse versions of PCA and of PCA with groups of variables. We conclude with a presentation of a sparse version of Multiple Correspondence Analysis and give several applications.
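The loading-recovery identity stated in the abstract can be checked numerically: for centered full-rank X with SVD X = USVᵀ, regressing a principal component's scores onto X returns exactly the corresponding loading vector, and swapping OLS for the lasso then yields sparse loadings. The sketch below is a hypothetical illustration using scikit-learn, not the author's code; the penalty value `alpha=0.1` is an arbitrary assumption.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression, Lasso

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 5))
X -= X.mean(axis=0)                      # center, as PCA assumes

pca = PCA(n_components=1).fit(X)
t = pca.transform(X)[:, 0]               # scores of the first principal component
v = pca.components_[0]                   # its factor loadings

# OLS regression of the scores on X recovers the loadings exactly
ols = LinearRegression(fit_intercept=False).fit(X, t)
assert np.allclose(ols.coef_, v, atol=1e-8)

# Replacing OLS with the lasso gives a sparse version of the loadings;
# the L1 penalty shrinks small coefficients toward (and often to) zero.
lasso = Lasso(alpha=0.1, fit_intercept=False).fit(X, t)
print(lasso.coef_)
```

The same substitution idea, applied with group penalties and to the indicator matrix of categorical variables, is what leads to sparse group PCA and sparse MCA.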

Comments: 10-12 July 2013. Organized by the German Classification Society (GfKl) and the French-speaking Classification Society (SFC). ISBN 978-2-87971-105-8

Team: msdma

BibTeX

@inproceedings{Sap13a,
  title     = "{From Sparse Regression to Sparse Multiple Correspondence Analysis}",
  author    = "G. Saporta",
  booktitle = "{European Conference on Data Analysis}",
  year      = 2013,
  month     = "July",
  pages     = "25",
  address   = "Luxembourg, Luxembourg",
  note      = "{10-12 July 2013. Organized by the German Classification Society (GfKl) and the French-speaking Classification Society (SFC). ISBN 978-2-87971-105-8}",
}