[Sap13a] From Sparse Regression to Sparse Multiple Correspondence Analysis
Invited talk: European Conference on Data Analysis, July 2013, p. 25, Luxembourg, Luxembourg
Keywords: LASSO, SPARSE REGRESSION, SPARSE MCA
Abstract:
High-dimensional data means that the number of variables p is far larger
than the number of observations n.
This talk starts from a survey of various solutions in linear regression.
When p > n the OLS estimator does not exist. Since this is a case of forced
multicollinearity, one may use regularized techniques such as ridge regression,
principal component regression, or PLS regression, which keep all the predictors.
However, if p >> n, combinations of all the variables cannot be interpreted. Sparse
solutions, i.e. with a large number of zero coefficients, are preferred. Lasso, elastic
net, and sparse PLS perform regularization and variable selection simultaneously
thanks to non-quadratic penalties: L1, SCAD, etc.
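As an illustration of the sparsity induced by an L1 penalty, the following minimal sketch (not from the talk; a standard coordinate-descent lasso written with numpy, with data sizes and the penalty level chosen for illustration) shows that in a p > n setting the soft-thresholding update sets most coefficients exactly to zero:

```python
import numpy as np

def soft_threshold(z, lam):
    """Soft-thresholding operator: the proximal map of the L1 penalty."""
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

def lasso_cd(X, y, lam, n_sweeps=200):
    """Lasso via cyclic coordinate descent on centred data."""
    n, p = X.shape
    beta = np.zeros(p)
    for _ in range(n_sweeps):
        for j in range(p):
            # partial residual excluding variable j
            r = y - X @ beta + X[:, j] * beta[j]
            z = X[:, j] @ r / n
            beta[j] = soft_threshold(z, lam) / (X[:, j] @ X[:, j] / n)
    return beta

rng = np.random.default_rng(0)
n, p = 50, 100                      # p > n: OLS does not exist
X = rng.standard_normal((n, p))
beta_true = np.zeros(p)
beta_true[:3] = [2.0, -1.5, 1.0]    # only 3 truly active variables
y = X @ beta_true + 0.1 * rng.standard_normal(n)

beta = lasso_cd(X, y, lam=0.2)
# most entries of beta are exactly zero; the active set is small
```

The exact zeros come from the soft-thresholding step, which is what distinguishes the L1 penalty from the quadratic (ridge) penalty that only shrinks coefficients toward zero.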
In PCA, the singular value decomposition shows that if we regress the principal
components onto the input variables, the vector of regression coefficients is equal
to the vector of factor loadings. It suffices to adapt sparse regression techniques
to obtain sparse versions of PCA and of PCA with groups of variables. We conclude
with a presentation of a sparse version of Multiple Correspondence Analysis and
give several applications.
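The regression-loadings identity that motivates sparse PCA can be checked numerically. This small numpy sketch (illustrative, not taken from the talk) computes the first principal component from the SVD of a centred matrix and then recovers the first loading vector by ordinary least squares:

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 100, 5
X = rng.standard_normal((n, p))
X -= X.mean(axis=0)              # column-centre the data

# SVD: X = U diag(s) Vt; rows of Vt are the loading vectors
U, s, Vt = np.linalg.svd(X, full_matrices=False)
pc1 = X @ Vt[0]                  # scores on the first principal component

# Regressing pc1 onto X gives back the first loading vector exactly,
# because pc1 lies in the column space of X and X has full column rank.
beta, *_ = np.linalg.lstsq(X, pc1, rcond=None)
```

Replacing this OLS step with a penalized (lasso-type) regression is what yields sparse loadings, and hence sparse versions of PCA.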
Comments:
July 10-12, 2013
Organized by the German Classification Society (GfKl) and the French speaking Classification Society (SFC)
ISBN 978-2-87971-105-8
Team:
msdma
BibTeX