[SBG12] A generalisation of sparse PCA to multiple correspondence analysis
International conferences without proceedings: ERCIM 2012, Oviedo, Spain
Abstract:
Principal components analysis (PCA) for numerical variables and multiple correspondence analysis (MCA) for categorical variables are well-known dimension reduction techniques. PCA and MCA provide a small number of informative dimensions: the components. However, these components are combinations of all the original variables, which makes them difficult to interpret. Factor rotation (varimax, quartimax, etc.) has a long history in factor analysis as a way of obtaining simple structure, i.e. combinations with a large number of coefficients close to zero, 1 or -1. Only recently have rotations been used in multiple correspondence analysis. Sparse PCA and group sparse PCA are new techniques that provide components which are combinations of only a few original variables: PCA is rewritten as a regression problem, and null loadings are obtained by imposing the lasso (or a similar) constraint on the regression coefficients. When the data matrix has a natural block structure, group sparse PCA gives zero coefficients to entire blocks of variables. Since MCA is a special kind of PCA with blocks of indicator variables, we define sparse MCA as an extension of group sparse PCA. We present an application of sparse MCA to genetic data (640 SNPs with 3 categories, measured on 502 women) and a comparison between sparse and rotated components.
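To make the block-wise zeroing concrete, here is a minimal sketch of a single group-sparse component computed on blocks of indicator variables. It is an illustration only, not the authors' algorithm: it assumes an alternating rank-one scheme with group soft-thresholding, it centers the indicator columns but omits the category-frequency weighting that a proper MCA applies, and the function names (`indicator_blocks`, `group_soft_threshold`, `group_sparse_component`), the penalty `lam` and the simulated data are all hypothetical.

```python
import numpy as np

def indicator_blocks(data, n_categories):
    """One-hot encode an (n, p) integer array; return the matrix and one column slice per variable."""
    n, p = data.shape
    Z = np.zeros((n, p * n_categories))
    for j in range(p):
        Z[np.arange(n), j * n_categories + data[:, j]] = 1.0
    blocks = [slice(j * n_categories, (j + 1) * n_categories) for j in range(p)]
    return Z, blocks

def group_soft_threshold(v, blocks, lam):
    """Shrink each block of v toward zero; blocks whose norm is at most lam become exactly zero."""
    out = np.zeros_like(v)
    for b in blocks:
        norm = np.linalg.norm(v[b])
        if norm > lam:
            out[b] = (1.0 - lam / norm) * v[b]
    return out

def group_sparse_component(X, blocks, lam, n_iter=100):
    """Alternate between a unit-norm score vector u and group-thresholded loadings v (assumed scheme)."""
    u = np.linalg.svd(X, full_matrices=False)[0][:, 0]  # start from the ordinary first component
    v = np.zeros(X.shape[1])
    for _ in range(n_iter):
        v = group_soft_threshold(X.T @ u, blocks, lam)
        scores = X @ v
        norm = np.linalg.norm(scores)
        if norm == 0.0:  # every block was zeroed (or the scores vanished): stop
            break
        u = scores / norm
    return v

# Simulated data (hypothetical): 6 categorical variables with 3 categories each,
# where variables 0 and 1 are copies of the same latent category.
rng = np.random.default_rng(0)
data = rng.integers(0, 3, size=(200, 6))
data[:, 1] = data[:, 0]
Z, blocks = indicator_blocks(data, 3)
X = Z - Z.mean(axis=0)  # column centering only; MCA weighting is omitted in this sketch
loadings = group_sparse_component(X, blocks, lam=8.0)
print([j for j, b in enumerate(blocks) if loadings[b].any()])  # indices of variables kept
```

Because the threshold acts on the norm of each block, all indicator columns of a categorical variable are kept or dropped together, which is the behaviour the abstract attributes to group sparse PCA.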
Team:
msdma