[BNP17] Clusterwise Sparse PLS
International conferences without proceedings:
PLS'17,
Macao,
China,
Keywords: PLS regression, sparse models, clusterwise methods
Abstract:
PLS regression is a successful method when predictors are correlated, because it provides stable regression coefficients. It has often been claimed that one of its advantages (shared with PCR and ridge regression) is that the model uses all predictors, even when the number of observations n is smaller than the number of predictors p (p > n). However, for high-dimensional data where p is much larger than n, this advantage becomes a drawback, because linear combinations of thousands of variables are hard to interpret, as happens in genomics. Sparse PLS (sPLS) and group sparse PLS have been proposed to overcome this problem by imposing Lasso-type constraints on the parameters. On the other hand, when the number of observations is large, unobserved heterogeneity is frequent: there is no single model, but several local models, one for each cluster of a latent variable. Clusterwise methods simultaneously optimize the partition and the local models; they have already been extended to PLS regression.
The originality of this paper is to present a combination of clusterwise PLS and sPLS that is well suited to big data: large n, large p.
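To make the "Lasso-type constraints" mentioned in the abstract concrete, here is a minimal sketch of one common way sparsity is introduced in PLS: soft-thresholding the first weight vector, which is proportional to X'y for centred data. The abstract does not describe the paper's actual algorithm, so this is only an illustration; the function names `soft_threshold` and `sparse_pls_weights` and the penalty parameter `lam` are hypothetical.

```python
import math


def soft_threshold(value, lam):
    """Shrink value toward zero by lam; anything within [-lam, lam] becomes 0."""
    return math.copysign(max(abs(value) - lam, 0.0), value)


def sparse_pls_weights(X, y, lam):
    """Illustrative first-component weights: soft-thresholded X'y, unit norm.

    X is a list of n rows of p centred predictors; y is a list of n centred
    responses. lam is a hypothetical penalty level (an assumption, not a
    value taken from the paper).
    """
    p = len(X[0])
    # Cross-product of each predictor column with the response (X'y).
    scores = [sum(row[j] * yi for row, yi in zip(X, y)) for j in range(p)]
    # Lasso-type step: weakly related predictors get an exactly zero weight.
    shrunk = [soft_threshold(s, lam) for s in scores]
    norm = math.sqrt(sum(w * w for w in shrunk))
    return [w / norm for w in shrunk] if norm > 0 else shrunk


# Toy centred data: only the first predictor is strongly related to y.
X = [[-1.0, -0.2, 1.0],
     [0.0, 0.1, -2.0],
     [1.0, 0.1, 1.0]]
y = [-1.0, 0.0, 1.0]
weights = sparse_pls_weights(X, y, lam=0.5)
```

In a clusterwise setting, a step of this kind would be applied separately within each cluster while observations are reassigned to the cluster whose local model fits them best; how the paper couples the two steps is not stated in the abstract.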
Comments:
9th International Conference on PLS and Related Methods, 17-19 June 2017
Collaboration:
ANSES