Data mining and statistics

The team works on processing data with statistical and computational methods. The unifying concept is “data mining”, a discipline that has emerged in recent years at the intersection of statistics, artificial intelligence and databases, and that aims to discover relationships and structures in preexisting data.

Learning theory provides the conceptual foundations of data mining, which traditionally distinguishes between supervised and unsupervised learning.

Publications

2022

Journal articles

  1. Bry, X.; Niang, N.; Verron, T. and Bougeard, S. Clusterwise elastic-net regression based on a combined information criterion. In Advances in Data Analysis and Classification, 2022.

2021

Journal articles

  1. Bar-Hen, A.; Gey, S. and Poggi, J-M. Spatial CART Classification Trees. In Computational Statistics, 2021.
  2. Djennane, N.; Yacoub, M.; Aoudjit, R. and Bouzefrane, S. CPU-based prediction with Self Organizing Map in Dynamic Cloud Data Centers. In International Journal of Sensors, Wireless Communications and Control, 11 (7): 733-747, 2021.
  3. Huang, T.; Saporta, G.; Wang, H. and Wang, S. A robust spatial autoregressive scalar-on-function regression with t-distribution. In Advances in Data Analysis and Classification, 15 (1): 57-81, 2021.
  4. Boukela, L.; Zhang, G.; Yacoub, M.; Bouzefrane, S.; Bagheri, S. and Jelodar, H. A modified LOF based approach for outlier characterization in IoT. In Annals of Telecommunications - annales des télécommunications, 76 (3-4): 145-153, 2021.
  5. Moins-Teisserenc, H.; Cordeiro, D. J.; Audigier, V.; Ressaire, Q.; Benyamina, M.; Lambert, J.; Maki, G.; Homyrda, L.; Toubert, A. and Legrand, M. Severe Altered Immune Status After Burn Injury Is Associated With Bacterial Infection and Septic Shock. In Frontiers in Immunology, 12: 586195, 2021.

Conference papers

  1. Diallo, A. W.; Niang, N. and Ouattara, M. Sparse Subspace K-means. In 3rd IEEE ICDM Workshop on Deep Learning and Clustering, in conjunction with IEEE ICDM 2021, December 7-10, 2021, pages 678-685, IEEE, Auckland, New Zealand, 2021.
  2. Audigier, V.; Niang, N. and Resche-Rigon, M. Clustering sur données incomplètes : quel modèle d'imputation choisir ? In EPICLIN 2021 - 15e Conférence francophone d'épidémiologie clinique - 28e Journées des statisticiens des centres de lutte contre le cancer, pages S21-S22, Elsevier Masson, Marseille, France, 2021.

2020

Journal articles

  1. Wang, H.; Liu, R.; Wang, S.; Wang, Z. and Saporta, G. Ultra-high dimensional variable screening via Gram-Schmidt orthogonalization. In Computational Statistics, 35: 1153-1170, 2020.

2019

Journal articles

  1. Huang, T.; Wang, H. and Saporta, G. 成分数据的空间自回归模型 [Spatial autoregressive model for compositional data]. In Journal of Beijing University of Aeronautics and Astronautics, 45 (1): 93-98, 2019.
  2. Graffeo, N.; Latouche, A.; Le Tourneau, C. and Chevret, S. ipcwswitch: An R package for inverse probability of censoring weighting with an application to switches in clinical trials. In Computers in Biology and Medicine, 111: 103339, 2019.
  3. Wang, H.; Gu, J.; Wang, S. and Saporta, G. Spatial partial least squares autoregression: Algorithm and applications. In Chemometrics and Intelligent Laboratory Systems, 184: 123-131, 2019.
  4. Austin, P.; Latouche, A. and Fine, J. A review of the use of time-varying covariates in the Fine-Gray subdistribution hazard competing risk regression model. In Statistics in Medicine, 2019.

Conference papers

  1. Ben-younes, H.; Cadene, R.; Thome, N. and Cord, M. BLOCK: Bilinear Superdiagonal Fusion for Visual Question Answering and Visual Relationship Detection. In AAAI 2019 - 33rd AAAI Conference on Artificial Intelligence, Honolulu, United States, 2019.

2018

Journal articles

  1. Audigier, V.; White, I.; Jolani, S.; Debray, T.; Quartagno, M.; Carpenter, J.; van Buuren, S. and Resche-Rigon, M. Multiple Imputation for Multilevel Data with Continuous and Binary Variables. In Statistical Science, 33 (2): 160-183, 2018.
  2. Bougeard, S.; Abdi, H.; Saporta, G. and Niang, N. Clusterwise analysis for multiblock component methods. In Advances in Data Analysis and Classification, 12 (2): 285-313, 2018.
  3. Wei, Y.; Wang, H.; Wang, S. and Saporta, G. Incremental modelling for compositional data streams. In Communications in Statistics - Simulation and Computation, 48 (8): 2229-2243, 2018.
  4. Chevalier, M.; Thome, N.; Henaff, G. and Cord, M. Classifying low-resolution images by integrating privileged information in deep CNNs. In Pattern Recognition Letters, 116: 29-35, 2018.
  5. Bougeard, S.; Cariou, V.; Saporta, G. and Niang, N. Prediction for regularized clusterwise multiblock regression. In Applied Stochastic Models in Business and Industry, 34 (6): 852-867, 2018.

Conference papers

  1. Durand, P.; Ghorbanzadeh, D. and Jaupi, L. Index Theorem and Applications, a Gentle Review. In CMCGS 2018. 7th Annual International Conference on Computational Mathematics, Computational Geometry, pages 1-6, Digital Library, Singapore, Singapore, Series Proc. Computational Mathematics Computational Geometry and Statistics (CMCGS), 2018.
  2. Durand, P.; Ghorbanzadeh, D. and Jaupi, L. Different approaches for the texture classification of a remote sensing image bank. In Ninth International Conference on Graphic and Image Processing, pages 1-9, SPIE, Qingdao, China, 2018.
  3. Robert, T.; Thome, N. and Cord, M. HybridNet: Classification and Reconstruction Cooperation for Semi-supervised Learning. In ECCV 2018 - 15th European Conference on Computer Vision, pages 158-175, Springer, Munich, Germany, Lecture Notes in Computer Science 11211, 2018.

2017

Journal articles

  1. Liberati, C.; Camillo, F. and Saporta, G. Advances in credit scoring: combining performance and interpretation in kernel discriminant analysis. In Advances in Data Analysis and Classification, 11 (1): 121-138, 2017.
  2. Geronimi, J. and Saporta, G. Variable selection for multiply-imputed data with penalized generalized estimating equations. In Computational Statistics and Data Analysis, 110: 103-114, 2017.

Conference papers

  1. Ben-younes, H.; Cadene, R.; Cord, M. and Thome, N. MUTAN: Multimodal Tucker Fusion for Visual Question Answering. In 2017 IEEE International Conference on Computer Vision (ICCV), pages 2631-2639, IEEE, Venice, Italy, 2017.

Communications

2021

Conference papers

  1. Fateri Gouard, N.; Niang, N. and Ouattara, M. Unbiased Feature selection in Random Forests using Consensus Feature Clustering. In Data Science, Statistics & Visualisation (DSSV) and European Conference on Data Analysis (ECDA), Rotterdam, Netherlands, 2021.
  2. Bougeard, S.; Bry, X.; Verron, T. and Niang, N. Combined-information criterion for clusterwise elastic-net regression. Application to omic data. In 8th Channel Network Conference, Paris, France, 2021.
  3. Niang-Keita, N.; Ouattara, M. and Saporta, G. Sparse Divisive Feature Clustering. In XXVIII Meeting of the Portuguese Association for Classification and Data Analysis (JOCLAD 2021), pages 75-76, Covilhã, Portugal, Program and Book of Abstracts, 2021.
  4. Saporta, G. From the triumph of black boxes to the right to understand and the search for fairness. In ASMDA 2021, Athens, Greece, 2021.
  5. Saporta, G. Interprétabilité des modèles prédictifs. In ASI 11. 11ème Colloque International sur l'Analyse Statistique Implicative, Belfort, France, 2021.
  6. Saporta, G. Sparse Correspondence Analysis for Contingency Tables. In Celebrating 40 years of Greek Statistical Institute 1981-2021, Athens, Greece, 2021.

2020

Conference papers

  1. Saporta, G. About Interpreting and Explaining Machine Learning and Statistical Models. In SMTDA 2020; 6th Stochastic Modeling Techniques and Data Analysis International Conference, Barcelona (virtual), Spain, 2020.

2018

Conference papers

  1. Milliet de Faverges, M.; Russolillo, G.; Picouleau, C.; Merabet, B. and Houzel, B. Modelling passenger train arrival delays with Generalized Linear Models and its perspective for scheduling at main stations. In 8th International Conference on Railway Engineering (ICRE 2018), IET, London, United Kingdom, 2018.

2017

Conference papers

  1. Ghorbanzadeh, D.; Durand, P. and Jaupi, L. Generating the Skew Normal random variable. In World Congress on Engineering 2017, pages 113-116, London, United Kingdom, 2017.
  2. Saporta, G. Expliquer ou prédire? Les nouveaux défis. In Chimiometrie 2017, Paris, France, 2017.

Software and patents

Ongoing projects

CIFRE VELVET
  • Full name: CIFRE VELVET
  • Funder: VELVET CONSULTING
  • Duration: December 2019 - November 2022
  • Description:
EARLY METRICS 2
  • Full name: SOCIETE EARLY METRICS 2
  • Funder: Société EARLY METRICS
  • Duration: May 2021 - February 2023
  • Description:
CIFRE UTAC 2021-2024
  • Full name: CIFRE UTAC 2021-2024
  • Funder: ANRT
  • Duration: July 2021 - July 2024
  • Description: The aim is to investigate statistical analysis methods and machine learning and artificial intelligence algorithms for monitoring vehicle technical inspections.
DJ-PAEEJ 2022
  • Full name: Conception et Développement des Jeux Pervasifs Adaptables avec la prise en compte des Etats Emotionnels des Joueurs (DJ-PAEEJ 2022)
  • Funder: Laboratoire Cédric
  • Duration: January 2022 - December 2022
  • Description: The project aims to take users' emotional states into account in real time in order to better adapt their environments, their interactions, and so on. In this project, the approach is applied specifically to pervasive settings.
Praline 2022
  • Full name: PRivAcy-preserving LocalIzation with MachiNE Learning in IoT (Praline 2022)
  • Funder: Laboratoire Cédric
  • Duration: January 2022 - December 2022
  • Description: The project aims to provide a localization solution for connected objects while keeping that information secure, using machine learning algorithms.
MSDMA 2022
  • Full name: Soutien équipe MSDMA 2022
  • Funder: Laboratoire Cédric
  • Duration: January 2022 - December 2022
  • Description:

Past projects

    • Full name: Méthodes statistiques, data-mining et apprentissage 2021
    • Duration: December 2020 - December 2021
    • Description:

    • Full name: NEZ ELECTRONIQUE
    • Duration: June 2017 - May 2018
    • Description:

    • Full name: CRM SERVICE 2017-2018
    • Duration: June 2017 - June 2018
    • Description:

    • Full name: PRESIDIO
    • Duration: January 2015 - July 2019
    • Description:

    • Full name: Contrat de prestation de recherche
    • Duration: June 2020 - June 2021
    • Description:

    • Full name: EARLY METRICS
    • Duration: May 2017 - September 2019
    • Description:

    • Full name: IMPACT MDS
    • Duration: November 2019 - October 2021
    • Description:

    • Full name: MEDIATECH Nafise GOUARD
    • Duration: May 2018 - April 2021
    • Description:
