Data mining and statistics

The team works on processing data with statistical and computational methods. The unifying concept is “data mining”, a discipline that has emerged in recent years at the intersection of statistics, artificial intelligence, and databases, and that aims to discover relationships and structures in preexisting data.

Learning theory gives data mining its conceptual foundations; the traditional distinction is between supervised and unsupervised learning.

Publications

2021

Journal articles

  1. Huang, T.; Saporta, G.; Wang, H. and Wang, S. A robust spatial autoregressive scalar-on-function regression with t-distribution. In Advances in Data Analysis and Classification, 15 (1): 57-81, 2021.

2020

Journal articles

  1. Wang, H.; Liu, R.; Wang, S.; Wang, Z. and Saporta, G. Ultra-high dimensional variable screening via Gram-Schmidt orthogonalization. In Computational Statistics, 35: 1153-1170, 2020.

2019

Journal articles

  1. Wang, H.; Gu, J.; Wang, S. and Saporta, G. Spatial partial least squares autoregression: Algorithm and applications. In Chemometrics and Intelligent Laboratory Systems, 184: 123-131, 2019.
  2. Graffeo, N.; Latouche, A.; Le Tourneau, C. and Chevret, S. ipcwswitch: An R package for inverse probability of censoring weighting with an application to switches in clinical trials. In Computers in Biology and Medicine, 111: 103339, 2019.
  3. Austin, P.; Latouche, A. and Fine, J. A review of the use of time-varying covariates in the Fine-Gray subdistribution hazard competing risk regression model. In Statistics in Medicine, 2019.
  4. Huang, T.; Wang, H. and Saporta, G. 成分数据的空间自回归模型 [Spatial autoregressive model for compositional data]. In Journal of Beijing University of Aeronautics and Astronautics, 45 (1): 93-98, 2019.

Conference papers

  1. Ben-younes, H.; Cadene, R.; Thome, N. and Cord, M. BLOCK: Bilinear Superdiagonal Fusion for Visual Question Answering and Visual Relationship Detection. In AAAI 2019 - 33rd AAAI Conference on Artificial Intelligence, Honolulu, United States, 2019.

2018

Journal articles

  1. Wei, Y.; Wang, H.; Wang, S. and Saporta, G. Incremental modelling for compositional data streams. In Communications in Statistics - Simulation and Computation, 48 (8): 2229-2243, 2018.
  2. Chevalier, M.; Thome, N.; Henaff, G. and Cord, M. Classifying low-resolution images by integrating privileged information in deep CNNs. In Pattern Recognition Letters, 116: 29-35, 2018.
  3. Bougeard, S.; Abdi, H.; Saporta, G. and Niang, N. Clusterwise analysis for multiblock component methods. In Advances in Data Analysis and Classification, 12 (2): 285-313, 2018.
  4. Bougeard, S.; Cariou, V.; Saporta, G. and Niang, N. Prediction for regularized clusterwise multiblock regression. In Applied Stochastic Models in Business and Industry, 34 (6): 852-867, 2018.
  5. Audigier, V.; White, I.; Jolani, S.; Debray, T.; Quartagno, M.; Carpenter, J.; van Buuren, S. and Resche-Rigon, M. Multiple Imputation for Multilevel Data with Continuous and Binary Variables. In Statistical Science, 33 (2): 160-183, 2018.

Conference papers

  1. Robert, T.; Thome, N. and Cord, M. HybridNet: Classification and Reconstruction Cooperation for Semi-supervised Learning. In ECCV 2018 - 15th European Conference on Computer Vision, pages 158-175, Springer, Munich, Germany, Lecture Notes in Computer Science 11211, 2018.

2017

Journal articles

  1. Liberati, C.; Camillo, F. and Saporta, G. Advances in credit scoring: combining performance and interpretation in kernel discriminant analysis. In Advances in Data Analysis and Classification, 11 (1): 121-138, 2017.
  2. Geronimi, J. and Saporta, G. Variable selection for multiply-imputed data with penalized generalized estimating equations. In Computational Statistics and Data Analysis, 110: 103-114, 2017.

Conference papers

  1. Ben-younes, H.; Cadene, R.; Cord, M. and Thome, N. MUTAN: Multimodal Tucker Fusion for Visual Question Answering. In 2017 IEEE International Conference on Computer Vision (ICCV), pages 2631-2639, IEEE, Venice, Italy, 2017.

Communications

Publications

2020

Conference papers

  1. Saporta, G. About Interpreting and Explaining Machine Learning and Statistical Models. In SMTDA 2020; 6th Stochastic Modeling Techniques and Data Analysis International Conference, Barcelona (virtual), Spain, 2020.

2018

Conference papers

  1. Durand, P.; Ghorbanzadeh, D. and Jaupi, L. Different approaches for the texture classification of a remote sensing image bank. In Ninth International Conference on Graphic and Image Processing, pages 1-9, SPIE, Qingdao, China, 2018.
  2. Durand, P.; Ghorbanzadeh, D. and Jaupi, L. Index Theorem and Applications, a Gentle Review. In CMCGS 2018. 7th Annual International Conference on Computational Mathematics, Computational Geometry, pages 1-6, Digital Library, Singapore, Singapore, Series Proc. Computational Mathematics Computational Geometry and Statistics (CMCGS), 2018.
  3. Milliet de Faverges, M.; Russolillo, G.; Picouleau, C.; Merabet, B. and Houzel, B. Modelling passenger train arrival delays with Generalized Linear Models and its perspective for scheduling at main stations. In 8th International Conference on Railway Engineering (ICRE 2018), IET, London, United Kingdom, 2018.

2017

Conference papers

  1. Saporta, G. Expliquer ou prédire? Les nouveaux défis [Explain or predict? The new challenges]. In Chimiometrie 2017, Paris, France, 2017.
  2. Ghorbanzadeh, D.; Durand, P. and Jaupi, L. Generating the Skew Normal random variable. In World Congress on Engineering 2017, pages 113-116, London, United Kingdom, 2017.

Software and patents

Ongoing projects

  • Full name: CIFRE VELVET - Funder: VELVET CONSULTING
  • Duration: December 2019 - November 2022

  • Full name: IMPACT MDS - Funder: MUTUELLE DES SPORTIFS
  • Duration: November 2019 - October 2021

  • Full name: SOCIETE EARLY METRICS 2: EARLY METRICS 2 - Funder: Société EARLY METRICS
  • Duration: May 2021 - February 2023

Past projects

    • Full name: Méthodes statistiques, data-mining et apprentissage
    • Duration: January 2020 - December 2020

    • Full name: NEZ ELECTRONIQUE
    • Duration: June 2017 - May 2018

    • Full name: CRM SERVICE 2017-2018
    • Duration: June 2017 - June 2018

    • Full name: PRESIDIO
    • Duration: January 2015 - July 2019

    • Full name: Contrat de prestation de recherche
    • Duration: June 2020 - June 2021

    • Full name: EARLY METRICS
    • Duration: May 2017 - September 2019

    • Full name: MEDIATECH Nafise GOUARD
    • Duration: May 2018 - April 2021
