[SAP06a] Statistical Methods and Credit Scoring

Invited conference: JOCLAD'06, Lisbon, January 2006

Author: G. Saporta

Keywords:

Abstract: Basel 2 regulations brought new interest in supervised classification methodologies for predicting the default probability of loans. Density estimation, neural networks, and nonlinear SVMs provide direct estimates of default probability but are not widely used because they lack interpretability. Logistic regression and linear discriminant analysis are the most frequently used techniques because they provide easy-to-use scorecards based on additive partial scores. We will compare these two major techniques, which are sometimes unduly opposed. Since posterior probabilities depend on priors, we will address the case of stratified sampling. An important feature of consumer credit is that predictors are generally categorical. Vapnik's statistical learning theory explains why a prior dimension reduction (e.g. by means of multiple correspondence analysis) improves the robustness of the score function. Default probabilities may be computed directly or by means of a score function. Since a probability is also a score, almost all classification methods (including classification trees) can be compared using ROC analysis, which is more informative than the simple misclassification rate. Survival analysis brings new perspectives, especially for long-term loans, by predicting "when" rather than "if" a default occurs.
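
The remark that posterior probabilities depend on priors can be illustrated with a minimal sketch. It is not taken from the talk. It assumes that the sample was stratified on the default indicator only, so that the class-conditional distributions are the same in the sample and in the portfolio. Under that assumption the posterior odds are rescaled by the ratio of prior odds, which for logistic regression amounts to shifting the intercept. The function names and the numbers below are hypothetical.

```python
import math


def correct_posterior(p_sample, prior_sample, prior_population):
    """Rescale a default probability estimated on a stratified sample
    to the default rate of the real portfolio (assumed known)."""
    # Posterior odds of default as estimated on the stratified sample
    odds_sample = p_sample / (1.0 - p_sample)
    # Ratio of population prior odds to sample prior odds
    prior_ratio = (prior_population / (1.0 - prior_population)) / (
        prior_sample / (1.0 - prior_sample)
    )
    odds_population = odds_sample * prior_ratio
    return odds_population / (1.0 + odds_population)


def intercept_shift(prior_sample, prior_population):
    """Equivalent additive correction on the log-odds scale,
    i.e. the amount to add to a logistic-regression intercept."""
    def logit(q):
        return math.log(q / (1.0 - q))

    return logit(prior_population) - logit(prior_sample)


# Hypothetical figures: a 50/50 stratified sample, 5% defaults in the portfolio
p_corrected = correct_posterior(0.8, prior_sample=0.5, prior_population=0.05)
shift = intercept_shift(prior_sample=0.5, prior_population=0.05)
```

Only the intercept changes under this assumption. The partial scores attached to each predictor category, and therefore the ranking of applicants and the ROC curve, are unaffected by the correction.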

BibTeX

@inproceedings {
	SAP06a,
	title	=	"{Statistical Methods and Credit Scoring}",
	author	=	"G. Saporta",
	booktitle	=	"{JOCLAD'06, Lisbonne}",
	year	=	2006,
	month	=	"January",
	note	=	"{XIII èmes congrès de la Société Portugaise de Classification et d'Analyse des Données}",
}