TY - UNPB
T1 - Credit Scoring
T2 - Discussion of Methods and a Case Study
AU - Kronborg, Dorte
AU - Tjur, Tue
AU - Vincents, Bo
PY - 1999
Y1 - 1999
N2 - The scenario considered is that of a credit association, a bank or another financial institution which, on the basis of information about a new potential customer and historical data on many other customers, has to decide whether or not to give that customer a certain loan. We discuss three popular techniques: logistic regression, discriminant analysis and neural networks. We shall argue strongly in favour of the logistic regression. Discriminant analysis can be used, and for reasons that can be explained mathematically it will often result in approximately the same conclusions as a logistic regression. But the statistical assumptions are not appropriate in most cases, and the results given are not as directly interpretable as those of logistic regression. Neural network techniques, in their simplest form, suffer from the lack of statistical standard methods for verification of the model and tests for removal of covariates. This problem disappears to some extent when the neural networks are reformulated as proper statistical models, based on the type of functions that are considered in neural networks. But this results in a somewhat specialized class of non-linear regression models, which may be useful in situations where local peculiarities of the response function are in focus, but certainly not when the overall - usually monotone - effect of many more or less confounded covariates is the issue. We discuss, within the logistic regression framework, the handling of phenomena such as time trends and corruption of the historical data due to shifts of policy, censoring and/or interventions in high-risk customers' economy. Finally, we illustrate and support the theoretical considerations by a case study concerning mortgage loans in a Danish credit association.
AB - The scenario considered is that of a credit association, a bank or another financial institution which, on the basis of information about a new potential customer and historical data on many other customers, has to decide whether or not to give that customer a certain loan. We discuss three popular techniques: logistic regression, discriminant analysis and neural networks. We shall argue strongly in favour of the logistic regression. Discriminant analysis can be used, and for reasons that can be explained mathematically it will often result in approximately the same conclusions as a logistic regression. But the statistical assumptions are not appropriate in most cases, and the results given are not as directly interpretable as those of logistic regression. Neural network techniques, in their simplest form, suffer from the lack of statistical standard methods for verification of the model and tests for removal of covariates. This problem disappears to some extent when the neural networks are reformulated as proper statistical models, based on the type of functions that are considered in neural networks. But this results in a somewhat specialized class of non-linear regression models, which may be useful in situations where local peculiarities of the response function are in focus, but certainly not when the overall - usually monotone - effect of many more or less confounded covariates is the issue. We discuss, within the logistic regression framework, the handling of phenomena such as time trends and corruption of the historical data due to shifts of policy, censoring and/or interventions in high-risk customers' economy. Finally, we illustrate and support the theoretical considerations by a case study concerning mortgage loans in a Danish credit association.
KW - Credit scoring
KW - Discriminant analysis
KW - Logistic regression
KW - Neural networks
KW - Event history analysis
M3 - Working paper
T3 - Preprint
BT - Credit Scoring
PB - Center for Statistics
CY - Frederiksberg
ER -