Similar Articles (20 results found)
1.
The logistic regression framework has long been the most widely used statistical method for assessing customer credit risk. Recently, a more pragmatic approach has been adopted, where the primary goal is credit risk prediction rather than explanation. In this context, several classification techniques, such as support vector machines, have been shown to perform well on credit scoring. While the investigation of better classifiers is an important research topic, the specific methodology chosen in real-world applications has to deal with the challenges arising from data collected in industry. Such data are often highly unbalanced, part of the information can be missing, and common hypotheses, such as the i.i.d. assumption, can be violated. In this paper we present a case study based on a sample of IBM Italian customers, which presents all of the challenges mentioned above. The main objective is to build and validate robust models, able to handle missing information, class imbalance, and non-i.i.d. data points. We define a missing-data imputation method and propose the use of an ensemble classification technique, subagging, particularly suitable for highly unbalanced data such as credit scoring data. Both the imputation and subagging steps are embedded in a customized cross-validation loop, which handles dependencies between different credit requests. The methodology has been applied using several classifiers (kernel support vector machines, nearest neighbours, decision trees, AdaBoost) and their subagged versions. The use of subagging improves the performance of the base classifier, and we show that subagged decision trees achieve better performance while keeping the model simple and reasonably interpretable.
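A minimal sketch of the subagging idea described above (aggregating base learners trained on subsamples drawn without replacement, here in the balanced variant often used for unbalanced data); the subsample size, tree depth, and data layout are illustrative assumptions, not the paper's exact procedure.

```python
# Illustrative subagging (subsample aggregating) for unbalanced credit data.
# Assumes X, y are numpy arrays with y == 1 marking the rare "bad" class,
# and that the majority class is at least as large as the minority class.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def subagging_scores(X, y, X_new, n_models=50, seed=0):
    rng = np.random.default_rng(seed)
    bad_idx = np.flatnonzero(y == 1)
    good_idx = np.flatnonzero(y == 0)
    probs = np.zeros(len(X_new))
    for _ in range(n_models):
        # Each base learner sees all "bad" cases plus an equally sized
        # random subsample (without replacement) of the majority class.
        sampled_good = rng.choice(good_idx, size=len(bad_idx), replace=False)
        idx = np.concatenate([bad_idx, sampled_good])
        tree = DecisionTreeClassifier(max_depth=4, random_state=0)
        tree.fit(X[idx], y[idx])
        probs += tree.predict_proba(X_new)[:, 1]
    return probs / n_models  # averaged vote = subagged score
```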

2.
One of the aims of credit scoring models is to predict the probability of repayment of any applicant, and yet such models are usually parameterised using a sample of accepted applicants only. This may lead to biased estimates of the parameters. In this paper we examine two issues. First, we compare the classification accuracy of a model based only on accepted applicants, relative to one based on a sample of all applicants. We find only a minimal difference, given the cutoff scores for the old model used by the data supplier. Using a simulated model, we examine the predictive performance of models estimated from bands of applicants ranked by predicted creditworthiness. We find that the lower the risk band of the training sample, the less accurate the predictions for all applicants. We also find that the lower the risk band of the training sample, the greater the overestimate of the true performance of the model when tested on a sample of applicants within the same risk band — as a financial institution would do. The overestimation may be very large. Second, we examine the predictive accuracy of a bivariate probit model with selection (BVP). This parameterises the accept–reject model, allowing (unknown) omitted variables to be correlated with those of the original good–bad model. The BVP model may improve accuracy if the loan officer has overridden a scoring rule. We find that a small improvement is sometimes possible when using the BVP model.
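A compact sketch of the BVP likelihood: an accept equation z = 1{w'g + u > 0} and a good/bad equation y = 1{x'b + e > 0} observed only for accepts, with corr(u, e) = rho, maximised numerically with SciPy. The data layout, starting values, and optimiser are illustrative assumptions.

```python
# Bivariate probit with selection, minimal MLE sketch (slow per-row loop).
import numpy as np
from scipy.optimize import minimize
from scipy.stats import multivariate_normal, norm

def neg_loglik(theta, W, X, z, y):
    kw, kx = W.shape[1], X.shape[1]
    g, b = theta[:kw], theta[kw:kw + kx]
    rho = np.tanh(theta[-1])               # keep rho inside (-1, 1)
    a, c = W @ g, X @ b
    ll = 0.0
    for ai, ci, zi, yi in zip(a, c, z, y):
        if zi == 0:                        # rejected: only selection observed
            ll += norm.logcdf(-ai)
        else:                              # accepted: joint accept/outcome prob
            s = 1.0 if yi == 1 else -1.0   # sign flip for the bad outcome
            cov = [[1.0, s * rho], [s * rho, 1.0]]
            p = multivariate_normal.cdf([ai, s * ci], mean=[0.0, 0.0], cov=cov)
            ll += np.log(max(p, 1e-300))
    return -ll

# Usage sketch:
# theta0 = np.zeros(W.shape[1] + X.shape[1] + 1)
# res = minimize(neg_loglik, theta0, args=(W, X, z, y), method="Nelder-Mead")
```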

3.
The features used may have an important effect on the performance of credit scoring models. The process of choosing the best set of features for credit scoring models is usually unsystematic, dominated by somewhat arbitrary trial and error. This paper presents an empirical study of four machine learning feature selection methods, which provide an automatic data mining technique for reducing the feature space. The study illustrates how the four methods—‘ReliefF’, ‘Correlation-based’, ‘Consistency-based’ and ‘Wrapper’ algorithms—help to improve three aspects of the performance of scoring models: model simplicity, model speed and model accuracy. The experiments are conducted on real data sets using four classification algorithms: ‘model tree (M5)’, ‘neural network (multi-layer perceptron with back-propagation)’, ‘logistic regression’, and ‘k-nearest-neighbours’.
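Of the four methods, the wrapper approach is easy to sketch with scikit-learn's SequentialFeatureSelector, which searches feature subsets by cross-validated performance of the actual scoring model; the dataset, subset size, and scorer below are placeholders.

```python
# Wrapper-style feature selection: evaluate candidate feature subsets by
# the cross-validated performance of the scoring model itself.
from sklearn.datasets import make_classification
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=500, n_features=20, random_state=0)
selector = SequentialFeatureSelector(
    LogisticRegression(max_iter=1000),
    n_features_to_select=5,      # target size of the reduced feature space
    direction="forward",
    scoring="roc_auc",
    cv=5,
)
selector.fit(X, y)
print(selector.get_support(indices=True))  # indices of the retained features
```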

4.
Mixture cure models were originally proposed in medical statistics to model long-term survival of cancer patients in terms of two distinct subpopulations: those that are cured of the event of interest and will never relapse, and those that are uncured and remain susceptible to the event. In the present paper, we introduce mixture cure models to the area of credit scoring, where, similarly to the medical setting, a large proportion of the dataset may not experience the event of interest (default) during the loan term. We estimate a mixture cure model predicting (time to) default on a UK personal loan portfolio, and compare its performance to the Cox proportional hazards method and standard logistic regression. Results for credit scoring at an account level and prediction of the number of defaults at a portfolio level are presented; model performance is evaluated through cross-validation on discrimination and calibration measures. Discrimination performance for all three approaches was found to be high and competitive. Calibration performance for the survival approaches was found to be superior to logistic regression for intermediate time intervals and useful for fixed 12-month time horizon estimates, reinforcing the flexibility of survival analysis both as a risk-ranking tool and for providing robust estimates of probability of default over time. Furthermore, the mixture cure model’s ability to distinguish between two subpopulations can offer additional insights, by estimating the parameters that determine susceptibility to default in addition to the parameters that influence the time to default of a borrower.
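A minimal parametric sketch of the mixture cure likelihood, assuming a logistic "cured" incidence part and an exponential latency for the uncured; the paper's actual specification may differ, and the data layout is assumed.

```python
# Mixture cure model: S_pop(t|x) = p_cure(x) + (1 - p_cure(x)) * exp(-lam * t),
# fitted by maximum likelihood with right censoring.
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit

def neg_loglik(theta, X, t, event):        # event = 1 if default observed
    k = X.shape[1]
    p_cure = expit(X @ theta[:k])          # logistic incidence model
    lam = np.exp(theta[k])                 # exponential latency rate > 0
    surv = np.exp(-lam * t)
    f = (1 - p_cure) * lam * surv          # density for observed defaults
    S = p_cure + (1 - p_cure) * surv       # population survival if censored
    ll = np.where(event == 1, np.log(f + 1e-300), np.log(S + 1e-300))
    return -ll.sum()

# Usage sketch:
# theta0 = np.zeros(X.shape[1] + 1)
# res = minimize(neg_loglik, theta0, args=(X, t, event), method="BFGS")
```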

5.
In recent years, support vector machines (SVMs) have been successfully applied to a wide range of applications. However, since the classifier is described by a complex mathematical function, it is rather incomprehensible for humans. This opacity prevents SVMs from being used in many real-life applications where both accuracy and comprehensibility are required, such as medical diagnosis and credit risk evaluation. To overcome this limitation, rules can be extracted from the trained SVM that are interpretable by humans and preserve as much of the accuracy of the SVM as possible. In this paper, we provide an overview of recently proposed rule extraction techniques for SVMs and introduce two others taken from the artificial neural networks domain, namely Trepan and G-REX. The described techniques are compared using publicly available datasets, such as Ripley’s synthetic dataset and the multi-class iris dataset. We also look at medical diagnosis and credit scoring, where comprehensibility is a key requirement and even a regulatory recommendation. Our experiments show that the SVM rule extraction techniques lose only a small percentage in performance compared to SVMs and therefore rank at the top of comprehensible classification techniques.
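The pedagogical rule-extraction idea behind techniques like Trepan and G-REX (treat the trained model as a labelling oracle) can be illustrated by relabelling data with the SVM's predictions and fitting a shallow, readable decision tree to mimic it; this is a generic sketch, not the papers' exact algorithms.

```python
# Pedagogical rule extraction: fit an interpretable tree to the
# *predictions* of a trained SVM (the SVM acts as the oracle).
from sklearn.datasets import make_classification
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = make_classification(n_samples=1000, n_features=5, random_state=0)
svm = SVC(kernel="rbf").fit(X, y)

y_oracle = svm.predict(X)                 # SVM labels replace the true labels
tree = DecisionTreeClassifier(max_depth=3).fit(X, y_oracle)

print(export_text(tree))                  # human-readable if/then rules
fidelity = (tree.predict(X) == y_oracle).mean()
print(f"fidelity to the SVM: {fidelity:.3f}")
```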

6.
The benefit to consumers from the use of informative credit reports is demonstrated by showing the improvement in credit decisions when generic scoring models based on credit reports are implemented. If these models are highly predictive, then the truncation of credit reports will reduce the predictive power of bureau-based generic scoring systems. As a result, more good credit risks will be denied credit, and more poor credit risks will be granted credit. It is shown that, even when applied to credit applications that had already been screened and approved, the use of generic scoring models significantly improves credit grantors' ability to predict and eliminate bankruptcies, charge-offs, and delinquencies. As applied to existing accounts, bureau-based generic scores are shown to have predictive value for at least 3 months, while scores 12 months old may not be very powerful. Even though bureau-based scores shift towards the high-risk end of the distribution during a recession, they continue to rank risk very well. When coupled with application-based credit-scoring models, scores based on credit-bureau data further improve the predictive power of the model, the improvements being greater with more complete bureau information. We conclude that government-imposed limits on credit information are anti-consumer by fostering more errors in credit decisions.

7.
This paper presents a new approach to consumer credit scoring, by tailoring a profit-based classification performance measure to credit risk modeling. This performance measure takes into account the expected profits and losses of credit granting and thereby better aligns the model developers’ objectives with those of the lending company. It is based on the Expected Maximum Profit (EMP) measure and is used to find a trade-off between the expected losses – driven by the exposure of the loan and the loss given default – and the operational income given by the loan. Additionally, a major advantage of the proposed measure is that it allows the optimal cutoff value, which is necessary for model implementation, to be calculated. To test the proposed approach, we use a dataset of loans granted by a government institution, and benchmark the accuracy and monetary gain of using EMP, accuracy, and the area under the ROC curve as measures for selecting model parameters and for determining the respective cutoff values. The results show that our proposed profit-based classification measure outperforms the alternative approaches in terms of both accuracy and monetary value on the test set, and that it facilitates model deployment.
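A much-simplified sketch of choosing a cutoff by expected profit rather than accuracy; the per-loan economics (a fixed ROI on each good loan, loss given default times exposure on each bad one) are illustrative assumptions, not the EMP measure itself.

```python
# Choose the score cutoff that maximises expected profit on a
# validation set, instead of maximising accuracy or AUC.
import numpy as np

def best_cutoff(scores, is_bad, roi=0.12, lgd_exposure=1.0):
    # Accept applicants whose predicted default score is below the cutoff:
    # each accepted good earns `roi`, each accepted bad loses `lgd_exposure`.
    cutoffs = np.unique(scores)
    profits = [
        np.sum(np.where(is_bad[scores < c], -lgd_exposure, roi))
        for c in cutoffs
    ]
    i = int(np.argmax(profits))
    return cutoffs[i], profits[i]

# Usage sketch: cutoff, profit = best_cutoff(model_scores, y_bad.astype(bool))
```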

8.
The purpose of the present paper is to explore the ability of neural networks such as multilayer perceptrons and modular neural networks, and traditional techniques such as linear discriminant analysis and logistic regression, in building credit scoring models in the credit union environment. Also, since funding and small sample size often preclude the use of customized credit scoring models at small credit unions, we investigate the performance of generic models and compare them with customized models. Our results indicate that customized neural networks offer a very promising avenue if the measure of performance is percentage of bad loans correctly classified. However, if the measure of performance is percentage of good and bad loans correctly classified, logistic regression models are comparable to the neural networks approach. The performance of generic models was not as good as the customized models, particularly when it came to correctly classifying bad loans. Although we found significant differences in the results for the three credit unions, our modular neural network could not accommodate these differences, indicating that more innovative architectures might be necessary for building effective generic models.

9.
We introduce a new approach to assigning bank account holders to ‘good’ or ‘bad’ classes based on their future behaviour. Traditional methods simply treat the classes as qualitatively distinct, and seek to predict them directly, using statistical techniques such as logistic regression or discriminant analysis based on application data or observations of previous behaviour. We note, however, that the ‘good’ and ‘bad’ classes are defined in terms of variables such as the amount overdrawn at the time at which the classification is required. This permits an alternative, ‘indirect’, form of classification model in which, first, the variables defining the classes are predicted, for example using regression, and then the class membership is derived deterministically from these predicted values. We compare traditional direct methods with these new indirect methods using both real bank data and simulated data. The new methods appear to perform very similarly to the traditional methods, and we discuss why this might be. Finally, we note that the indirect methods also have certain other advantages over the traditional direct methods.
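A minimal sketch of the indirect approach just described: first predict the class-defining variable (here an assumed "amount overdrawn") by regression, then derive the class deterministically from the prediction; the variable names and the threshold are illustrative.

```python
# Indirect classification: regress the variable that *defines* the
# good/bad split, then apply the defining rule to the prediction.
import numpy as np
from sklearn.linear_model import LinearRegression

THRESHOLD = 500.0  # assumed rule: overdrawn above 500 at review time => 'bad'

def indirect_classify(X_train, overdrawn_train, X_new):
    reg = LinearRegression().fit(X_train, overdrawn_train)
    predicted_overdrawn = reg.predict(X_new)
    return (predicted_overdrawn > THRESHOLD).astype(int)  # 1 = 'bad'
```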

10.
The objectives of this paper are to apply the theory and numerical algorithms of Bayesian networks to risk scoring, and to compare the results with traditional methods for computing scores and posterior predictions of performance variables. Model identification, inference, and prediction of random variables using Bayesian networks have been successfully applied in a number of areas, including medical diagnosis, equipment failure, information retrieval, rare-event prediction, and pattern recognition. The ability to graphically represent conditional dependencies and independencies among random variables may also be useful in credit scoring. Although several papers have already appeared in the literature which use graphical models for model identification, as far as we know there have been no explicit experimental results that compare a traditionally computed risk score with predictions based on Bayesian learning algorithms. In this paper, we examine a database of credit-card applicants and attempt to ‘learn’ the graphical structure of the characteristics or variables that make up the database. We identify representative Bayesian networks in a development sample as well as the associated Markov blankets and clique structures within the Markov blanket. Once we obtain the structure of the underlying conditional independencies, we are able to estimate the probabilities of each node conditional on its direct predecessor node(s). We then calculate the posterior probabilities and scores of a performance variable for the development sample, and calculate the receiver operating characteristic (ROC) curves and relative profitability of scorecards based on these identifications. The results of the different models and methods are compared with both development and validation samples. Finally, we report on a statistical entropy calculation that measures the degree to which cliques identified in the Bayesian network are independent of one another.
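The node-conditional-probability step can be sketched with plain pandas: estimate a node's probability table given its parent(s) from a development sample, then score the performance variable by its posterior given the parents. The two-parent structure in the usage comment is an assumed toy network, not one learned from the authors' data.

```python
# Estimate P(performance | parents) from a development sample and use it
# as a score -- the prediction step once a network structure is fixed.
import pandas as pd

def fit_cpt(df, node, parents):
    # Conditional probability table: P(node = 1 | parent configuration).
    return df.groupby(parents)[node].mean()

def score(cpt, df_new, parents):
    # Look up each applicant's parent configuration in the CPT;
    # configurations unseen in development come back as NaN.
    keys = list(df_new[parents].itertuples(index=False, name=None))
    return cpt.reindex(keys).to_numpy()

# Usage sketch, assuming binary columns in a DataFrame `dev`:
# cpt = fit_cpt(dev, "bad", ["delinquent_before", "high_utilisation"])
# scores = score(cpt, new_apps, ["delinquent_before", "high_utilisation"])
```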

11.
In credit scoring, low-default portfolios (LDPs) are those for which very little default history exists. This makes it problematic for financial institutions to estimate a reliable probability of a customer defaulting on a loan. Banking regulation (Basel II Capital Accord) and best practice, however, necessitate an accurate and valid estimate of the probability of default. In this article the suitability of semi-supervised one-class classification (OCC) algorithms as a solution to the LDP problem is evaluated. The performance of OCC algorithms is compared with the performance of supervised two-class classification algorithms. This study also investigates the suitability of oversampling, which is a common approach to dealing with LDPs. An assessment of the performance of one- and two-class classification algorithms using nine real-world banking data sets, modified to replicate LDPs, is provided. Our results demonstrate that only in the near or complete absence of defaulters should semi-supervised OCC algorithms be used instead of supervised two-class classification algorithms. Furthermore, we demonstrate for data sets whose class labels are unevenly distributed that optimising the threshold value on classifier output yields, in many cases, an improvement in classification performance. Finally, our results suggest that oversampling produces no overall improvement to the best-performing two-class classification algorithms.
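The one-class setup can be sketched with scikit-learn's OneClassSVM, trained only on non-defaulters so that default-like cases surface as outliers; the kernel settings are illustrative, and the thresholding step the study optimises is reduced here to a simple ranking.

```python
# One-class classification for a low-default portfolio: train only on
# the non-default class, rank new cases by how atypical they look.
import numpy as np
from sklearn.svm import OneClassSVM

def fit_occ(X_non_defaulters):
    return OneClassSVM(kernel="rbf", nu=0.05, gamma="scale").fit(X_non_defaulters)

def rank_by_outlierness(model, X_new):
    # Lower decision_function = further from the non-default class.
    raw = model.decision_function(X_new)
    return np.argsort(raw)   # most default-like applicants first
```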

12.
We present a methodology for improving credit scoring models by distinguishing two forms of rational behaviour of loan defaulters. It is common knowledge among practitioners that there are two types of defaulters, those who do not pay because of cash flow problems (‘Can’t Pay’), and those that do not pay because of lack of willingness to pay (‘Won’t Pay’). This work proposes to differentiate them using a game theory model that describes their behaviour. This separation of behaviours is represented by a set of constraints that form part of a semi-supervised constrained clustering algorithm, constructing a new target variable summarizing relevant future information. Within this approach the results of several supervised models are benchmarked, in which the models deliver the probability of belonging to one of these three new classes (good payers, ‘Can’t Pays’, and ‘Won’t Pays’). The process improves classification accuracy significantly, and delivers strong insights regarding the behaviour of defaulters.
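A much-simplified sketch of this pipeline: split defaulters into two behavioural groups by clustering, build a three-class target, then fit a supervised model on it. Plain k-means stands in for the paper's game-theoretic constrained clustering, and all variable names are assumptions.

```python
# Simplified pipeline: cluster defaulters into two behaviour groups,
# build a 3-class target, then fit a supervised model on it.
# NOTE: plain KMeans stands in for the paper's constrained clustering.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import RandomForestClassifier

def build_three_class_target(y_default, X_behaviour):
    target = np.zeros(len(y_default), dtype=int)   # 0 = good payer
    d = np.flatnonzero(y_default == 1)
    groups = KMeans(n_clusters=2, n_init=10, random_state=0)
    target[d] = groups.fit_predict(X_behaviour[d]) + 1  # 1, 2 = defaulter types
    return target

# Usage sketch:
# target = build_three_class_target(y_default, X_behaviour)
# clf = RandomForestClassifier(random_state=0).fit(X_application, target)
# probs = clf.predict_proba(X_application)  # P(good), P(type 1), P(type 2)
```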

13.
Credit scoring is a method of modelling the potential risk of credit applications. Traditionally, logistic regression and discriminant analysis are the most widely used approaches for creating scoring models in the industry. However, these methods are associated with quite a few limitations, such as being unstable with high-dimensional data and small sample sizes, requiring intensive variable selection effort, and being incapable of efficiently handling non-linear features. Most importantly, with these algorithms it is difficult to automate the modelling process, and when population changes occur the static models usually fail to adapt and may need to be rebuilt from scratch. In the last few years, the kernel learning approach has been investigated to solve these problems. However, the existing applications of this type of method (in particular the SVM) in credit scoring have all focused on the batch model and did not address the important problem of how to update the scoring model on-line. This paper presents a novel and practical adaptive scoring system based on an incremental kernel method. With this approach, the scoring model is adjusted according to an on-line update procedure that can always converge to the optimal solution without information loss or running into numerical difficulties. Non-linear features in the data are automatically included in the model through a kernel transformation. This approach does not require any variable reduction effort and is also robust for scoring data with a large number of attributes and highly unbalanced class distributions. Moreover, a new potential kernel function is introduced to further improve the predictive performance of the scoring model, and a kernel attribute ranking technique is used that adds transparency to the final model. Experimental studies using real-world data sets have demonstrated the effectiveness of the proposed method.
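The on-line kernel idea can be approximated in recent scikit-learn by combining a random-feature kernel map with a linear model updated via partial_fit; this standard substitute (RBFSampler plus SGD) is not the paper's incremental kernel algorithm, and the hyper-parameters are placeholders.

```python
# On-line scoring with a kernel transformation: map inputs through an
# approximate RBF feature map, then update a linear model incrementally.
from sklearn.kernel_approximation import RBFSampler
from sklearn.linear_model import SGDClassifier

rbf = RBFSampler(gamma=0.5, n_components=200, random_state=0)
clf = SGDClassifier(loss="log_loss", random_state=0)

def init(X0, y0, classes=(0, 1)):
    Z0 = rbf.fit_transform(X0)                 # fix the random feature map
    clf.partial_fit(Z0, y0, classes=list(classes))

def update(X_batch, y_batch):
    # New credit requests arrive: transform and update without refitting.
    clf.partial_fit(rbf.transform(X_batch), y_batch)
```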

14.
In this paper, we study the performance of various state-of-the-art classification algorithms applied to eight real-life credit scoring data sets. Some of the data sets originate from major Benelux and UK financial institutions. Different types of classifiers are evaluated and compared. Besides the well-known classification algorithms (e.g. logistic regression, discriminant analysis, k-nearest neighbour, neural networks and decision trees), this study also investigates the suitability and performance of some recently proposed, advanced kernel-based classification algorithms such as support vector machines and least-squares support vector machines (LS-SVMs). The performance is assessed using the classification accuracy and the area under the receiver operating characteristic curve. Statistically significant performance differences are identified using the appropriate test statistics. It is found that both the LS-SVM and neural network classifiers yield a very good performance, but simple classifiers such as logistic regression and linear discriminant analysis also perform very well for credit scoring.
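A compact sketch of this benchmarking protocol: cross-validated AUC for several of the classifier families named above. The dataset and hyper-parameters are placeholders, and since LS-SVMs are not in scikit-learn, a standard SVC stands in.

```python
# Benchmark several classifiers on a credit data set by cross-validated AUC.
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1000, n_features=15, random_state=0)
models = {
    "logit": LogisticRegression(max_iter=1000),
    "lda": LinearDiscriminantAnalysis(),
    "knn": KNeighborsClassifier(),
    "tree": DecisionTreeClassifier(max_depth=5),
    "svm": SVC(),   # decision_function is enough for the AUC scorer
}
for name, model in models.items():
    auc = cross_val_score(model, X, y, scoring="roc_auc", cv=10)
    print(f"{name:6s} AUC = {auc.mean():.3f} +/- {auc.std():.3f}")
```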

15.
Consumer credit scoring is one of the most successful applications of quantitative analysis in business, with nearly every major lender using charge-off models to make decisions. Yet banks do not extend credit to control charge-off, but to secure profit. So, while charge-off models work well in rank-ordering the loan default costs associated with lending and are ubiquitous throughout the industry, the equivalent models on the revenue side are not being used despite the need. This paper outlines a profit-based scoring system for credit cards to be used for acquisition decisions by addressing three issues. First, the paper explains why credit card profit models—as opposed to cost or charge-off models—have been difficult to build and implement. Second, a methodology for modelling revenue on credit cards at application is proposed. Finally, acquisition strategies are explored that use both a spend model and a charge-off model to balance the tradeoffs between charge-off, revenue, and volume.
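The acquisition logic described above can be sketched as combining a revenue (spend) model and a charge-off model into one expected-profit decision per applicant; the margin and average loss figures are illustrative assumptions.

```python
# Acquisition decision from two models: expected revenue from a spend
# model versus expected loss from a charge-off model.
import numpy as np

def accept_decision(p_chargeoff, expected_spend,
                    margin=0.02, loss_per_chargeoff=2000.0):
    # Assumed economics: revenue = margin * spend on surviving accounts,
    # loss = a fixed average charge-off amount on bad accounts.
    expected_profit = ((1 - p_chargeoff) * margin * expected_spend
                       - p_chargeoff * loss_per_chargeoff)
    return expected_profit > 0, expected_profit

# Usage sketch: accept, ev = accept_decision(chargeoff_model(x), spend_model(x))
```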

16.
We propose a structural credit risk model for consumer lending using option theory and the concept of the value of the consumer’s reputation. Using Brazilian empirical data and a credit bureau score as a proxy for creditworthiness, we compare a number of alternative models before suggesting one that leads to a simple analytical solution for the probability of default. We apply the proposed model to portfolios of consumer loans, introducing a factor to account for the mean influence of systemic economic factors on individuals. This results in a hybrid structural reduced-form model. Comparisons are made with the Basel II approach. Our conclusions partially support that approach for modelling the credit risk of portfolios of retail credit.
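For flavour, a sketch of the generic structural (Merton-type) default probability that this family of models builds on, PD = Phi(-d2). In the paper the consumer's reputation value plays the role of firm assets, so the inputs below are placeholders rather than the authors' specification.

```python
# Generic structural model: default occurs when the value process ends
# below the debt barrier; PD over horizon T is Phi(-d2).
from math import log, sqrt
from scipy.stats import norm

def structural_pd(value, barrier, mu, sigma, T):
    # value: current value (firm assets, or a consumer's reputation proxy)
    # barrier: default threshold; mu, sigma: drift and volatility; T: years
    d2 = (log(value / barrier) + (mu - 0.5 * sigma**2) * T) / (sigma * sqrt(T))
    return norm.cdf(-d2)

print(structural_pd(value=1.3, barrier=1.0, mu=0.03, sigma=0.25, T=1.0))
```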

17.
The class imbalance problem is common in the credit scoring domain, as the number of defaulters is usually much less than the number of non-defaulters. To date, research on investigating the class imbalance problem has mainly focused on indicating and reducing the adverse effect of the class imbalance on the predictive accuracy of machine learning techniques, while the impact of that on machine learning interpretability has never been studied in the literature. This paper fills this gap by analysing how the stability of Local Interpretable Model-agnostic Explanations (LIME) and SHapley Additive exPlanations (SHAP), two popular interpretation methods, are affected by class imbalance. Our experiments use 2016–2020 UK residential mortgage data collected from European Datawarehouse. We evaluate the stability of LIME and SHAP on datasets of progressively increased class imbalance. The results show that interpretations generated from LIME and SHAP are less stable as the class imbalance increases, which indicates that the class imbalance does have an adverse effect on machine learning interpretability. To check the robustness of our outcomes, we also analyse two open-source credit scoring datasets and we obtain similar results.
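One way to sketch a stability measurement for LIME (whose neighbourhood sampling is stochastic, so repeated explanations of the same case can differ): re-explain an instance several times and compare the top-feature sets with a pairwise Jaccard index. This assumes the `lime` package and is not necessarily the paper's exact metric; SHAP stability can be measured analogously across resamples.

```python
# Stability of LIME explanations: explain the same applicant repeatedly
# and measure agreement of the top-k feature descriptions (Jaccard index).
import numpy as np
from itertools import combinations
from lime.lime_tabular import LimeTabularExplainer

def lime_stability(clf, X_train, x, feature_names, k=5, n_repeats=10):
    top_sets = []
    for seed in range(n_repeats):
        explainer = LimeTabularExplainer(
            X_train, feature_names=feature_names,
            mode="classification", random_state=seed)
        exp = explainer.explain_instance(x, clf.predict_proba, num_features=k)
        top_sets.append({name for name, _ in exp.as_list()})
    pairs = combinations(top_sets, 2)
    return float(np.mean([len(a & b) / len(a | b) for a, b in pairs]))
    # 1.0 = perfectly stable explanations across repetitions
```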

18.
This paper discusses models for evaluating credit risk in relation to the retailing industry. Hunt’s [Hunt, S.D., 2000. A General Theory of Competition. Sage Publications Inc., California] Resource–Advantage Theory of Competition is used as a basis for variable selection, given the theory’s relevance to retail competition. The study focuses on the US retail market. Four standard credit scoring methodologies (Naïve Bayes, Logistic Regression, Recursive Partitioning and Artificial Neural Network) are compared with Sequential Minimal Optimization (SMO), using a sample of 195 healthy companies and 51 distressed firms over five time periods from 1994 to 2002.

19.
Reject inference is a method for inferring how a rejected credit applicant would have behaved had credit been granted. Credit-quality data on rejected applicants are usually missing not at random (MNAR). In order to infer credit-quality data MNAR, we propose a flexible method to generate the probability of missingness within a model-based bound and collapse Bayesian technique. We tested the method's performance relative to traditional reject-inference methods using real data. Results show that our method improves the classification power of credit scoring models under MNAR conditions.
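For context, a sketch of one of the simple traditional reject-inference baselines (fuzzy augmentation): score the rejects with an accepts-only model, add each reject twice with probability weights, and refit. This stand-in ignores the MNAR issue that the paper's bound-and-collapse Bayesian method is designed to address.

```python
# Fuzzy augmentation: each reject enters the refit twice (as good and as
# bad), weighted by the accepts-only model's predicted probabilities.
import numpy as np
from sklearn.linear_model import LogisticRegression

def fuzzy_augment(X_acc, y_acc, X_rej):
    base = LogisticRegression(max_iter=1000).fit(X_acc, y_acc)
    p_bad = base.predict_proba(X_rej)[:, 1]
    X_aug = np.vstack([X_acc, X_rej, X_rej])
    y_aug = np.concatenate([y_acc, np.ones(len(X_rej)), np.zeros(len(X_rej))])
    w_aug = np.concatenate([np.ones(len(X_acc)), p_bad, 1 - p_bad])
    return LogisticRegression(max_iter=1000).fit(
        X_aug, y_aug, sample_weight=w_aug)
```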

20.
Fierce competition, as well as the recent financial crisis in the financial and banking industries, has made credit scoring gain importance. An accurate estimation of credit risk helps organizations decide whether or not to grant credit to potential customers. Many classification methods have been suggested in the literature to handle this problem. This paper proposes a model for evaluating credit risk based on binary quantile regression, using Bayesian estimation. The paper points out the distinct advantages of this approach: (i) the method provides accurate predictions of which customers may default in the future, (ii) it provides detailed insight into the effects of the explanatory variables on the probability of default, and (iii) it is ideally suited to building a segmentation scheme of the customers in terms of risk of default and the corresponding uncertainty about the prediction. An often-studied dataset from a German bank is used to show the applicability of the proposed method. The results demonstrate that the methodology can be an important tool for credit companies that want to take the credit risk of their customers fully into account.
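A frequentist sketch of the binary quantile idea, via a smoothed maximum-score objective at quantile tau, since it fits in a few lines; the paper's Bayesian estimation would place a prior on beta instead, and the bandwidth and optimiser here are arbitrary assumptions.

```python
# Smoothed maximum-score estimation for a binary quantile model at
# quantile tau: maximise sum_i (y_i - (1 - tau)) * Phi(x_i'b / h).
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

def fit_binary_quantile(X, y, tau=0.5, h=0.1):
    def neg_score(b):
        b = b / np.linalg.norm(b)      # the scale of b is not identified
        return -np.sum((y - (1 - tau)) * norm.cdf(X @ b / h))
    b0 = np.ones(X.shape[1])
    res = minimize(neg_score, b0, method="Nelder-Mead")
    return res.x / np.linalg.norm(res.x)
```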
