Similar literature
20 similar documents found.
1.
Funding small and medium-sized enterprises (SMEs) to support technological innovation is critical for national competitiveness. Technology credit scoring models are required to select appropriate funding beneficiaries. Typically, a technology credit scoring model consists of several attributes, and a new model must be derived every time these attributes are updated. However, a new model cannot be developed until sufficient historical evaluation data based on the new attributes have accumulated. To resolve this limitation, we suggest a framework for updating the technology credit scoring model. The framework constructs a new technology credit scoring model by comparing alternative scenarios for the relationships between existing and new attributes, based on exploratory factor analysis, analysis of variance, and logistic regression. Our approach can help find the optimal scenario for updating a scoring model.

2.
Logistic regression has long been the most widely used statistical method for assessing customer credit risk. Recently, a more pragmatic approach has been adopted, in which the primary concern is credit risk prediction rather than explanation. In this context, several classification techniques, support vector machines among them, have been shown to perform well on credit scoring. While the investigation of better classifiers is an important research topic, the methodology chosen in real-world applications has to deal with the challenges posed by data collected in industry. Such data are often highly unbalanced, part of the information can be missing, and some common hypotheses, such as the i.i.d. assumption, can be violated. In this paper we present a case study based on a sample of IBM Italian customers that presents all of these challenges. The main objective is to build and validate robust models that can handle missing information, class imbalance and non-i.i.d. data points. We define a missing data imputation method and propose the use of an ensemble classification technique, subagging, which is particularly suitable for highly unbalanced data such as credit scoring data. Both the imputation and subagging steps are embedded in a customized cross-validation loop, which handles dependencies between different credit requests. The methodology has been applied using several classifiers (kernel support vector machines, nearest neighbors, decision trees, AdaBoost) and their subagged versions. Subagging improves the performance of the base classifier, and we show that subagged decision trees achieve better performance while keeping the model simple and reasonably interpretable.
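
The subagging idea in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a variant that keeps every minority-class case and draws the majority class without replacement, and it swaps in a one-feature threshold "stump" for the decision trees used in the paper. The function names `fit_stump`, `subagging` and `predict` are hypothetical.

```python
import random
from collections import Counter

def fit_stump(sample):
    """Fit a one-feature threshold classifier: predict 1 when x >= threshold."""
    xs = sorted({x for x, _ in sample})
    best_threshold, best_errors = xs[0], len(sample) + 1
    for t in xs:
        # count training errors made by the rule "x >= t means class 1"
        errors = sum((x >= t) != bool(y) for x, y in sample)
        if errors < best_errors:
            best_threshold, best_errors = t, errors
    return best_threshold

def subagging(data, n_models=25, seed=0):
    """Train one stump per subsample; subsamples are drawn without replacement."""
    rng = random.Random(seed)
    minority = [d for d in data if d[1] == 1]
    majority = [d for d in data if d[1] == 0]
    k = len(minority)  # assumption: the minority class is the smaller one
    thresholds = []
    for _ in range(n_models):
        # keep all minority cases, undersample the majority to the same size
        sample = minority + rng.sample(majority, k)
        thresholds.append(fit_stump(sample))
    return thresholds

def predict(thresholds, x):
    """Aggregate the stumps by majority vote."""
    votes = Counter(int(x >= t) for t in thresholds)
    return votes.most_common(1)[0][0]
```

Because each base model sees a balanced subsample, no single model is dominated by the majority class, and the vote averages out the randomness of the undersampling.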

3.
Applications of log-linear models in discrete discriminant analysis usually treat the grouping variable as a variable in the model. An alternative parameterization is introduced which models the association structure between variables for each population separately. The separate log-linear models may have differing complexity. It is shown that these approaches lead to different classes of models. Applications to the choice of car brand and credit scoring show the usefulness of separate modelling.

4.
With the fast development of financial products and services, banks' credit departments have collected large amounts of data, which risk analysts use to build credit scoring models that evaluate an applicant's credit risk accurately. One of these models is the Multi-Criteria Optimization Classifier (MCOC). By finding a trade-off between the overlap of different classes and the total distance from input points to the decision boundary, MCOC derives a decision function from distinct classes of training data and then uses this function to predict the class label of an unseen sample. In many real-world applications, however, classification quality degenerates rapidly when using MCOC, owing to noise, outliers, class imbalance, nonlinearly separable problems and other uncertainties in the data. In this paper, we propose a novel multi-criteria optimization classifier based on kernel, fuzzification, and penalty factors (KFP-MCOC). First, a kernel function is used to map input points into a high-dimensional feature space. Then an appropriate fuzzy membership function is introduced to MCOC and associated with each data point in the feature space. Finally, unequal penalty factors are added to the input points of imbalanced classes. In this way, the effects of the aforementioned problems are reduced. Our experimental results on credit risk evaluation, and their comparison with MCOC, support vector machines (SVM) and fuzzy SVM, show that KFP-MCOC can enhance the separation of different applicants, the efficiency of credit risk scoring, and the generalization of predicting the credit rank of a new credit applicant.

5.
In developing a classification model for assigning observations of unknown class to one of a number of specified classes using the values of a set of features associated with each observation, it is often desirable to base the classifier on a limited number of features. Mathematical programming discriminant analysis methods for developing classification models can be extended for feature selection. Classification accuracy can be used as the feature selection criterion by using a mixed integer programming (MIP) model in which a binary variable is associated with each training sample observation, but the binary variable requirements limit the size of problems to which this approach can be applied. Heuristic feature selection methods for problems with large numbers of observations are developed in this paper. These heuristic procedures, which are based on the MIP model for maximizing classification accuracy, are then applied to three credit scoring data sets.

6.
7.
Consumer credit scoring is one of the most successful applications of quantitative analysis in business, with nearly every major lender using charge-off models to make decisions. Yet banks do not extend credit to control charge-off, but to secure profit. So, while charge-off models work well in rank-ordering the loan default costs associated with lending and are ubiquitous throughout the industry, the equivalent models on the revenue side are not being used despite the need. This paper outlines a profit-based scoring system for credit cards to be used for acquisition decisions by addressing three issues. First, the paper explains why credit card profit models, as opposed to cost or charge-off models, have been difficult to build and implement. Second, a methodology for modelling revenue on credit cards at application is proposed. Finally, acquisition strategies are explored that use both a spend model and a charge-off model to balance tradeoffs between charge-off, revenue, and volume.
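
To make the trade-off in the abstract above concrete, the sketch below combines the output of a spend model with the output of a charge-off model into one acquisition rule. It is an illustrative assumption, not the paper's method: the profit formula, the parameter names (`expected_spend`, `revenue_rate`, `p_charge_off`, `loss_given_charge_off`) and the cut-off `min_profit` are all hypothetical.

```python
def expected_profit(expected_spend, revenue_rate, p_charge_off, loss_given_charge_off):
    """Expected revenue earned on spend minus the expected charge-off loss."""
    revenue = expected_spend * revenue_rate
    expected_loss = p_charge_off * loss_given_charge_off
    return revenue - expected_loss

def accept(applicant, min_profit=0.0):
    """Accept an applicant when expected profit reaches the cut-off."""
    return expected_profit(**applicant) >= min_profit

# Two hypothetical applicants with the same charge-off risk but different spend.
heavy_spender = {"expected_spend": 6000.0, "revenue_rate": 0.02,
                 "p_charge_off": 0.03, "loss_given_charge_off": 3000.0}
light_spender = {"expected_spend": 1000.0, "revenue_rate": 0.02,
                 "p_charge_off": 0.03, "loss_given_charge_off": 3000.0}
```

A charge-off model alone would rank these two applicants identically; under the assumed formula only the first is expected to be profitable. Raising `min_profit` trades volume for profit per account.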

8.
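Credit classification is an important part of credit risk management. Its main purpose is to distinguish creditworthy customers from defaulting customers among credit applicants, based on the information the applicants provide, so as to give credit decision makers a basis for their decisions. To correctly distinguish different credit customers, especially defaulting ones, kernel principal component analysis and support vector machine algorithms are combined to construct a kernel-principal-component-analysis-based least squares fuzzy support vector machine model with variable penalty factors, which is used to classify credit data. In this model, the sample data are first preprocessed, kernel principal component analysis is then used to reduce the dimensionality of the data nonlinearly, and finally the least squares fuzzy support vector machine with variable penalty factors classifies the dimension-reduced data. For validation, two public credit datasets are selected for empirical analysis. The empirical results show that the model achieves good classification results and can provide an important reference for credit decision makers.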

9.
Data-based scorecards, such as those used in credit scoring, age with time and need to be rebuilt or readjusted. In contrast to the huge literature on modelling the replacement and maintenance of equipment, hardly any models deal with this problem for scorecards. This paper identifies an effective way of describing the predictive ability of a scorecard and from this describes a simple model for how its predictive ability will develop. Using a dynamic programming approach, one is then able to find when it is optimal to rebuild and when to readjust a scorecard. Failure to readjust or rebuild scorecards as they aged was one of the defects in credit scoring identified in the investigations into the sub-prime mortgage crisis.

10.
Owing to the recent financial turmoil, the banking sector is discussing how to achieve long-term success and how to follow an exhaustive and powerful strategy in credit scoring. Recently, significant theoretical advances in machine learning algorithms have driven the application of kernel-based classifiers, which produce very effective results. Unfortunately, such tools cannot provide an explanation, or comprehensible justification, for the solutions they supply. In this paper, we propose a new strategy for modelling credit scoring data that indirectly exploits the classification power of kernel machines in an operational setting. The kernel classifier is reconstructed via linear regression if all predictors are numerical, or via a general linear model if some or all predictors are categorical. The loss of performance due to this approximation is balanced by better interpretability for the end user, who is able to understand and rank the influence of each category of the variables on the prediction. An Italian bank case study is illustrated and discussed; empirical results reveal promising performance of the proposed strategy.

11.
The number of non-performing loans has increased in recent years, paralleling the current financial crisis, thus increasing the importance of credit scoring models. This study proposes a three-stage hybrid Adaptive Neuro Fuzzy Inference System credit scoring model, which is based on statistical techniques and Neuro Fuzzy. The proposed model's performance was compared with conventional and commonly utilized models. The credit scoring models are tested using a 10-fold cross-validation process with the credit card data of an international bank operating in Turkey. Results demonstrate that the proposed model consistently performs better than the Linear Discriminant Analysis, Logistic Regression Analysis, and Artificial Neural Network (ANN) approaches, in terms of average correct classification rate and estimated misclassification cost. As with ANN, the proposed model has learning ability; unlike ANN, the model is not a black box. In the proposed model, the interpretation of independent variables may provide valuable information for bankers and consumers, especially in the explanation of why credit applications are rejected.

12.
Credit scoring is one of the most widely used applications of quantitative analysis in business. Behavioural scoring is a type of credit scoring that is performed on existing customers to assist lenders in decisions like increasing the balance or promoting new products. This paper shows how using survival analysis tools from reliability and maintenance modelling, specifically Cox's proportional hazards regression, allows one to build behavioural scoring models. Their performance is compared with that of logistic regression. The advantages of using survival analysis techniques in building scorecards are also illustrated by estimating the expected profit from personal loans, which cannot be done using the existing risk behavioural systems.
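
For readers unfamiliar with Cox's proportional hazards regression, the sketch below evaluates its partial log-likelihood for a single covariate: each observed default contributes its own linear score minus the log of the summed hazards over all accounts still at risk at that time. It is a minimal illustration that assumes no tied default times; the record layout `(time, event, x)` and the function name are hypothetical, and a real scorecard would maximise this quantity over `beta` with a fitted library routine.

```python
import math

def cox_partial_log_likelihood(beta, records):
    """records: list of (time, event, x); event is 1 for default, 0 for censored."""
    total = 0.0
    for t_i, event_i, x_i in records:
        if not event_i:
            continue  # censored accounts only appear in risk sets
        # risk set: accounts still under observation at time t_i
        risk = sum(math.exp(beta * x_j) for t_j, _, x_j in records if t_j >= t_i)
        total += beta * x_i - math.log(risk)
    return total

# Toy data in which accounts with a higher covariate value default earlier.
accounts = [(1, 1, 2.0), (2, 1, 1.0), (3, 0, 0.0)]
```

On the toy data, a positive `beta` yields a higher partial likelihood than `beta = 0`, which is the direction one would expect when larger covariate values go with earlier default.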

13.
Many machine learning algorithms contain a training step that is done once. The training step is usually computationally expensive since it involves processing huge matrices. If the training profile is extracted from an evolving dynamic dataset, it has to be updated when some features of the training dataset change. This paper proposes a way to update this profile efficiently when the data is constantly evolving. We assume that the data is modeled by a kernel method and processed by a spectral decomposition. In many algorithms for clustering and classification, a low-dimensional representation of the affinity (kernel) graph of the embedded training dataset is computed. Then, it is used for classifying newly arrived data points. We present methods for updating such embeddings of the training datasets incrementally, without repeating the entire computation when a small number of the training samples change. Efficient computation of such an algorithm is critical in many web-based applications.

14.
Consumer finance has become one of the most important areas of banking, because of the amount of money being lent, the impact of such credit on the global economy, and the realisation that the credit crunch of 2008 was partly due to incorrect modelling of the risks in such lending. This paper reviews the development of credit scoring, the way of assessing risk in consumer finance, and what is meant by a credit score. It then outlines 10 challenges for Operational Research to support modelling in consumer finance. Some of these involve developing more robust risk assessment systems, whereas others expand the use of such modelling to deal with the current objectives of lenders and the new decisions they have to make in consumer finance.

15.
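Customer credit assessment is an important part of the daily operations of banks and other financial enterprises. Defaulting customers generally make up only a small minority of the customer population, while customers who repay on time make up the majority; this is the class imbalance problem common in customer credit assessment. Current methods for customer credit assessment cannot yet effectively handle the class imbalance caused by the scarcity of minority-class samples. This study introduces transfer learning to integrate information from inside and outside the system in order to address this problem. To use minority-class sample information from outside the system more efficiently, a new transfer learning model is constructed: starting from an ensemble-based transfer bagging model, two-stage sampling and the group method of data handling are used to improve its base model generation and its ensemble strategy, respectively. An empirical study on credit card customer data from a commercial bank in Chongqing shows that, compared with methods commonly used for customer credit assessment, the new model better handles the effect of class imbalance under absolute scarcity, and in particular predicts the minority of defaulting customers more accurately.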

16.
Received on 1 July 1991. The benefit to consumers from the use of informative credit reports is demonstrated by showing the improvement in credit decisions when generic scoring models based on credit reports are implemented. If these models are highly predictive, then the truncation of credit reports will reduce the predictive power of bureau-based generic scoring systems. As a result, more good credit risks will be denied credit, and more poor credit risks will be granted credit. It is shown that, even when applied to credit applications that had already been screened and approved, the use of generic scoring models significantly improves credit grantors' ability to predict and eliminate bankruptcies, charge-offs, and delinquencies. As applied to existing accounts, bureau-based generic scores are shown to have predictive value for at least 3 months, while scores 12 months old may not be very powerful. Even though bureau-based scores shift towards the high-risk end of the distribution during a recession, they continue to rank risk very well. When coupled with application-based credit-scoring models, scores based on credit-bureau data further improve the predictive power of the model, the improvements being greater with more complete bureau information. We conclude that government-imposed limits on credit information are anti-consumer by fostering more errors in credit decisions.

17.
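To make full use of the strengths of SVM in personal credit assessment and overcome its weaknesses, a personal credit assessment model based on a support vector machine committee machine is proposed. The model is combined with a method that constructs new learning samples by estimating attribute utility functions to perform personal credit assessment. Empirical analysis and comparison with the SVM method show that the model classifies and predicts better, faster and more adaptably.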

18.
The features used may have an important effect on the performance of credit scoring models. The process of choosing the best set of features for credit scoring models is usually unsystematic and dominated by somewhat arbitrary trial and error. This paper presents an empirical study of four machine learning feature selection methods. These methods provide an automatic data mining technique for reducing the feature space. The study illustrates how four feature selection methods ('ReliefF', 'Correlation-based', 'Consistency-based' and 'Wrapper' algorithms) help to improve three aspects of the performance of scoring models: model simplicity, model speed and model accuracy. The experiments are conducted on real data sets using four classification algorithms: 'model tree (M5)', 'neural network (multi-layer perceptron with back-propagation)', 'logistic regression', and 'k-nearest-neighbours'.

19.
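Against the background of massive credit reporting data, a shrinkage nearest-neighbour imputation method is proposed to reduce the computational cost of imputing missing data. The method completes imputation in three stages. The first stage computes inclusion probabilities from the missing proportions of samples and variables and shrinks the data by unequal probability sampling. The second stage uses distances between samples to select the samples nearest to the sample with missing values as the training set. The third stage builds a random forest model for iterative imputation. Simulation studies on the Australian dataset and on datasets from Chinese banks show that the method substantially reduces computation while maintaining a given level of imputation accuracy.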
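
The three stages described above can be sketched roughly as follows. Several details are assumptions rather than the paper's specification: the inclusion probability uses only each row's missing ratio (scaled by a hypothetical `base_rate`), stage 1 uses Poisson sampling as one form of unequal probability sampling, and the random forest of stage 3 is replaced by a plain mean over the nearest donors to keep the example short. Missing values are represented by `None`.

```python
import math
import random

def missing_ratio(row):
    """Share of variables that are missing in one row."""
    return sum(v is None for v in row) / len(row)

def shrink(rows, base_rate, rng):
    """Stage 1 (assumed form): keep row i with probability
    base_rate * (1 - missing ratio of row i)."""
    return [r for r in rows if rng.random() < base_rate * (1 - missing_ratio(r))]

def distance(a, b):
    """Root mean squared difference over variables observed in both rows."""
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return math.inf
    return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

def impute(target, pool, k):
    """Stages 2 and 3 (simplified): fill each gap in target with the mean
    of that variable over the k nearest rows that observe it."""
    filled = list(target)
    for j, value in enumerate(target):
        if value is not None:
            continue
        donors = sorted((r for r in pool if r[j] is not None),
                        key=lambda r: distance(target, r))[:k]
        if donors:
            filled[j] = sum(r[j] for r in donors) / len(donors)
    return filled
```

The cost saving comes from the order of operations: the expensive model is trained only on the small neighbourhood selected from the already shrunken pool, not on the full dataset.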

20.
Dataset shift is present in almost all real-world applications, since most of them are constantly dealing with changing environments. Detecting fractures in datasets in time allows recalibrating the models before a significant decrease in the model's performance is observed. Since small changes are normal in most applications and do not justify the efforts that a model recalibration requires, we are only interested in identifying those changes that are critical for the correct functioning of the model. In this work we propose a model-dependent backtesting strategy designed to identify significant changes in the covariates, relating a confidence zone of the change to a maximal deviance measure obtained from the coefficients of the model. Using logistic regression as a predictive approach, we performed experiments on simulated data, and on a real-world credit scoring dataset. The results show that the proposed method has better performance than traditional approaches, consistently identifying major changes in variables while taking into account important characteristics of the problem, such as sample sizes and variances, and uncertainty in the coefficients.

