Similar Documents
20 similar documents found (search time: 31 ms)
1.
With the fast development of financial products and services, banks' credit departments have collected large amounts of data, which risk analysts use to build appropriate credit scoring models that evaluate an applicant's credit risk accurately. One of these models is the Multi-Criteria Optimization Classifier (MCOC). By finding a trade-off between the overlapping of different classes and the total distance from input points to the decision boundary, MCOC can derive a decision function from distinct classes of training data and subsequently use this function to predict the class label of an unseen sample. In many real-world applications, however, owing to noise, outliers, class imbalance, nonlinearly separable problems and other uncertainties in the data, classification quality degenerates rapidly when using MCOC. In this paper, we propose a novel multi-criteria optimization classifier based on kernels, fuzzification, and penalty factors (KFP-MCOC): first a kernel function is used to map input points into a high-dimensional feature space, then an appropriate fuzzy membership function is introduced into MCOC and associated with each data point in the feature space, and unequal penalty factors are added to the input points of imbalanced classes. Thus, the effects of the aforementioned problems are reduced. Our experimental results on credit risk evaluation, and their comparison with MCOC, support vector machines (SVM) and fuzzy SVM, show that KFP-MCOC can enhance the separation of different applicants, the efficiency of credit risk scoring, and the generalization of predicting the credit rank of a new credit applicant.
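The fuzzification and penalty-factor ingredients of KFP-MCOC can be illustrated in isolation. The sketch below is a minimal NumPy illustration, not the authors' implementation: the membership formula (distance to class centroid) and the size-inverse penalty factors are common choices from the fuzzy-SVM literature, assumed here purely for illustration.

```python
import numpy as np

def fuzzy_memberships(X, delta=1e-3):
    """Membership in (0, 1], decreasing with distance to the class centroid,
    so that likely outliers get down-weighted."""
    center = X.mean(axis=0)
    d = np.linalg.norm(X - center, axis=1)
    return 1.0 - d / (d.max() + delta)

def penalty_factors(n_pos, n_neg, C=1.0):
    """Per-class penalty factors inversely proportional to class size."""
    n = n_pos + n_neg
    return C * n / (2.0 * n_pos), C * n / (2.0 * n_neg)

rng = np.random.default_rng(0)
X_good = rng.normal(0.0, 1.0, size=(90, 2))   # majority class (good applicants)
X_bad = rng.normal(3.0, 1.0, size=(10, 2))    # minority class (bad applicants)

s_good = fuzzy_memberships(X_good)            # one weight per training point
C_good, C_bad = penalty_factors(len(X_good), len(X_bad))
```

The minority (bad) class receives the larger penalty factor, so misclassifying a rare defaulter costs more than misclassifying a common good payer.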

2.
Reject inference is a method for inferring how a rejected credit applicant would have behaved had credit been granted. Credit-quality data on rejected applicants are usually missing not at random (MNAR). To infer credit-quality data that are MNAR, we propose a flexible method that generates the probability of missingness within a model-based bound-and-collapse Bayesian technique. We tested the method's performance relative to traditional reject-inference methods using real data. Results show that our method improves the classification power of credit scoring models under MNAR conditions.

3.
Consumer credit risk assessment involves the use of risk assessment tools to manage a borrower's account from the time of pre-screening a potential application through to the management of the account during its life and possible write-off. The riskiness of lending to a credit applicant is usually estimated using a logistic regression model, though researchers have considered many other types of classifier. Whilst preliminary evidence suggests that support vector machines are the most accurate, data quality issues may prevent these laboratory-based results from being achieved in practice. Training a classifier on a sample of accepted applicants, rather than on a sample representative of the applicant population, seems not to result in bias, though it does make setting the cut-off difficult. Profit scoring is a promising line of research, and the Basel II accord has had profound implications for the way in which credit applicants are assessed and for the policies banks adopt.
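As context for the logistic regression model that dominates this literature, here is a minimal scorecard-style fit by gradient ascent on synthetic applicant data (NumPy sketch; the feature names, coefficients and data are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 400
income = rng.normal(size=n)        # standardised income (hypothetical feature)
debt = rng.normal(size=n)          # standardised debt ratio (hypothetical feature)
logit = 1.5 * income - 2.0 * debt - 0.5
y = (rng.uniform(size=n) < 1 / (1 + np.exp(-logit))).astype(float)  # 1 = good payer

X = np.column_stack([np.ones(n), income, debt])  # intercept + features
w = np.zeros(3)
for _ in range(2000):              # plain gradient ascent on the log-likelihood
    p = 1 / (1 + np.exp(-X @ w))
    w += 0.1 * X.T @ (y - p) / n

pred = 1 / (1 + np.exp(-X @ w)) > 0.5
accuracy = (pred == y.astype(bool)).mean()
```

The fitted coefficients recover the signs of the generating model: higher income raises, and higher debt lowers, the probability of being a good payer.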

4.
Research on a credit risk evaluation model based on the BP algorithm (cited 10 times in total: 1 self-citation, 9 citations by others)
Using neural network techniques, this paper builds a credit risk evaluation model based on the BP (back-propagation) algorithm and applies it to evaluate the credit risk of 120 borrowing enterprises of a Chinese commercial bank, grouping the enterprises into three classes by credit rating: "good credit", "average credit" and "poor credit". Simulation results show that the classification accuracy of the proposed neural-network credit risk evaluation model is higher than that of linear discriminant analysis, a traditional parametric statistical classification method. The paper also gives a detailed account of how the network is constructed and of the BP learning algorithm and its steps.
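A one-hidden-layer network trained by plain batch back-propagation, in the spirit of the BP model described above, can be sketched as follows. This is a NumPy illustration on synthetic three-class data standing in for the "good", "average" and "poor" groups; it is not the paper's actual network, data or hyperparameters.

```python
import numpy as np

rng = np.random.default_rng(0)
# Three synthetic "enterprise" clusters standing in for the three credit groups.
X = np.vstack([rng.normal(m, 0.5, size=(40, 2)) for m in ([0, 0], [3, 0], [0, 3])])
y = np.repeat([0, 1, 2], 40)
Y = np.eye(3)[y]                       # one-hot targets

H = 8                                  # hidden units (illustrative choice)
W1 = rng.normal(0, 0.5, (2, H)); b1 = np.zeros(H)
W2 = rng.normal(0, 0.5, (H, 3)); b2 = np.zeros(3)
lr = 0.5

for _ in range(500):                   # plain batch back-propagation
    Z = np.tanh(X @ W1 + b1)           # forward pass, hidden layer
    S = Z @ W2 + b2
    P = np.exp(S - S.max(axis=1, keepdims=True))
    P /= P.sum(axis=1, keepdims=True)  # softmax class probabilities
    G = (P - Y) / len(X)               # softmax cross-entropy gradient
    GZ = (G @ W2.T) * (1 - Z ** 2)     # back-propagate through tanh
    W2 -= lr * Z.T @ G;  b2 -= lr * G.sum(axis=0)
    W1 -= lr * X.T @ GZ; b1 -= lr * GZ.sum(axis=0)

acc = (P.argmax(axis=1) == y).mean()   # training accuracy on the toy data
```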

5.
Credit classification is an important step in credit risk management; its main purpose is to distinguish creditworthy customers from defaulting customers on the basis of the information supplied in credit applications, so as to give credit decision makers a basis for their decisions. To separate the different classes of credit customers correctly, and defaulting customers in particular, we combine kernel principal component analysis (KPCA) with support vector machines and construct a KPCA-based least-squares fuzzy support vector machine model with variable penalty factors for classifying credit data. In this model, the sample data are first preprocessed, KPCA is then used to reduce the dimensionality of the data in a nonlinear way, and finally the least-squares fuzzy support vector machine with variable penalty factors classifies the dimension-reduced data. For validation, two public credit data sets are used in an empirical analysis. The empirical results show that the model achieves good classification results and can provide an important reference for credit decision makers.
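The KPCA step of the pipeline above (nonlinear dimensionality reduction before classification) can be sketched in NumPy as follows; the RBF kernel and the parameter values are illustrative assumptions, not the paper's settings:

```python
import numpy as np

def kernel_pca(X, n_components=2, gamma=0.5):
    """Project X onto the top principal components in RBF-kernel feature space."""
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    K = np.exp(-gamma * sq)                      # RBF kernel matrix
    n = len(X)
    One = np.full((n, n), 1.0 / n)
    Kc = K - One @ K - K @ One + One @ K @ One   # centre in feature space
    vals, vecs = np.linalg.eigh(Kc)              # eigenvalues in ascending order
    idx = np.argsort(vals)[::-1][:n_components]  # take the largest ones
    alphas = vecs[:, idx] / np.sqrt(vals[idx])   # normalise the eigenvectors
    return Kc @ alphas                           # projected coordinates

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 6))                     # toy 6-dimensional sample
Z = kernel_pca(X, n_components=2)                # reduced to 2 dimensions
```

The reduced coordinates `Z` would then be fed to the downstream classifier.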

6.
The standard support vector machine (SVM) is not very robust to noise: when the training samples contain noise or outliers, the optimal separating hyperplane is affected and the classification results become biased. To address this problem, a weighted support vector machine (WSVM) based on the minimum enclosing ball is proposed, in which each sample point is assigned a different weight so as to reduce the influence of noise and outliers on the classification result. Cross-validation was carried out on well-logging data from the oilsk81, oilsk83 and oilsk85 wells in a block of the Jianghan oilfield, using three kernel functions: linear, exponential and RBF. The test results show that, for both SVM and WSVM, the RBF kernel gives the highest recognition rate; moreover, the proposed WSVM is insensitive to the choice of kernel, recognizes samples stably, and achieves a recognition rate of 100% in cross-validation.

7.
One of the aims of credit scoring models is to predict the probability of repayment of any applicant, and yet such models are usually parameterised using a sample of accepted applicants only. This may lead to biased estimates of the parameters. In this paper we examine two issues. First, we compare the classification accuracy of a model based only on accepted applicants with that of a model based on a sample of all applicants. We find only a minimal difference, given the cut-off scores for the old model used by the data supplier. Using a simulated model, we examine the predictive performance of models estimated from bands of applicants ranked by predicted creditworthiness. We find that the lower the risk band of the training sample, the less accurate the predictions for all applicants. We also find that the lower the risk band of the training sample, the greater the overestimate of the true performance of the model when tested on a sample of applicants within the same risk band, as a financial institution would do. The overestimation may be very large. Second, we examine the predictive accuracy of a bivariate probit model with selection (BVP). This parameterises the accept-reject model, allowing (unknown) omitted variables to be correlated with those of the original good-bad model. The BVP model may improve accuracy if the loan officer has overridden a scoring rule. We find that a small improvement is sometimes possible when using the BVP model.
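The selection effect that motivates the paper is easy to reproduce in a toy simulation: a default rate estimated from accepted applicants only understates the population rate. This NumPy sketch uses an invented score distribution and cut-off, purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 10_000
score = rng.normal(size=n)                    # underlying creditworthiness
p_default = 1 / (1 + np.exp(2.0 * score))     # low score -> high default risk
default = rng.uniform(size=n) < p_default

accepted = score > 0.0                        # lender's historical cut-off

rate_all = default.mean()                     # population default rate
rate_accepted = default[accepted].mean()      # rate visible in the books
```

Any model parameterised only on the `accepted` subsample sees this distorted picture of risk.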

8.
This paper introduces support vector classification and builds an SVM credit card classification model using the KMOD kernel function, which has better discriminating ability. Numerical experiments on Australian and German credit card data show that the model outperforms the RBF-based SVM model in classification accuracy and in the number of support vectors.
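One published form of the KMOD kernel is K(x, y) = a(exp(γ/(‖x−y‖² + σ²)) − 1). The sketch below implements that form in NumPy, with a chosen so that K(x, x) = 1; the parameter values are illustrative assumptions, not the paper's settings:

```python
import numpy as np

def kmod_kernel(X, Y, gamma=1.0, sigma=1.0):
    """KMOD kernel matrix: a * (exp(gamma / (||x - y||^2 + sigma^2)) - 1),
    with the scale a chosen so that K(x, x) = 1."""
    a = 1.0 / (np.exp(gamma / sigma ** 2) - 1.0)
    sq = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return a * (np.exp(gamma / (sq + sigma ** 2)) - 1.0)

X = np.array([[0.0, 0.0], [1.0, 0.0], [5.0, 0.0]])
K = kmod_kernel(X, X)   # similarity decays smoothly with distance
```

A precomputed matrix like this can be supplied to any SVM implementation that accepts user-supplied kernels.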

9.
Support vector machine (SVM) training may be posed as a large quadratic program (QP) with bound constraints and a single linear equality constraint. We propose a (block) coordinate gradient descent method for solving this problem and, more generally, linearly constrained smooth optimization. Our method is closely related to decomposition methods currently popular for SVM training. We establish global convergence and, under a local error bound assumption (which is satisfied by the SVM QP), a linear rate of convergence for our method when the coordinate block is chosen by a Gauss-Southwell-type rule to ensure sufficient descent. We show that, for the SVM QP with n variables, this rule can be implemented in O(n) operations using Rockafellar's notion of conformal realization. Thus, for SVM training, our method requires only O(n) operations per iteration and, in contrast to existing decomposition methods, achieves linear convergence without additional assumptions. We report our numerical experience with the method on some large SVM QPs arising from two-class data classification. Our experience suggests that the method can be efficient for SVM training with nonlinear kernels.

10.
In the sequential evaluation and selection problem with n applicants, we assume that a decision maker has some prior information about each applicant, so that unequal weights may be assigned to each applicant according to his or her likelihood of being the best among all applicants. Assuming that the pre-assigned weights are available in advance, we derive the optimal selection strategy that maximizes the probability of selecting the best of all applicants. For the case where the decision maker is permitted to rearrange the sequence in which applicants are evaluated, we further propose a simple heuristic procedure for optimally ordering the sequence of evaluations. Based on a pairwise comparison matrix and a goal programming procedure, we also propose a method that easily computes the weights in a practical situation.
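In the equal-weight special case, the problem above reduces to the classical secretary problem, where skipping roughly n/e applicants and then taking the first one better than all seen so far is optimal. A quick pure-Python simulation of that baseline strategy (illustrative only, not the paper's weighted procedure):

```python
import random

def classical_secretary(n, k, trials=20_000, seed=0):
    """Probability of selecting the best of n applicants when the first k are
    only observed and the first later applicant beating them all is taken."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        ranks = list(range(n))               # rank 0 is the best applicant
        rng.shuffle(ranks)
        best_seen = min(ranks[:k])
        chosen = next((r for r in ranks[k:] if r < best_seen), ranks[-1])
        wins += chosen == 0
    return wins / trials

p = classical_secretary(n=20, k=7)           # k near n/e is the optimal skip length
```

For n = 20 and k = 7 the exact success probability is about 0.384, and the simulation lands close to it.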

11.
To take full advantage of the strengths of SVMs in personal credit evaluation while overcoming their shortcomings, a personal credit evaluation model based on a committee machine of support vector machines is proposed. The model is combined with a method that constructs new training samples based on estimated attribute utility functions. Empirical analysis, and a comparison with the plain SVM method, show that the model has better, faster and more adaptable predictive classification ability.
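The committee-machine idea (train several members on resampled data, then combine them by majority vote) can be sketched as follows. Perceptrons are used here as lightweight stand-ins for the SVM members, so this NumPy sketch illustrates the ensemble principle rather than the paper's model:

```python
import numpy as np

def train_perceptron(X, y, epochs=20, seed=0):
    """Plain perceptron trained on a bootstrap resample; returns (w, b)."""
    rng = np.random.default_rng(seed)
    idx = rng.integers(0, len(X), len(X))    # bootstrap for member diversity
    Xb, yb = X[idx], y[idx]
    w = np.zeros(X.shape[1]); b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(Xb, yb):
            if yi * (xi @ w + b) <= 0:       # misclassified: perceptron update
                w += yi * xi; b += yi
    return w, b

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2, 1, (50, 2)), rng.normal(2, 1, (50, 2))])
y = np.array([-1] * 50 + [1] * 50)

members = [train_perceptron(X, y, seed=s) for s in range(5)]
votes = np.sign(np.array([X @ w + b for w, b in members])).sum(axis=0)
committee_pred = np.sign(votes)              # majority vote of the committee
acc = (committee_pred == y).mean()
```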

12.
If a credit scoring model is built using only applicants who have previously been accepted for credit, such non-random sample selection may produce bias in the estimated model parameters, and accordingly the model's predictions of repayment performance may not be optimal. Previous empirical research suggests that the omission of rejected applicants has a detrimental impact on model estimation and prediction. This paper explores the extent to which, given the previous cut-off score applied to decide on accepted applicants, the number of included variables influences the efficacy of a commonly used reject inference technique, reweighting. The analysis benefits from the availability of a rare sample in which virtually no applicant was denied credit. The general indication is that the efficacy of reject inference is little influenced by either model leanness or the interaction between model leanness and the rejection rate that determined the sample. However, there remains some hint that very lean models may benefit from reject inference when modelling is conducted on data characterized by a very high rate of applicant rejection.
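The reweighting technique examined in the paper can be illustrated on synthetic data: each accepted applicant is weighted by the inverse of the acceptance rate in its score band, so the accepted sample is reweighted to represent the full applicant population. The bands and acceptance probabilities below are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(7)
n = 5_000
score = rng.integers(1, 6, size=n)            # application score band 1..5
p_accept = score / 5.0                        # higher bands accepted more often
accepted = rng.uniform(size=n) < p_accept

# Reweighting: each accepted case stands for itself plus the similar rejects,
# so its weight is the inverse of the empirical acceptance rate in its band.
weights = np.zeros(n)
for band in range(1, 6):
    in_band = score == band
    acc_rate = accepted[in_band].mean()
    weights[in_band & accepted] = 1.0 / acc_rate

# The reweighted accepted sample recovers the population band distribution.
pop_share = np.array([(score == b).mean() for b in range(1, 6)])
rw_share = np.array([weights[(score == b) & accepted].sum() for b in range(1, 6)])
rw_share /= weights[accepted].sum()
```

A scoring model would then be fitted to the accepted cases using `weights` as case weights.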

13.
We propose two methods for tuning the membership functions of a kernel fuzzy classifier based on the idea of SVM (support vector machine) training. We assume that in a kernel fuzzy classifier a fuzzy rule is defined for each class in the feature space. In the first method, we tune the slopes of all the membership functions at the same time so that the margin between classes is maximized, under the constraint that for each data sample the degree of membership in its own class is the maximum among all classes. This method is similar to a linear all-at-once SVM; we call it AAO tuning. In the second method, we tune the membership functions one class at a time. Namely, for a given class, the slope of the associated membership function is tuned so that the margin between that class and the remaining classes is maximized, under the constraint that the degrees of membership for the data belonging to the class are large and those for the remaining data are small. This method is similar to a linear one-against-all SVM; we call it OAA tuning. Computer experiments on fuzzy classifiers based on kernel discriminant analysis and on fuzzy classifiers with ellipsoidal regions show that both methods usually improve classification performance by tuning the membership functions, and that the performance of AAO tuning is slightly better than that of OAA tuning.

14.
When the traditional support vector machine (SVM) is used to classify imbalanced data, classification performance suffers because the genuine minority-class support-vector samples are few and hard to identify. To address this problem, a hybrid-sampling SVM method for imbalanced data classification (BSMS) is proposed. The method first classifies the original imbalanced data with an SVM and then, according to each sample's position, divides the data into a support-vector region (SV), a majority-class non-support-vector region (MN…

15.
The features used may have an important effect on the performance of credit scoring models. The process of choosing the best set of features for credit scoring models is usually unsystematic and dominated by somewhat arbitrary trial and error. This paper presents an empirical study of four machine learning feature selection methods, which provide an automatic data mining technique for reducing the feature space. The study illustrates how the four methods ('ReliefF', 'Correlation-based', 'Consistency-based' and 'Wrapper' algorithms) help to improve three aspects of the performance of scoring models: model simplicity, model speed and model accuracy. The experiments are conducted on real data sets using four classification algorithms: 'model tree (M5)', 'neural network (multi-layer perceptron with back-propagation)', 'logistic regression', and 'k-nearest-neighbours'.
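A minimal filter-style feature selector, far simpler than the four methods studied above but illustrating the same idea of automatic feature-space reduction, ranks features by absolute correlation with the label. NumPy sketch on synthetic data (the feature layout is invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 500
informative = rng.normal(size=(n, 2))         # two features that drive the label
noise = rng.normal(size=(n, 3))               # three irrelevant features
X = np.column_stack([informative, noise])
y = (informative[:, 0] + informative[:, 1] > 0).astype(float)

# Filter ranking: absolute Pearson correlation of each feature with the label.
scores = np.array([abs(np.corrcoef(X[:, j], y)[0, 1]) for j in range(X.shape[1])])
selected = np.argsort(scores)[::-1][:2]       # keep the two top-ranked features
```

On this toy data the filter recovers exactly the two informative columns.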

16.
Corporate credit granting is a key commercial activity of financial institutions nowadays. A critical first step in the credit granting process usually involves a careful financial analysis of the creditworthiness of the potential client. Wrong decisions result either in foregoing valuable clients or, more severely, in substantial capital losses if the client subsequently defaults. It is thus of crucial importance to develop models that estimate the probability of corporate bankruptcy with a high degree of accuracy. Many studies have focused on the use of financial ratios in linear statistical models, such as linear discriminant analysis and logistic regression. However, the obtained error rates are often high. In this paper, Least Squares Support Vector Machine (LS-SVM) classifiers, also known as kernel Fisher discriminant analysis, are applied within the Bayesian evidence framework in order to automatically infer and analyze the creditworthiness of potential corporate clients. The inferred posterior class probabilities of bankruptcy are then used to analyze the sensitivity of the classifier output with respect to the given inputs and to assist in the credit assignment decision-making process. The suggested nonlinear kernel-based classifiers yield better performance than linear discriminant analysis and logistic regression when applied to a real-life data set concerning commercial credit granting to mid-cap Belgian and Dutch firms.
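At its core, an LS-SVM classifier replaces the SVM's inequality constraints with equalities, so training reduces to solving a single linear system. The sketch below shows a bias-free variant of that system on synthetic two-class data; it is a NumPy illustration of the LS-SVM training principle, not the paper's Bayesian-evidence procedure, and the labels and parameters are invented:

```python
import numpy as np

def rbf(A, B, gamma=0.5):
    """RBF kernel matrix between the rows of A and B."""
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq)

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2, 1, (40, 2)), rng.normal(2, 1, (40, 2))])
y = np.array([-1.0] * 40 + [1.0] * 40)   # -1 = bankrupt, +1 = healthy (invented)

lam = 0.1                                 # regularisation strength
K = rbf(X, X)
# LS-SVM-style training: one linear system instead of a QP (bias term omitted).
alpha = np.linalg.solve(K + lam * np.eye(len(X)), y)
train_acc = (np.sign(K @ alpha) == y).mean()
```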

17.
The support vector machine (SVM) is known for its good performance in two-class classification, but its extension to multiclass classification is still an ongoing research issue. In this article, we propose a new approach to classification, called the import vector machine (IVM), which is built on kernel logistic regression (KLR). We show that the IVM not only performs as well as the SVM in two-class classification but also can naturally be generalized to the multiclass case. Furthermore, the IVM provides an estimate of the underlying probability. Similar to the support points of the SVM, the IVM model uses only a fraction of the training data to index the kernel basis functions, typically a much smaller fraction than the SVM. This gives the IVM a potential computational advantage over the SVM.

18.
In credit card portfolio management, predicting the cardholder's spending behavior is key to reducing the risk of bankruptcy. Given a set of attributes for the major aspects of credit cardholders and predefined classes for spending behaviors, this paper proposes a classification model that uses multiple criteria linear programming to discover the behavior patterns of credit cardholders. It presents a general classification model that can theoretically handle any number of classes, and then focuses on a typical case in which the cardholders' behaviors are predefined as four classes. A dataset from a major US bank is used to demonstrate the applicability of the proposed method.
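A simplified two-class MCLP can be written as a linear program: each record's score x·w deviates from a fixed boundary b by an overlap α_i (penalised) and a distance β_i (rewarded). The sketch below solves such a program with `scipy.optimize.linprog`; the box bound on w, the cap on β and the 0.05 reward weight are assumptions added here to keep the LP bounded, and are not part of the paper's model:

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(1)
n = 20                                            # records per class
Xg = rng.normal(2.0, 1.0, (n, 2))                 # "good" records
Xb = rng.normal(-2.0, 1.0, (n, 2))                # "bad" records
X = np.vstack([Xg, Xb])
y = np.array([1.0] * n + [-1.0] * n)
m, d = X.shape
b = 1.0                                           # fixed boundary score

# Variables: [w (d, boxed), alpha (m, overlap >= 0), beta (m, distance in [0, 50])]
# Constraint per record i:  x_i.w + y_i*alpha_i - y_i*beta_i = b
A_eq = np.hstack([X, np.diag(y), -np.diag(y)])
b_eq = np.full(m, b)
# Minimize total overlap while mildly rewarding total distance from the boundary.
c = np.concatenate([np.zeros(d), np.ones(m), -0.05 * np.ones(m)])
bounds = [(-1, 1)] * d + [(0, None)] * m + [(0, 50)] * m
res = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=bounds, method="highs")

w = res.x[:d]
pred = np.where(X @ w >= b, 1.0, -1.0)
train_acc = (pred == y).mean()
```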

19.
Classification is a main data mining task which aims at predicting the class label of new input data on the basis of a set of pre-classified samples. Multiple criteria linear programming (MCLP) is used as a classification method in the data mining area; it can separate two or more classes by finding a discriminating hyperplane. Although MCLP performs well on linearly separable data, it is no longer applicable when faced with nonlinearly separable problems. A kernel-based multiple criteria linear programming (KMCLP) model has been developed to solve nonlinearly separable problems. In this method, a kernel function projects the data into a higher-dimensional space in which the data have more chance of being linearly separable. KMCLP performs well in some real applications. However, just like other prevalent data mining classifiers, MCLP and KMCLP learn only from training examples. In the traditional machine learning area, there are also classification tasks in which data sets are classified only by prior knowledge, as in expert systems. Some works combine the above two classification principles to overcome the shortcomings of each approach. In this paper, we present our recent work, which combines prior knowledge with the MCLP or KMCLP model to solve problems in which the input consists not only of training examples but also of prior knowledge. Specifically, how to deal with linear and nonlinear knowledge in the MCLP and KMCLP models is the main concern of this paper. Numerical tests of the above models indicate that they are effective in classifying data with prior knowledge.

20.
Support vector machines (SVMs) have attracted considerable attention recently due to their successful applications in various domains. However, by maximizing the margin of separation between the two classes in a binary classification problem, SVM solutions often suffer two serious drawbacks. First, the SVM separating hyperplane is usually very sensitive to the training samples, since it depends strongly on the support vectors, which are only a few points located on the wrong side of the corresponding margin boundaries. Second, the separating hyperplane is equidistant from the two classes, which are considered equally important when optimizing its location, regardless of the number of training data and their dispersion in each class. In this paper, we propose a new SVM solution, the adjusted support vector machine (ASVM), based on a new loss function that adjusts the SVM solution to take into account the sample sizes and dispersions of the two classes. Numerical experiments show that the ASVM outperforms the conventional SVM, especially when the two classes differ greatly in sample size and dispersion.


Copyright © Beijing Qinyun Technology Development Co., Ltd. (北京勤云科技发展有限公司)  ICP licence: 京ICP备09084417号