Similar Documents
20 similar documents retrieved.
1.
Credit classification is an important part of credit risk management. Its main purpose is to distinguish creditworthy customers from defaulting customers on the basis of the information supplied by credit applicants, so as to provide a basis for credit decision makers. To correctly separate different credit customers, and in particular defaulting ones, a least squares fuzzy support vector machine model with a variable penalty factor based on kernel principal component analysis (KPCA) is constructed by combining KPCA with the support vector machine algorithm and is applied to classify credit data. In this model, the sample data are first preprocessed, KPCA is then used to reduce the dimensionality of the data in a nonlinear way, and finally the least squares fuzzy SVM with a variable penalty factor classifies the dimension-reduced data. For validation, two public credit data sets are used for empirical analysis. The empirical results show that the proposed model achieves good classification results and can provide an important reference for credit decision makers.
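As a rough illustration of the pipeline just described, the sketch below chains kernel PCA for nonlinear dimensionality reduction with an SVM classifier in scikit-learn. A plain SVC stands in for the paper's least squares fuzzy SVM with a variable penalty factor, and all parameter values and the (X, y) credit data are illustrative assumptions.

```python
# Sketch: preprocessing -> kernel PCA (nonlinear dimensionality reduction) -> SVM,
# approximating the KPCA + LS-FSVM pipeline described above. The fuzzy membership
# weighting and variable penalty factor of the paper are not modeled; parameters
# are illustrative.
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import KernelPCA
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

def build_credit_classifier(n_components=10):
    return Pipeline([
        ("scale", StandardScaler()),                        # preprocess the raw credit attributes
        ("kpca", KernelPCA(n_components=n_components,       # nonlinear dimensionality reduction
                           kernel="rbf", gamma=0.1)),
        ("svm", SVC(kernel="rbf", C=10.0, gamma="scale")),  # classify the reduced features
    ])

# Usage with a generic credit data set (y: 0 = good, 1 = default):
# scores = cross_val_score(build_credit_classifier(), X, y, cv=5, scoring="roc_auc")
```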

2.
We propose two methods for tuning the membership functions of a kernel fuzzy classifier based on the idea of SVM (support vector machine) training. We assume that in a kernel fuzzy classifier a fuzzy rule is defined for each class in the feature space. In the first method, we tune the slopes of all the membership functions at the same time so that the margin between classes is maximized, under the constraints that each data sample's degree of membership to its own class is the maximum among all the classes. This method is similar to a linear all-at-once SVM, and we call it AAO tuning. In the second method, we tune the membership functions one class at a time: for a given class, the slope of the associated membership function is tuned so that the margin between that class and the remaining classes is maximized, under the constraints that the degrees of membership for the data belonging to the class are large and those for the remaining data are small. This method is similar to a linear one-against-all SVM and is called OAA tuning. Computer experiments on fuzzy classifiers based on kernel discriminant analysis and on fuzzy classifiers with ellipsoidal regions show that both methods usually improve classification performance by tuning the membership functions, and that the performance obtained with AAO tuning is slightly better than that with OAA tuning.

3.
A Clifford support vector machine (CSVM) learns the decision surface for multiple distinct classes of input points using Clifford geometric algebra. In many applications, an input point may not be fully assigned to one of these classes. In this paper, we attach a fuzzy membership to each input point and reformulate the CSVM for multiclass classification so that different input points make different contributions to the learning of the decision surface. We call the proposed method the Clifford fuzzy SVM.

4.
Credit scoring is a risk evaluation task considered a critical decision for financial institutions, since wrong decisions may result in huge losses. Classification models are one of the most widely used groups of data mining approaches and greatly help decision makers and managers reduce the risk of granting credit to customers, instead of relying on intuitive experience or portfolio management alone. Accuracy is one of the most important criteria when choosing a credit-scoring model, and hence research directed at improving the effectiveness of credit scoring models has never stopped. In this article, a hybrid binary classification model, namely FMLP, is proposed for credit scoring, based on the basic concepts of fuzzy logic and artificial neural networks (ANNs). In the proposed model, fuzzy numbers are used instead of the crisp weights and biases of traditional multilayer perceptrons (MLPs), in order to better model the uncertainties and complexities in financial data sets. Empirical results on three well-known benchmark credit data sets indicate that the proposed hybrid model outperforms its components as well as other classification models such as support vector machines (SVMs), K-nearest neighbor (KNN), quadratic discriminant analysis (QDA), and linear discriminant analysis (LDA). It can therefore be concluded that the proposed model is an appropriate alternative tool for financial binary classification problems, especially under high uncertainty. © 2013 Wiley Periodicals, Inc. Complexity 18: 46–57, 2013

5.
We propose a classification approach exploiting relationships between ellipsoidal separation and the support vector machine (SVM) with a quadratic kernel. By adding a semidefinite programming (SDP) constraint to the SVM model we ensure that the chosen hyperplane in feature space represents a non-degenerate ellipsoid in input space. This allows us to exploit SDP techniques within support vector regression (SVR) approaches, yielding better results in cases where ellipsoid-shaped separators are appropriate for the classification task. We compare our approach with spherical separation and SVM on several classification problems.

6.
The support vector machine (SVM) is a popular tool for machine learning tasks. It has been applied successfully in many fields, but parameter optimization for SVM remains an open research issue. In this paper, to tune the parameters of SVM, a form of inter-cluster distance in the feature space is calculated for all the SVM classifiers of a multi-class problem. The inter-cluster distance in the feature space indicates the degree to which the classes are separated: a larger value implies a more separated pair of classes. For each classifier, the optimal kernel parameter that yields the largest inter-cluster distance is found. Then, a new continuous search interval of the kernel parameter that covers the optimal kernel parameter of each class pair is determined. A self-adaptive differential evolution algorithm is used to search for the optimal parameter combination within the continuous intervals of the kernel parameter and the penalty parameter. Finally, the proposed method is applied to several real-world datasets as well as to fault diagnosis for rolling element bearings. The results show that it is both effective and computationally efficient for parameter optimization of multi-class SVM.
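The inter-cluster distance in the feature space can be computed with the kernel trick, since the squared distance between two class means in feature space is mean(K_AA) - 2 * mean(K_AB) + mean(K_BB). The sketch below illustrates this for an RBF kernel; the simple grid scan over gamma is a stand-in for the paper's self-adaptive differential evolution search, and the grid values are assumptions.

```python
# Sketch: inter-cluster distance between two classes in the RBF kernel feature
# space, computed as the distance between class means via the kernel trick:
# ||mu_A - mu_B||^2 = mean(K_AA) - 2*mean(K_AB) + mean(K_BB).
from sklearn.metrics.pairwise import rbf_kernel

def inter_cluster_distance(X_a, X_b, gamma):
    k_aa = rbf_kernel(X_a, X_a, gamma=gamma).mean()
    k_bb = rbf_kernel(X_b, X_b, gamma=gamma).mean()
    k_ab = rbf_kernel(X_a, X_b, gamma=gamma).mean()
    return k_aa - 2.0 * k_ab + k_bb          # squared distance between class means

def best_gamma(X_a, X_b, grid=(0.01, 0.1, 1.0, 10.0)):
    # For one class pair, pick the kernel parameter giving the largest separation.
    return max(grid, key=lambda g: inter_cluster_distance(X_a, X_b, g))
```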

7.
Credit scoring is a method of modelling the potential risk of credit applications. Traditionally, logistic regression and discriminant analysis are the most widely used approaches for creating scoring models in the industry. However, these methods are associated with quite a few limitations, such as being unstable with high-dimensional data and small sample sizes, requiring intensive variable selection effort, and being incapable of efficiently handling non-linear features. Most importantly, with these algorithms it is difficult to automate the modelling process, and when population changes occur the static models usually fail to adapt and may need to be rebuilt from scratch. In the last few years, the kernel learning approach has been investigated to solve these problems. However, the existing applications of this type of method (in particular the SVM) in credit scoring have all focused on the batch model and did not address the important problem of how to update the scoring model on-line. This paper presents a novel and practical adaptive scoring system based on an incremental kernel method. With this approach, the scoring model is adjusted according to an on-line update procedure that always converges to the optimal solution without information loss or numerical difficulties. Non-linear features in the data are automatically included in the model through a kernel transformation. The approach does not require any variable reduction effort and is also robust for scoring data with a large number of attributes and highly unbalanced class distributions. Moreover, a new potential kernel function is introduced to further improve the predictive performance of the scoring model, and a kernel attribute ranking technique is used that adds transparency to the final model. Experimental studies using real-world data sets have demonstrated the effectiveness of the proposed method.
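The paper's incremental kernel method is not reproduced here. As a hedged sketch of the same on-line idea with standard scikit-learn parts, an explicit Nystroem kernel approximation can be paired with a linear model trained via partial_fit, so the scoring model absorbs new applications without retraining from scratch. The class below, its parameters, and the label coding are illustrative assumptions.

```python
# Sketch: an on-line updatable, kernel-based scorer built from standard components,
# illustrating (not reproducing) the incremental kernel idea described above.
import numpy as np
from sklearn.kernel_approximation import Nystroem
from sklearn.linear_model import SGDClassifier

class OnlineKernelScorer:
    def __init__(self, gamma=0.1, n_components=200):
        self.feature_map = Nystroem(kernel="rbf", gamma=gamma,
                                    n_components=n_components)
        self.model = SGDClassifier(loss="log_loss", alpha=1e-4)
        self.classes_ = np.array([0, 1])          # assumed coding: 0 = good, 1 = bad

    def fit_initial(self, X, y):
        Z = self.feature_map.fit_transform(X)     # fix the kernel map on the initial batch
        self.model.partial_fit(Z, y, classes=self.classes_)
        return self

    def update(self, X_new, y_new):
        # Incorporate newly observed applications without retraining from scratch.
        Z = self.feature_map.transform(X_new)
        self.model.partial_fit(Z, y_new)
        return self

    def score_applicants(self, X):
        return self.model.predict_proba(self.feature_map.transform(X))[:, 1]
```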

8.
Credit applicants are assigned to good or bad risk classes according to their record of defaulting. Each applicant is described by a high-dimensional input vector of situational characteristics and by an associated class label. A statistical model that maps the inputs to the labels can decide whether a new credit applicant should be accepted or rejected, by predicting the class label given the new inputs. Support vector machines (SVMs) from statistical learning theory can build such models from the data, requiring extremely weak prior assumptions about the model structure. Furthermore, SVMs divide a set of labelled credit applicants into subsets of ‘typical’ and ‘critical’ patterns. The correct class label of a typical pattern is usually very easy to predict, even with linear classification methods; such patterns do not contain much information about the classification boundary. The critical patterns (the support vectors) contain the less trivial training examples. For instance, linear discriminant analysis with prior training subset selection via SVM also leads to improved generalization. Using non-linear SVMs, more ‘surprising’ critical regions may be detected, but owing to the relative sparseness of the data, this potential seems to be limited in credit scoring practice.
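A minimal sketch of the training-subset-selection idea mentioned above: fit a linear SVM, keep only the critical patterns (the support vectors), and train linear discriminant analysis on that subset. The function name and the parameter value are assumptions, and X is assumed to be a NumPy array.

```python
# Sketch: split applicants into 'typical' and 'critical' patterns with a linear SVM,
# then train LDA only on the critical patterns (the support vectors).
from sklearn.svm import SVC
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

def lda_on_support_vectors(X, y, C=1.0):
    svm = SVC(kernel="linear", C=C).fit(X, y)
    critical = svm.support_                       # indices of the support vectors
    lda = LinearDiscriminantAnalysis().fit(X[critical], y[critical])
    return svm, lda
```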

9.
Unsupervised classification is a highly important task for machine learning methods. Although very successful in supervised classification, the support vector machine (SVM) is much less often used to classify unlabeled data points, and SVM-based clustering methods suffer from several drawbacks, including sensitivity to nonlinear kernels and random initializations, high computational cost, and unsuitability for imbalanced datasets. In this paper, to exploit the advantages of SVM while overcoming the drawbacks of SVM-based clustering methods, we propose a completely new two-stage unsupervised classification method requiring no initialization: a new unsupervised kernel-free quadratic surface SVM (QSSVM) model is proposed to avoid selecting kernels and related kernel parameters, and a golden-section algorithm is designed to generate an appropriate classifier for balanced and imbalanced data. By studying certain properties of the proposed model, a convergent decomposition algorithm is developed to implement this non-convex QSSVM model effectively and efficiently (in terms of computational cost). Numerical tests on artificial and public benchmark data indicate that the proposed unsupervised QSSVM method outperforms well-known clustering methods (including SVM-based and other state-of-the-art methods), particularly in terms of classification accuracy. Moreover, we extend and apply the proposed method to credit risk assessment by incorporating T-test-based feature weights. The promising numerical results on benchmark personal credit data and real-world corporate credit data strongly demonstrate the effectiveness, efficiency and interpretability of the proposed method, and indicate its significant potential in real-world applications.

10.
Credit risk analysis is an active research area in financial risk management, and credit scoring is one of the key analytical techniques in credit risk evaluation. In this study, a novel intelligent-agent-based fuzzy group decision making (GDM) model is proposed as an effective multicriteria decision analysis (MCDA) tool for credit risk evaluation. In the proposed model, several artificial intelligence techniques, used as intelligent agents, first analyze and evaluate the risk levels of credit applicants over a set of pre-defined criteria. These evaluation results, generated by the different intelligent agents, are then fuzzified into fuzzy opinions on the credit risk level of the applicants. Finally, the fuzzy opinions are aggregated into a group consensus, and the aggregated fuzzy consensus is defuzzified into a crisp aggregated value to support the final decision of decision-makers in credit-granting institutions. For illustration and verification purposes, a simple numerical example and three real-world credit application approval datasets are presented.

11.
The logistic regression framework has long been the most widely used statistical method for assessing customer credit risk. Recently, a more pragmatic approach has been adopted, where the primary issue is credit risk prediction rather than explanation. In this context, several classification techniques have been shown to perform well on credit scoring, among them support vector machines. While the search for better classifiers is an important research topic, the specific methodology chosen in real-world applications has to deal with the challenges arising from the data collected in the industry. Such data are often highly unbalanced, part of the information can be missing, and some common hypotheses, such as the i.i.d. assumption, can be violated. In this paper we present a case study based on a sample of IBM Italian customers which presents all of these challenges. The main objective is to build and validate robust models able to handle missing information, class unbalancedness and non-i.i.d. data points. We define a missing data imputation method and propose the use of an ensemble classification technique, subagging, particularly suitable for highly unbalanced data such as credit scoring data. Both the imputation and subagging steps are embedded in a customized cross-validation loop, which handles dependencies between different credit requests. The methodology has been applied using several classifiers (kernel support vector machines, nearest neighbours, decision trees, AdaBoost) and their subagged versions. The use of subagging improves the performance of the base classifier, and we show that subagged decision trees achieve better performance while still keeping the model simple and reasonably interpretable.
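Subagging (subsample aggregating) can be sketched with standard tools: a bagging ensemble whose members are each fitted on a subsample drawn without replacement. The configuration below is an illustrative assumption and omits the paper's imputation step and customized cross-validation loop.

```python
# Sketch: subagging of decision trees for unbalanced credit data. Using
# bootstrap=False with max_samples < 1 draws subsamples without replacement,
# which is the defining property of subagging. Base learner, subsample
# fraction and ensemble size are illustrative choices.
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier

subagged_trees = BaggingClassifier(
    estimator=DecisionTreeClassifier(max_depth=5, class_weight="balanced"),
    n_estimators=50,
    max_samples=0.5,       # each tree sees half of the training requests
    bootstrap=False,       # sampling without replacement -> subagging
    n_jobs=-1,
)
# subagged_trees.fit(X_train, y_train); subagged_trees.predict_proba(X_test)
```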

12.
To take full advantage of the strengths of SVM in personal credit evaluation while overcoming its shortcomings, a personal credit evaluation model based on a committee machine of support vector machines is proposed. The model is combined with a method for constructing new training samples based on estimated attribute utility functions and applied to personal credit evaluation. Empirical analysis and comparison with the plain SVM method show that the model offers better, faster and more adaptable predictive classification.
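A hedged sketch of a committee of SVMs combined by soft voting is given below. The paper's committee construction and its attribute-utility-based sample generation are not reproduced, and the member kernels and parameters are assumptions.

```python
# Sketch: a simple committee of SVMs for personal credit evaluation, combined by
# averaging the members' predicted probabilities (soft voting).
from sklearn.ensemble import VotingClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

committee = VotingClassifier(
    estimators=[
        ("rbf",  make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0, probability=True))),
        ("poly", make_pipeline(StandardScaler(), SVC(kernel="poly", degree=3, probability=True))),
        ("lin",  make_pipeline(StandardScaler(), SVC(kernel="linear", C=0.5, probability=True))),
    ],
    voting="soft",
)
# committee.fit(X_train, y_train); committee.predict(X_test)
```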

13.
In recent years, kernel-based methods have proved very successful for many real-world learning problems. One of the main reasons for this success is their efficiency on large data sets, which results from the fact that kernel methods such as support vector machines (SVMs) are based on a convex optimization problem. Solving a new learning problem can now often be reduced to choosing an appropriate kernel function and kernel parameters. However, it can be shown that even the most powerful kernel methods can still fail on quite simple data sets when the feature space induced by the chosen kernel function is not sufficient. In these cases, an explicit feature space transformation or the detection of latent variables has proved more successful. Since such explicit feature construction is often not feasible for large data sets, the ultimate goal for efficient kernel learning would be the adaptive creation of new and appropriate kernel functions. It cannot, however, be guaranteed that such a kernel function still leads to a convex optimization problem for support vector machines. Therefore, we have to enhance the optimization core of the learning method itself before we can use it with arbitrary, i.e. non-positive semidefinite, kernel functions. This article motivates the use of appropriate feature spaces and discusses the possible consequences leading to non-convex optimization problems. We show that these new non-convex optimization SVMs are at least as accurate as their quadratic programming counterparts on eight real-world benchmark data sets in terms of generalization performance, and that they always outperform traditional approaches in terms of the original optimization problem. Additionally, the proposed algorithm is more generic than existing traditional solutions, since it also works for non-positive semidefinite or indefinite kernel functions.

14.
We consider linear programming approaches for support vector machines (SVMs). The linear programming problems are introduced as an approximation of the quadratic programming problems commonly used in SVMs. When kernel-based nonlinear discriminators are considered, the approximation can be viewed as kernel principal component analysis, which extracts an important subspace from the feature space characterized by the kernel function. We show that data points projected nonlinearly, and implicitly, into the feature space by kernel functions can be approximately expressed, explicitly, as points lying in a low-dimensional Euclidean space, which enables us to develop linear programming formulations for nonlinear discriminators. We also introduce linear programming formulations for multicategory classification problems and show that the same maximal margin principle exploited in SVMs can be incorporated into the linear programming formulations. Moreover, considering the extraction of the low-dimensional feature subspace, we can generate nonlinear multicategory discriminators by solving linear programming problems. Numerical experiments on real-world datasets are presented. We show that a fairly low-dimensional feature subspace can achieve reasonable accuracy and that the linear programming formulations compute discriminators efficiently. We also discuss a sampling strategy which may be crucial for huge datasets.
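A generic kernel L1-norm SVM can indeed be trained by linear programming. The sketch below sets up the LP that minimizes sum_j |a_j| + C * sum_i xi_i subject to y_i * (sum_j a_j K(x_i, x_j) + b) >= 1 - xi_i and solves it with SciPy's linprog. This is a standard binary LP-SVM shown for illustration, not the paper's multicategory formulation; the RBF kernel and C value are assumptions.

```python
# Sketch: binary kernel L1-norm SVM via linear programming. Decision function:
# f(x) = sum_j a_j K(x, x_j) + b. Variables are split into nonnegative parts
# [a_plus (n), a_minus (n), b_plus, b_minus, xi (n)], all >= 0 (linprog default bounds).
import numpy as np
from scipy.optimize import linprog
from sklearn.metrics.pairwise import rbf_kernel

def train_lp_svm(X, y, C=1.0, gamma=0.5):
    # y must be a NumPy array with labels in {-1, +1}
    n = X.shape[0]
    K = rbf_kernel(X, X, gamma=gamma)
    c = np.concatenate([np.ones(2 * n), np.zeros(2), C * np.ones(n)])
    yK = y[:, None] * K                                   # row i is y_i * K_i
    # Margin constraints rewritten as A_ub @ z <= -1:
    A_ub = np.hstack([-yK, yK, -y[:, None], y[:, None], -np.eye(n)])
    b_ub = -np.ones(n)
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, method="highs")
    z = res.x
    a = z[:n] - z[n:2 * n]
    b = z[2 * n] - z[2 * n + 1]
    return a, b

def predict_lp_svm(X_train, a, b, X_new, gamma=0.5):
    return np.sign(rbf_kernel(X_new, X_train, gamma=gamma) @ a + b)
```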

15.
In support vector machine predictive modelling, the kernel function maps a nonlinear problem in a low-dimensional feature space to a linear problem in a high-dimensional feature space, and the characteristics of the kernel function strongly influence both learning and prediction. Considering the fitting and generalization properties of two typical kernel functions, the global polynomial kernel and the local RBF kernel, an SVM based on a mixed kernel function is adopted for predictive modelling. To evaluate the modelling effect of different kernel functions and obtain better prediction performance, a genetic algorithm is used to adaptively evolve the parameters of the SVM model, which is then applied to the practical problem of equipment cost prediction. The computations on real data show that the SVM with the mixed kernel achieves better prediction performance than SVMs with a single kernel, and it can be promoted as an effective predictive modelling method in equipment management.
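A mixed global/local kernel of the kind described can be passed to scikit-learn's SVR as a callable that returns the Gram matrix; a convex combination of two valid kernels is itself a valid kernel, trading the polynomial kernel's extrapolation against the RBF kernel's locality. The mixing weight and kernel parameters below are assumptions, the use of SVR reflects the cost-prediction (regression) setting, and the genetic-algorithm tuning is not reproduced.

```python
# Sketch: a mixed polynomial + RBF kernel supplied to SVR as a callable.
from sklearn.metrics.pairwise import polynomial_kernel, rbf_kernel
from sklearn.svm import SVR

def make_mixed_kernel(lam=0.7, degree=2, gamma=0.5):
    def mixed_kernel(X, Y):
        # Convex combination: lam * global (polynomial) + (1 - lam) * local (RBF)
        return (lam * polynomial_kernel(X, Y, degree=degree)
                + (1.0 - lam) * rbf_kernel(X, Y, gamma=gamma))
    return mixed_kernel

cost_model = SVR(kernel=make_mixed_kernel(lam=0.7), C=10.0, epsilon=0.1)
# cost_model.fit(X_train, y_train); cost_model.predict(X_test)
```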

16.
In recent years, support vector machines (SVMs) have been applied successfully to a wide range of applications. However, since the classifier is described by a complex mathematical function, it is rather incomprehensible for humans. This opacity prevents SVMs from being used in many real-life applications where both accuracy and comprehensibility are required, such as medical diagnosis and credit risk evaluation. To overcome this limitation, rules can be extracted from the trained SVM that are interpretable by humans and keep as much of the accuracy of the SVM as possible. In this paper, we provide an overview of recently proposed rule extraction techniques for SVMs and introduce two others taken from the artificial neural networks domain, namely Trepan and G-REX. The described techniques are compared using publicly available datasets, such as Ripley's synthetic dataset and the multi-class iris dataset. We also look at medical diagnosis and credit scoring, where comprehensibility is a key requirement and even a regulatory recommendation. Our experiments show that the SVM rule extraction techniques lose only a small percentage in performance compared to SVMs and therefore rank at the top of comprehensible classification techniques.
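As a hedged sketch of pedagogical rule extraction (the general surrogate idea behind such techniques, not the specific Trepan or G-REX algorithms), the trained SVM relabels the training data and a shallow decision tree is fitted to those labels, yielding human-readable rules that mimic the SVM. The depth and parameter values are assumptions.

```python
# Sketch: extract a readable surrogate rule set from a trained SVM by fitting a
# shallow decision tree to the SVM's own predictions.
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier, export_text

def extract_rules(X, y, max_depth=3):
    svm = SVC(kernel="rbf", C=1.0, gamma="scale").fit(X, y)
    y_svm = svm.predict(X)                                  # relabel data with the SVM's view
    surrogate = DecisionTreeClassifier(max_depth=max_depth).fit(X, y_svm)
    return svm, surrogate, export_text(surrogate)           # text rules approximating the SVM
```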

17.
This paper introduces the support vector classification machine and builds an SVM credit card classification model using the KMOD kernel function, which has better discrimination ability. Numerical experiments on Australian and German credit card data show that the model outperforms the RBF-based SVM model in classification accuracy and in the number of support vectors required.
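The KMOD kernel is commonly written as K(x, y) = a * (exp(gamma / (||x - y||^2 + sigma^2)) - 1), with a chosen so that K(x, x) = 1. Under that assumption it can be plugged into SVC as a callable Gram-matrix function, as sketched below with illustrative parameter values.

```python
# Sketch: KMOD kernel (kernel with moderate decreasing) as a custom SVC kernel.
import numpy as np
from sklearn.metrics.pairwise import euclidean_distances
from sklearn.svm import SVC

def make_kmod_kernel(gamma=1.0, sigma2=1.0):
    a = 1.0 / (np.exp(gamma / sigma2) - 1.0)      # normalizes K(x, x) to 1
    def kmod(X, Y):
        d2 = euclidean_distances(X, Y, squared=True)
        return a * (np.exp(gamma / (d2 + sigma2)) - 1.0)
    return kmod

credit_model = SVC(kernel=make_kmod_kernel(gamma=1.0, sigma2=0.5), C=10.0)
# credit_model.fit(X_train, y_train); credit_model.score(X_test, y_test)
```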

18.
Customer credit evaluation is an important part of the day-to-day operations of banks and other financial institutions. Default samples usually make up only a minority of all customers, while customers who repay on time make up the majority; this is the common class-imbalance problem in customer credit evaluation. Current customer credit evaluation methods cannot effectively handle the class imbalance caused by the scarcity of minority-class samples. This study introduces transfer learning to integrate information from inside and outside the system in order to address this problem. To make more efficient use of minority-class samples from outside the system, a new transfer learning model is constructed: starting from an ensemble-based transfer bagging model, two-stage sampling and the group method of data handling (GMDH) are used to improve its base model generation and its ensemble strategy, respectively. An empirical study on credit card customer data from a commercial bank in Chongqing shows that, compared with the methods commonly used for customer credit evaluation, the new model better handles the impact of class imbalance under absolute scarcity, and in particular achieves better prediction accuracy for the minority of defaulting customers.

19.
Support vector machines are applied to license plate character recognition in order to address the low recognition rates caused by factors such as image degradation under real-world conditions and the limited number of samples. The experiments focus on the digits among the license plate characters: 15 groups of digit samples are selected, 8 groups for training and 7 for testing. Cross-validation is used to search for the optimal SVM parameters C and g, a suitable kernel function is selected, and the samples are trained and predicted; for some digits the recognition rate reaches 100%. The results are also compared with a BP network on the same training and test sets. The experiments show that the SVM achieves a good recognition rate with few training samples and without character feature extraction, and has good generalization ability for classification.
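The cross-validated search over C and g described above corresponds to a standard grid search with an RBF-kernel SVC. The sketch below assumes flattened digit-character samples in (X_train, y_train); the parameter grid is illustrative.

```python
# Sketch: cross-validated grid search over C and gamma for RBF-kernel SVM digit recognition.
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

param_grid = {
    "svc__C": [0.1, 1, 10, 100],
    "svc__gamma": [1e-3, 1e-2, 1e-1, 1],
}
search = GridSearchCV(
    make_pipeline(StandardScaler(), SVC(kernel="rbf")),
    param_grid, cv=5, scoring="accuracy",
)
# search.fit(X_train, y_train)
# print(search.best_params_, search.score(X_test, y_test))
```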

20.
This paper proposes fuzzy symbolic modeling as a framework for intelligent data analysis and model interpretation in classification and regression problems. The approach is based on eigenstructure analysis of the data similarity matrix to define the number of fuzzy rules in the model. Each fuzzy rule is associated with a symbol and is defined by a Gaussian membership function. The prototypes of the rules are computed by a clustering algorithm, and the model output parameters are computed as the solution of a bounded quadratic optimization problem. In classification problems, the rule parameters are interpreted as the rules' confidences; in regression problems, they are used to derive rule confidences for classes that represent ranges of output variable values. The resulting model is evaluated on a set of benchmark datasets for classification and regression problems. Nonparametric statistical tests performed on the benchmark results show that the proposed approach produces compact fuzzy models with accuracy comparable to models produced by standard modeling approaches. The resulting model is also examined from the interpretability point of view, showing how the rule weights provide additional information that helps in data and model understanding, so that the model can be used as a decision support tool for the prediction of new data.
