Similar literature
 Found 20 similar documents; search took 937 ms
1.
When designing rule-based models and classifiers, some precision is sacrificed to obtain linguistic interpretability. Understandable models are not expected to outperform black boxes, but fuzzy learning algorithms are usually validated statistically by contrasting them with black-box models. Unless the performance of both approaches is equivalent, it is difficult to judge whether the fuzzy one is doing its best, because the precision gap between the best understandable model and the best black-box model is not known. In this paper we discuss how to generate probabilistic rule-based models and classifiers with the same structure as fuzzy rule-based ones. Fuzzy models, in which features are partitioned into linguistic terms, will be compared to probabilistic rule-based models with the same number of terms in every linguistic partition. We propose to use these probabilistic models to estimate a lower precision limit which fuzzy rule learning algorithms should surpass.

2.
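The parameters of a support vector machine directly affect its generalization ability. To address the subjectivity of parameter selection, an improved genetic algorithm is proposed to optimize these parameters, and the method is applied to a five-grade classification problem for personal bank credit. For the multi-class problem, three binary classifiers are designed, each with its own parameters; experiments confirm that this yields a finer-grained classification.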

3.
Human judgment plays an important role in the rating of enterprise financial conditions. The recently developed fuzzy adaptive network (FAN), which can handle systems whose behaviour is influenced by human judgment, appears to be ideally suited for modelling this credit rating problem. In this paper, FAN is used to model the credit rating of small financial enterprises. To illustrate the approach, the data of the credit rating problem are first represented by fuzzy numbers. Then, the FAN network based on inference rules is constructed. Finally, the network is trained using the fuzzy-number training data. The main advantages of the proposed network are its capacity for linguistic representation and linguistic aggregation, and the learning ability of the neural network.

4.
Classifying magnetic resonance spectra is often difficult due to the curse of dimensionality, that is, scenarios in which a high-dimensional feature space is coupled with a small sample size. We present an aggregation strategy that combines predicted disease states from multiple classifiers using several fuzzy integration variants. Rather than using all input features for each classifier, these multiple classifiers are presented with different, randomly selected subsets of the spectral features. Results from a set of detailed experiments using this strategy are carefully compared against classification performance benchmarks. We empirically demonstrate that the aggregated predictions are consistently superior to the corresponding prediction from the best individual classifier.

5.
6.
Estimation of the probability of default is of considerable importance in risk management applications where default risk is referred to as credit risk. Basel II (Committee on Banking Supervision) proposes a revision to the international capital accord that implies a more prominent role for internal credit risk assessments based on the determination of the default probability of borrowers. In our study, we classify borrower firms into rating classes with respect to their default probability. Classifying firms into rating classes requires finding the threshold values that separate the rating classes. We aim at solving two problems: to distinguish the defaults from non-defaults, and to put the firms in an order based on their credit quality and classify them into sub-rating classes. A model is used to obtain the probability of default of each firm, and Receiver Operating Characteristic (ROC) analysis is employed to assess the discriminatory power of our model. In our new functional approach, we optimise the area under the ROC curve for a balanced choice of the thresholds, and we include the accuracy of the solution in the program. Thus, a constrained optimisation problem on the area under the curve (or its complement) is carefully modelled, discretised and turned into a penalized sum-of-squares problem of nonlinear regression, to which we apply the Levenberg–Marquardt algorithm. We present numerical evaluations and their interpretations based on real-world data from firms in the Turkish manufacturing sector. We conclude with a discussion of structural frontiers, parametrical and computational features, and an invitation to future work.
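
The ROC-based assessment described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' functional optimisation approach: it only computes the area under the ROC curve from its pairwise-ranking definition and maps default probabilities to rating classes with fixed cut-offs. The function names `roc_auc` and `assign_rating_class`, and the threshold values in the usage lines, are hypothetical.

```python
from bisect import bisect_right


def roc_auc(scores, labels):
    """Area under the ROC curve via its pairwise-ranking definition.

    Returns the fraction of (defaulter, non-defaulter) pairs in which the
    defaulter has the higher score; ties count as half a win.
    labels: 1 = default, 0 = non-default.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one example of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def assign_rating_class(default_probability, thresholds):
    """Map a default probability to a rating class index.

    thresholds must be sorted ascending; class 0 is the best class
    (lowest default probability). The cut-offs are illustrative only.
    """
    return bisect_right(thresholds, default_probability)


# Hypothetical scores: higher means more likely to default.
print(roc_auc([0.9, 0.2, 0.8, 0.1], [1, 1, 0, 0]))  # 0.75
print(assign_rating_class(0.05, [0.02, 0.10]))      # 1
```

The quadratic pairwise loop keeps the definition visible; for large samples a rank-based computation would be the usual choice.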

7.
The determination of fuzzy information granules, including the estimation of their membership functions, plays a significant role in fuzzy system design as well as in the design of fuzzy rule based classifiers (FRBCSs). However, although linguistic terms are fundamental elements in the process of eliciting expert knowledge, the problem of linguistic term design along with their fuzzy-set-based semantics has not been fully addressed, since term-sets of attributes have not been interpreted as a formalized structure. Thus, the essential relationship between linguistic terms, as syntax, and the constructed fuzzy sets, as their quantitative semantics, or in other words, the problem of the natural semantics of terms behind the linguistic literal, has not been addressed. In this paper, we introduce the problem of designing optimal linguistic terms and propose a method for designing FRBCSs that can be combined with the design of linguistic terms to ensure that the presence of linguistic literals is supported not only by data but also by their natural semantics. It is shown that this problem plays an essential role in enhancing the performance and the interpretability of the designed FRBCSs and helps strike a better balance between the generality and the specificity of the desired fuzzy rule bases for fuzzy classification problems. A series of experiments on 17 machine learning datasets is reported.

8.
The introduction of the Basel II Capital Accord has encouraged financial institutions to build internal rating systems assessing the credit risk of their various credit portfolios. One of the key outputs of an internal rating system is the probability of default (PD), which reflects the likelihood that a counterparty will default on his/her financial obligation. Since the PD modelling problem basically boils down to a discrimination problem (defaulter or not), one may rely on the myriad of classification techniques that have been suggested in the literature. However, since the credit risk models will be subject to supervisory review and evaluation, they must be easy to understand and transparent. Hence, techniques such as neural networks or support vector machines are less suitable due to their black-box nature. Building upon previous research, we will use AntMiner+ to build internal rating systems for credit risk. AntMiner+ infers a propositional rule set from a given data set using the principles of Ant Colony Optimization. Experiments will be conducted using various types of credit data sets (retail, small- and medium-sized enterprises, and banks). It will be shown that the extracted rule sets are strong in terms of both discriminatory power and comprehensibility. Furthermore, a framework will be presented describing how AntMiner+ fits into a global Basel II credit risk management system.

9.
This paper presents a comparative appraisal of four fuzzy classification methods: Fuzzy C-Means, K Nearest Neighbours, a method based on fuzzy rules, and the Fuzzy Pattern Matching method. It reports the results obtained by applying these methods to three types of data, which are described in the second part of the article. Classification rates and computing times are compared across methods. The paper describes the advantages of fuzzy classifiers when applied to a diagnosis problem. Finally, it offers a synthesis of the study that can serve as a basis for choosing an algorithm for real-time process diagnosis, and it shows how unsupervised and supervised methods can be combined in a diagnosis algorithm.

10.
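Using the purchase and sales invoice data of micro, small and medium-sized enterprises, the credit risk of these enterprises is assessed and an optimal lending strategy is derived. First, an enterprise strength-reputation indicator system is established, and an optimization model yields the optimal lending strategy for the 123 enterprises that have credit ratings and default records. Then, a WOE-Logistic scorecard model is applied to rate the 302 enterprises without credit ratings, and the strength-reputation indicator system and optimization model above are used to obtain the optimal loan...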
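
The weight-of-evidence (WOE) encoding that a WOE-Logistic scorecard relies on can be sketched as follows. This is a generic illustration, not the model from the abstract: the function name `weight_of_evidence`, the bin counts, and the sign convention (share of good cases over share of bad cases, so a positive WOE marks a lower-risk bin) are assumptions; some texts use the opposite sign.

```python
import math


def weight_of_evidence(good_counts, bad_counts):
    """Per-bin WOE = ln( (good_i / total_good) / (bad_i / total_bad) ).

    good_counts[i] and bad_counts[i] are the numbers of non-defaulting and
    defaulting enterprises in bin i of one binned feature. Empty cells are
    rejected here; a real scorecard would merge or smooth such bins.
    """
    total_good = sum(good_counts)
    total_bad = sum(bad_counts)
    woe = []
    for good, bad in zip(good_counts, bad_counts):
        if good == 0 or bad == 0:
            raise ValueError("every bin needs at least one good and one bad case")
        woe.append(math.log((good / total_good) / (bad / total_bad)))
    return woe


# Hypothetical feature with two bins (counts are made up for illustration).
for bin_index, value in enumerate(weight_of_evidence([50, 50], [10, 30])):
    print(bin_index, round(value, 4))
```

The WOE values then replace the raw feature values as inputs to a logistic regression, which keeps the fitted scorecard monotone within each binned feature.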

11.
Classification problems with multiple classes pose a challenge in Data Mining tasks. There is a difficulty inherent to the learning process when trying to find the most adequate discrimination functions among the different concepts within the dataset. Using Fuzzy Rule Based Classification Systems in general, and Evolutionary Fuzzy Systems in particular, provides the advantage of describing smoother borderline areas, thanks to the linguistic label-based representation. In multi-classification, the pairwise learning approach (One-vs-One) has gained notable attention. However, there is a certain dependence between the quality of the confidence degrees or scores of the binary classifiers and the final performance shown by the global model. In this regard, the problem of non-competent classifiers is of special relevance. It occurs when a binary classifier outputs a positive score for a pair of classes unrelated to the input example, which may degrade the final accuracy. The properties of fuzzy classifiers described above make them more prone to this condition. In this paper, we propose an extension of the distance-based combination strategy to overcome this non-competence problem. It is based on the truncation of the confidence degrees of the classes prior to the distance-based tuning. This allows taking advantage of the good classification abilities of Evolutionary Fuzzy Systems, while diminishing the adverse effect of the aforementioned non-competence. Experimental results, using FARC-HD with overlap functions as the fuzzy learning algorithm, show that this new adaptation of the Distance-based Relative Competence Weighting model outperforms both the OVO and standard distance-based approaches, and it is competitive with robust classifiers such as Support Vector Machines.

12.
In this paper, we study the performance of various state-of-the-art classification algorithms applied to eight real-life credit scoring data sets. Some of the data sets originate from major Benelux and UK financial institutions. Different types of classifiers are evaluated and compared. Besides the well-known classification algorithms (e.g. logistic regression, discriminant analysis, k-nearest neighbour, neural networks and decision trees), this study also investigates the suitability and performance of some recently proposed, advanced kernel-based classification algorithms such as support vector machines and least-squares support vector machines (LS-SVMs). The performance is assessed using the classification accuracy and the area under the receiver operating characteristic curve. Statistically significant performance differences are identified using the appropriate test statistics. It is found that both the LS-SVM and neural network classifiers yield very good performance, but that simple classifiers such as logistic regression and linear discriminant analysis also perform very well for credit scoring.

13.
Learning from examples is a frequently arising challenge, with a large number of algorithms proposed in the classification, data mining and machine learning literature. The evaluation of the quality of such algorithms is frequently carried out ex post, on an experimental basis: their performance is measured either by cross validation on benchmark data sets, or by clinical trials. Few of these approaches evaluate the learning process ex ante, on its own merits. In this paper, we discuss a property of rule-based classifiers which we call “justifiability”, and which focuses on the type of information extracted from the given training set in order to classify new observations. We investigate some interesting mathematical properties of justifiable classifiers. In particular, we establish the existence of justifiable classifiers, and we show that several well-known learning approaches, such as decision trees or nearest neighbor based methods, automatically provide justifiable classifiers. We also identify maximal subsets of observations which must be classified in the same way by every justifiable classifier. Finally, we illustrate by a numerical example that using classifiers based on “most justifiable” rules does not seem to lead to overfitting, even though it involves an element of optimization.

14.
We develop several variable selection methods using a signomial function to select relevant variables for multi-class classification by taking all classes into consideration. We introduce an \(\ell_{1}\)-norm regularization function to measure the number of selected variables and two adaptive parameters to apply different importance weights to different variables according to their relative importance. The proposed methods select variables suitable for predicting the output and automatically determine the number of variables to be selected. Then, with the selected variables, they naturally obtain the resulting classifiers without an additional classification process. The classifiers obtained by the proposed methods yield classification accuracy that is competitive with or better than that of existing methods.

15.
This paper examines the interpretability-accuracy tradeoff in fuzzy rule-based classifiers using a multiobjective fuzzy genetics-based machine learning (GBML) algorithm. Our GBML algorithm is a hybrid version of the Michigan and Pittsburgh approaches, which is implemented in the framework of evolutionary multiobjective optimization (EMO). Each fuzzy rule is represented by its antecedent fuzzy sets as an integer string of fixed length. Each fuzzy rule-based classifier, which is a set of fuzzy rules, is represented as a concatenated integer string of variable length. Our GBML algorithm simultaneously maximizes the accuracy of rule sets and minimizes their complexity. The accuracy is measured by the number of correctly classified training patterns while the complexity is measured by the number of fuzzy rules and/or the total number of antecedent conditions of fuzzy rules. We examine the interpretability-accuracy tradeoff for training patterns through computational experiments on some benchmark data sets. A clear tradeoff structure is visualized for each data set. We also examine the interpretability-accuracy tradeoff for test patterns. Due to overfitting to training patterns, a clear tradeoff structure is not always obtained in computational experiments for test patterns.
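
The tradeoff structure mentioned in this abstract rests on Pareto dominance between two objectives: accuracy (maximized) and rule-set complexity (minimized). The sketch below only filters a list of candidate classifiers down to its non-dominated set; it is not the GBML algorithm itself. The function name `pareto_front` and the candidate values are hypothetical.

```python
def pareto_front(candidates):
    """Return the non-dominated candidates, in their original order.

    Each candidate is a tuple (correctly_classified, n_rules): the first
    objective is maximized and the second is minimized.
    """
    def dominates(a, b):
        # a dominates b if it is no worse on both objectives
        # and strictly better on at least one of them.
        no_worse = a[0] >= b[0] and a[1] <= b[1]
        strictly_better = a[0] > b[0] or a[1] < b[1]
        return no_worse and strictly_better

    return [
        c for c in candidates
        if not any(dominates(other, c) for other in candidates)
    ]


# Hypothetical rule sets: (training patterns classified correctly, number of rules).
rule_sets = [(90, 10), (85, 4), (80, 6), (70, 2), (90, 12)]
print(pareto_front(rule_sets))  # [(90, 10), (85, 4), (70, 2)]
```

Plotting the surviving points with complexity on one axis and accuracy on the other gives the kind of tradeoff curve the abstract refers to.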

16.
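Credit classification is an important step in credit risk management. Its main purpose is to distinguish creditworthy applicants from defaulting ones based on the information that credit applicants provide, thereby giving credit decision-makers a basis for their decisions. To distinguish different types of credit customers correctly, defaulting customers in particular, kernel principal component analysis (KPCA) is combined with the support vector machine algorithm to construct a KPCA-based least-squares fuzzy support vector machine model with variable penalty factors, which is used to classify credit data. In this model, the sample data are first preprocessed; KPCA is then used to reduce the dimensionality of the data nonlinearly; finally, the least-squares fuzzy support vector machine with variable penalty factors classifies the dimension-reduced data. For validation, two publicly available credit data sets are used in an empirical analysis. The results show that the model achieves good classification results and can provide credit decision-makers with an important reference.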

17.
In this paper, we propose a genetic programming (GP) based approach to evolve fuzzy rule based classifiers. For a c-class problem, a classifier consists of c trees. Each tree, T_i, of the multi-tree classifier represents a set of rules for class i. During the evolutionary process, the inaccurate/inactive rules of the initial set of rules are removed by a cleaning scheme. This allows good rules to survive, which eventually determines the number of rules. In the beginning, our GP scheme uses a randomly selected subset of features and then evolves the features to be used in each rule. The initial rules are constructed using prototypes, which are generated randomly as well as by the fuzzy k-means (FKM) algorithm. Experiments are conducted in three different ways: using only randomly generated rules, using a mixture of randomly generated rules and FKM prototype based rules, and using exclusively FKM prototype based rules. The performance of the classifiers is comparable irrespective of the type of initial rules. This emphasizes the novelty of the proposed evolutionary scheme. In this context, we propose a new mutation operation to alter the rule parameters. The GP scheme optimizes the structure of rules as well as the parameters involved. The method is validated on six benchmark data sets and the performance of the proposed scheme is found to be satisfactory.

18.
Fierce competition and the recent financial crisis in the financial and banking industries have made credit scoring more important. An accurate estimation of credit risk helps organizations to decide whether or not to grant credit to potential customers. Many classification methods have been suggested in the literature to handle this problem. This paper proposes a model for evaluating credit risk based on binary quantile regression, using Bayesian estimation. It points out the distinct advantages of this approach: (i) the method provides accurate predictions of which customers may default in the future, (ii) the approach provides detailed insight into the effects of the explanatory variables on the probability of default, and (iii) the methodology is ideally suited to building a segmentation scheme of the customers in terms of risk of default and the corresponding uncertainty about the prediction. An often-studied dataset from a German bank is used to show the applicability of the proposed method. The results demonstrate that the methodology can be an important tool for credit companies that want to take the credit risk of their customers fully into account.

19.
Mathematical programming (MP) discriminant analysis models can be used to develop classification models for assigning observations of unknown class membership to one of a number of specified classes using values of a set of features associated with each observation. Since most MP discriminant analysis models generate linear discriminant functions, these MP models are generally used to develop linear classification models. Nonlinear classifiers may, however, have better classification performance than linear classifiers. In this paper, a mixed integer programming model is developed to generate nonlinear discriminant functions composed of monotone piecewise-linear marginal utility functions for each feature and the cut-off value for class membership. It is also shown that this model can be extended for feature selection. The performance of this new MP model for two-group discriminant analysis is compared with statistical discriminant analysis and other MP discriminant analysis models using a real problem and a number of simulated problem sets.

20.
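This paper builds a credit risk evaluation system for commercial banks covering asset quality, capital adequacy, management and control effectiveness, profitability, liquidity, and social sensitivity. Large-sample data are simulated according to a smoothing-expansion principle to expand the rating scores, and the banks' credit risk grades are then delineated from the expanded sample. This resolves the difficulty that credit grades cannot be reasonably delineated when samples are few. Empirical analysis shows that the bank ratings obtained in this paper share the same ordering as the ratings provided by Standard & Poor's. The model can therefore be used to rate the risk of most banks that have not been rated by an authoritative international agency.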
