Similar Documents
20 similar documents found (search time: 262 ms)
1.
Kernel logistic regression (KLR) is a very powerful algorithm that has been shown to be very competitive with many state-of-the-art machine learning algorithms such as support vector machines (SVM). Unlike SVM, KLR can be easily extended to multi-class problems and produces class posterior probability estimates, making it very useful for many real-world applications. However, the training of KLR using gradient-based methods or iteratively re-weighted least squares can be unbearably slow for large datasets. Coupled with a poorly conditioned design matrix and the need for parameter tuning, training KLR can quickly become infeasible for some real datasets. The goal of this paper is to present simple, fast, scalable, and efficient algorithms for learning KLR. First, based on a simple approximation of the logistic function, a least-squares algorithm for KLR is derived that avoids the iterative tuning of gradient-based methods. Second, inspired by extreme learning machine (ELM) theory, an explicit feature space is constructed through a generalized single-hidden-layer feedforward network and used for training iteratively re-weighted least squares KLR (IRLS-KLR) and the newly proposed least-squares KLR (LS-KLR). Finally, for large-scale and/or poorly conditioned problems, a robust and efficient preconditioned learning technique is proposed for learning the algorithms presented in the paper. Numerical results on a series of artificial and 12 real benchmark datasets show, first, that LS-KLR compares favorably with SVM and traditional IRLS-KLR in terms of accuracy and learning speed. Second, the extension of ELM to KLR results in simple, scalable and very fast algorithms with generalization performance comparable to their original versions. Finally, the introduced preconditioned learning method can significantly increase the learning speed of IRLS-KLR.
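None of the paper's algorithms are reproduced here; as context for what LS-KLR and the preconditioning are meant to speed up, the following is a minimal NumPy sketch of plain binary kernel logistic regression trained by IRLS. The Gaussian kernel width, ridge term, and toy data are illustrative assumptions, not values from the paper.

```python
import numpy as np

def gaussian_kernel(X, Z, gamma=1.0):
    """RBF kernel matrix between the rows of X and Z."""
    d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * d2)

def klr_irls(X, y, gamma=1.0, lam=1e-2, n_iter=30):
    """Binary kernel logistic regression (y in {0, 1}) trained by IRLS."""
    K = gaussian_kernel(X, X, gamma)
    n = len(y)
    alpha = np.zeros(n)
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-np.clip(K @ alpha, -30, 30)))  # posterior estimates
        W = np.maximum(p * (1.0 - p), 1e-8)                     # IRLS weights
        grad = p - y + lam * alpha                              # gradient, up to a factor K
        H = W[:, None] * K + lam * np.eye(n)                    # reduced Newton system
        alpha -= np.linalg.solve(H, grad)
    return alpha

def klr_predict_proba(alpha, X_train, X_new, gamma=1.0):
    return 1.0 / (1.0 + np.exp(-gaussian_kernel(X_new, X_train, gamma) @ alpha))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 2))
    y = (X[:, 0] ** 2 + X[:, 1] ** 2 > 1.0).astype(float)       # toy nonlinear labels
    alpha = klr_irls(X, y)
    acc = ((klr_predict_proba(alpha, X, X) > 0.5) == y).mean()
    print(f"training accuracy: {acc:.2f}")
```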

2.
Methods for analyzing or learning from “fuzzy data” have attracted increasing attention in recent years. In many cases, however, existing methods (for precise, non-fuzzy data) are extended to the fuzzy case in an ad-hoc manner, without carefully considering the interpretation of a fuzzy set when it is used for modeling data. Distinguishing between an ontic and an epistemic interpretation of fuzzy set-valued data, and focusing on the latter, we argue that a “fuzzification” of learning algorithms based on an application of the generic extension principle is not appropriate. In fact, the extension principle fails to properly exploit the inductive bias underlying statistical and machine learning methods, although this bias, at least in principle, offers a means for “disambiguating” the fuzzy data. We therefore propose an alternative method, based on the generalization of loss functions in empirical risk minimization, which performs model identification and data disambiguation simultaneously. Elaborating on the fuzzification of specific types of losses, we establish connections to well-known loss functions in regression and classification. We compare our approach with related methods and illustrate its use in logistic regression for binary classification.
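As a hedged illustration of the core idea (not the authors' formulation), the sketch below treats epistemic fuzzy targets in the simplest possible way, as intervals, generalizes the squared loss by taking its minimum over each interval, and alternates between refitting the model and disambiguating the data. The linear model, the toy data, and all constants are assumptions for the example.

```python
import numpy as np

def fit_interval_regression(X, lo, hi, n_iter=50):
    """Linear least squares under the generalized loss
    sum_i min_{y in [lo_i, hi_i]} (y - f(x_i))^2,
    solved by alternating disambiguation and refitting."""
    Xb = np.hstack([X, np.ones((len(X), 1))])         # add intercept column
    y = (lo + hi) / 2.0                               # initial disambiguation
    w = np.linalg.lstsq(Xb, y, rcond=None)[0]
    for _ in range(n_iter):
        pred = Xb @ w
        y = np.clip(pred, lo, hi)                     # best point inside each interval
        w = np.linalg.lstsq(Xb, y, rcond=None)[0]     # refit on disambiguated targets
    return w

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.uniform(-1, 1, size=(100, 1))
    y_true = 2.0 * X[:, 0] + 0.5
    width = rng.uniform(0.1, 1.0, size=100)           # epistemic imprecision
    lo, hi = y_true - width, y_true + width           # only intervals are observed
    print("recovered coefficients:", fit_interval_regression(X, lo, hi))
```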

3.
An approach to dealing with missing data, both during the design and the normal operation of a neuro-fuzzy classifier, is presented in this paper. Missing values are processed within a general fuzzy min–max neural network architecture utilising hyperbox fuzzy sets as input data cluster prototypes. An emphasis is put on ways of quantifying the uncertainty which missing data might have caused. This takes the form of a classification procedure whose primary objective is the reduction of the number of viable alternatives, rather than attempting to produce one winning class without supporting evidence. If required, ways of selecting the most probable class among the viable alternatives found during the primary classification step, based on data frequency information, are also proposed. The reliability of the classification and the completeness of the information are communicated by producing upper and lower classification membership values, similar in essence to the plausibility and belief measures found in the theory of evidence, or to the possibility and necessity values found in fuzzy set theory. Similarities and differences between the proposed method and various fuzzy, neuro-fuzzy and probabilistic algorithms are also discussed. A number of simulation results for well-known data sets are provided in order to illustrate the properties and performance of the proposed approach.
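The paper's exact procedure is not reproduced here; the sketch below only illustrates, under stated assumptions, how a hyperbox (min–max) membership can yield an upper and a lower value when an attribute is missing: the optimistic bound lets the missing value fall inside the box, while the pessimistic bound places it at the worse domain boundary. The ramp sensitivity `gamma`, the [0, 1] scaling, and the example hyperbox are all assumptions.

```python
import numpy as np

def ramp(r, gamma=4.0):
    """Simpson-style ramp function used in min-max membership."""
    return np.clip(r * gamma, 0.0, 1.0)

def hyperbox_membership(x, v, w, gamma=4.0):
    """Membership of point x in the hyperbox [v, w], data scaled to [0, 1].
    Missing attributes (NaN) produce an interval: an optimistic upper bound
    (the value could lie inside the box) and a pessimistic lower bound
    (the value lies at the less favourable domain boundary)."""
    upper, lower = [], []
    for xj, vj, wj in zip(x, v, w):
        if np.isnan(xj):
            upper.append(1.0)
            worst = min(1.0 - ramp(0.0 - wj, gamma) - ramp(vj - 0.0, gamma),
                        1.0 - ramp(1.0 - wj, gamma) - ramp(vj - 1.0, gamma))
            lower.append(worst)
        else:
            m = 1.0 - ramp(xj - wj, gamma) - ramp(vj - xj, gamma)
            upper.append(m)
            lower.append(m)
    return float(np.mean(lower)), float(np.mean(upper))

if __name__ == "__main__":
    v, w = np.array([0.2, 0.3]), np.array([0.5, 0.6])   # hyperbox min/max corners
    print(hyperbox_membership(np.array([0.4, np.nan]), v, w))
```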

4.
In this paper we present a new approach to optimal forecasting using fuzzy set theory and soft computing methods for dynamic data analysis. The research is based on the concept of fuzzy membership functions as well as on natural selection from evolution theory. Some discussion of the sensitivity of the fuzzy-processing design is provided. In the genetic-evolution design, the AIC criterion is used as the adjustment (fitness) function, and the fuzzy membership function of each gene model is calculated. Simulation and empirical examples show that the proposed forecasting technique can give optimal forecasts in time series analysis.
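The genetic and fuzzy components are not specified in the abstract, so they are not sketched here; the snippet below only illustrates the AIC part as a model-evaluation function, using least-squares AR(p) fits as hypothetical candidate models. The AR family, the AIC formula variant, and all constants are assumptions for the example.

```python
import numpy as np

def fit_ar(x, p):
    """Least-squares fit of an AR(p) model with intercept; returns the residual sum of squares."""
    n = len(x)
    y = x[p:]
    X = np.column_stack([x[p - k:n - k] for k in range(1, p + 1)])
    X = np.hstack([X, np.ones((n - p, 1))])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return float(resid @ resid)

def aic(rss, n_obs, k_params):
    """Gaussian-likelihood AIC for a least-squares fit."""
    return n_obs * np.log(rss / n_obs) + 2 * k_params

if __name__ == "__main__":
    rng = np.random.default_rng(2)
    x = np.zeros(300)
    for t in range(2, 300):                     # simulate an AR(2) series
        x[t] = 0.6 * x[t - 1] - 0.3 * x[t - 2] + rng.normal(scale=0.5)
    scores = {p: aic(fit_ar(x, p), len(x) - p, p + 1) for p in range(1, 7)}
    print("AIC by order:", {p: round(v, 1) for p, v in scores.items()})
    print("selected order:", min(scores, key=scores.get))
```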

5.
Data envelopment analysis (DEA) is a methodology for measuring the relative efficiencies of a set of decision making units (DMUs) that use multiple inputs to produce multiple outputs. Crisp input and output data are fundamentally indispensable in conventional DEA. However, the observed values of the input and output data in real-world problems are sometimes imprecise or vague. Many researchers have proposed various fuzzy methods for dealing with the imprecise and ambiguous data in DEA. In this study, we provide a taxonomy and review of the fuzzy DEA methods. We present a classification scheme with four primary categories, namely, the tolerance approach, the α-level based approach, the fuzzy ranking approach and the possibility approach. We discuss each classification scheme and group the fuzzy DEA papers published in the literature over the past 20 years. To the best of our knowledge, this paper appears to be the only review and complete source of references on fuzzy DEA.

6.
This paper proposes fuzzy symbolic modeling as a framework for intelligent data analysis and model interpretation in classification and regression problems. The fuzzy symbolic modeling approach is based on the eigenstructure analysis of the data similarity matrix to define the number of fuzzy rules in the model. Each fuzzy rule is associated with a symbol and is defined by a Gaussian membership function. The prototypes for the rules are computed by a clustering algorithm, and the model output parameters are computed as the solutions of a bounded quadratic optimization problem. In classification problems, the rules’ parameters are interpreted as the rules’ confidence. In regression problems, the rules’ parameters are used to derive rules’ confidences for classes that represent ranges of output variable values. The resulting model is evaluated on a set of benchmark datasets for classification and regression problems. Nonparametric statistical tests performed on the benchmark results show that the proposed approach produces compact fuzzy models with accuracy comparable to models produced by standard modeling approaches. The resulting model is also examined from the interpretability point of view, showing how the rule weights provide additional information that helps in data and model understanding, so that the model can be used as a decision support tool for the prediction of new data.
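The paper's eigenstructure criterion and bounded quadratic program are not given in the abstract, so the sketch below is only one plausible reading under explicit assumptions: the number of rules is taken as the number of leading eigenvalues of a Gaussian similarity matrix needed to retain 95% of its spectral energy, prototypes come from k-means, and the Gaussian memberships use a single illustrative width.

```python
import numpy as np
from sklearn.cluster import KMeans

def gaussian_similarity(X, sigma=1.0):
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def choose_n_rules(S, energy=0.95):
    """Number of rules = leading eigenvalues needed for the given spectral-energy share."""
    eig = np.sort(np.linalg.eigvalsh(S))[::-1]
    ratio = np.cumsum(eig) / eig.sum()
    return int(np.searchsorted(ratio, energy) + 1)

def build_rules(X, n_rules, sigma=1.0):
    """Cluster prototypes and Gaussian rule memberships for every sample."""
    centers = KMeans(n_clusters=n_rules, n_init=10, random_state=0).fit(X).cluster_centers_
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    return centers, np.exp(-d2 / (2.0 * sigma ** 2))

if __name__ == "__main__":
    rng = np.random.default_rng(3)
    X = np.vstack([rng.normal(loc, 0.3, size=(50, 2)) for loc in ([0, 0], [3, 0], [0, 3])])
    r = choose_n_rules(gaussian_similarity(X))
    centers, U = build_rules(X, r)
    print("number of rules:", r, "membership matrix shape:", U.shape)
```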

7.
8.
In machine learning problems, the availability of several classifiers trained on different data or features makes the combination of pattern classifiers of great interest. To combine distinct sources of information, it is necessary to represent the outputs of the classifiers in a common space via a transformation called calibration. The most classical way is to use class membership probabilities. However, a single probability measure may be insufficient to model the uncertainty induced by the calibration step, especially when training data are scarce. In this paper, we extend classical probabilistic calibration methods to the evidential framework. Experimental results from the calibration of SVM classifiers show the interest of using belief functions in classification problems.
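The evidential extension itself is not reproduced; as a point of reference, here is a short scikit-learn sketch of the classical starting point the paper builds on: Platt-style sigmoid calibration of SVM decision scores on a held-out split. The dataset, the linear SVM, and all hyperparameters are illustrative assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import LinearSVC

# Toy data: train the SVM on one split and fit the calibration map on another.
X, y = make_classification(n_samples=600, n_features=10, random_state=0)
X_train, X_cal, y_train, y_cal = train_test_split(X, y, test_size=0.3, random_state=0)

svm = LinearSVC(C=1.0, max_iter=5000).fit(X_train, y_train)
scores_cal = svm.decision_function(X_cal).reshape(-1, 1)

# Platt-style calibration: a 1-D logistic regression mapping scores to probabilities.
calibrator = LogisticRegression().fit(scores_cal, y_cal)
probs = calibrator.predict_proba(scores_cal)[:, 1]
print("first five calibrated probabilities:", np.round(probs[:5], 3))
```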

9.
Optimization theory provides a framework for determining the best decisions or actions with respect to some mathematical model of a process. This paper focuses on learning to act in a near-optimal manner through reinforcement learning for problems that either have no model or whose model is too complex. One approach to solving this class of problems is approximate dynamic programming. The application of these methods is established primarily for the case of discrete state and action spaces. In this paper we develop efficient methods of learning to act in complex systems with continuous state and action spaces. Monte-Carlo approaches are employed to estimate function values in an iterative, incremental procedure. Derivative-free line search methods are used to obtain a near-optimal action in the continuous action space for a discrete subset of the state space. This near-optimal control policy is then extended to the entire continuous state space via a fuzzy additive model. To compensate for approximation errors, a modified procedure for perturbing the generated control policy is developed. Convergence results under moderate assumptions and stopping criteria are established.
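Only the derivative-free line-search ingredient is sketched below: a golden-section search that picks a near-optimal action on a continuous interval for one sampled state. The quadratic `q_estimate` is a made-up stand-in for a Monte-Carlo value estimate; the Monte-Carlo estimation, fuzzy additive policy model, and perturbation procedure from the abstract are not reproduced.

```python
import numpy as np

def golden_section_max(f, lo, hi, tol=1e-4):
    """Derivative-free line search: maximize f on [lo, hi] (f assumed unimodal)."""
    phi = (np.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c, d = b - phi * (b - a), a + phi * (b - a)
    while b - a > tol:
        if f(c) >= f(d):
            b, d = d, c                 # maximum lies in [a, d]; reuse old c
            c = b - phi * (b - a)
        else:
            a, c = c, d                 # maximum lies in [c, b]; reuse old d
            d = a + phi * (b - a)
    return (a + b) / 2.0

def q_estimate(action, state=0.5):
    """Illustrative stand-in for a Monte-Carlo Q-value estimate at one state."""
    return -(action - np.sin(3.0 * state)) ** 2

best_a = golden_section_max(lambda a: q_estimate(a), lo=-1.0, hi=1.0)
print(f"near-optimal action for the sampled state: {best_a:.3f}")
```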

10.
Pattern classification is one of the main themes in pattern recognition, and has been tackled by several methods, such as statistical methods, artificial neural networks, mathematical programming and so on. Among them, the multi-surface method proposed by Mangasarian is very attractive, because it can provide an exact discrimination function even for highly nonlinear problems without any assumption on the data distribution. However, the method often causes many slits on the discrimination curve. In other words, the piecewise linear discrimination curve is sometimes too complex, resulting in poor generalization ability. In this paper, several attempts to overcome the difficulties of the multi-surface method are suggested. One of them is the utilization of goal programming, in which the auxiliary linear programming problem is formulated as a goal program in order to obtain discrimination curves that are as simple as possible. Another is to apply fuzzy programming, by which fuzzy discrimination curves with gray zones can be obtained. In addition, it is shown that additional learning can easily be carried out with the suggested methods. These features make the discrimination more realistic. The effectiveness of the methods is shown on the basis of some applications.

11.
Fuzzy Sets and Systems, 2004, 141(1): 47–58
This paper presents a novel boosting algorithm for genetic learning of fuzzy classification rules. The method is based on the iterative rule learning approach to fuzzy rule-base system design. The fuzzy rule base is generated in an incremental fashion, in that the evolutionary algorithm optimizes one fuzzy classifier rule at a time. The boosting mechanism reduces the weight of those training instances that are classified correctly by the new rule. Therefore, the next rule-generation cycle focuses on fuzzy rules that account for the currently uncovered or misclassified instances. The weight of a fuzzy rule reflects the relative strength the boosting algorithm assigns to the rule class when it aggregates the cast votes. The approach is compared with other classification algorithms on a number of problem sets from the UCI repository.
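Neither the genetic search nor the fuzzy rule representation is reproduced below; the sketch only illustrates the boosting-style reweighting described in the abstract, with a crude crisp interval rule and exhaustive search as stand-ins for the evolutionary rule learner. The down-weighting factor `beta` and the toy data are assumptions.

```python
import numpy as np

def rule_covers(X, rule):
    """A toy crisp rule: an axis-aligned interval on a single feature."""
    feat, lo, hi = rule
    return (X[:, feat] >= lo) & (X[:, feat] <= hi)

def learn_one_rule(X, y, weights):
    """Pick the single-feature interval with the best weighted score for class 1
    (a crude stand-in for one cycle of the evolutionary rule search)."""
    best, best_score = None, -np.inf
    for feat in range(X.shape[1]):
        cuts = np.quantile(X[:, feat], np.linspace(0.0, 1.0, 6))
        for lo in cuts:
            for hi in cuts:
                if hi <= lo:
                    continue
                cov = rule_covers(X, (feat, lo, hi))
                score = weights[cov & (y == 1)].sum() - weights[cov & (y == 0)].sum()
                if score > best_score:
                    best, best_score = (feat, lo, hi), score
    return best

def boosted_rule_set(X, y, n_rules=3, beta=0.3):
    weights = np.ones(len(y))
    rules = []
    for _ in range(n_rules):
        rule = learn_one_rule(X, y, weights)
        rules.append(rule)
        correct = rule_covers(X, rule) & (y == 1)
        weights[correct] *= beta       # down-weight instances the new rule already explains
    return rules

if __name__ == "__main__":
    rng = np.random.default_rng(4)
    X = rng.uniform(0, 1, size=(200, 3))
    y = ((X[:, 0] > 0.7) | (X[:, 1] < 0.2)).astype(int)
    print(boosted_rule_set(X, y))
```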

12.
In recent years, an increasing number of so-called local classification methods have been developed in the fields of statistics and machine learning. Local approaches to classification are not new, but have lately become popular. Well-known examples are the $k$ nearest neighbors method and classification trees. However, in most publications on this topic the term “local” is used without further explanation of its particular meaning. Only little is known about the properties of local methods and the types of classification problems for which they may be beneficial. We explain the basic principles and introduce the most important variants of local methods. To our knowledge there are very few extensive studies in the literature that compare several types of local methods and global methods across many data sets. In order to assess their performance we conduct a benchmark study on real-world and synthetic tasks. We cluster the data sets and the considered learning algorithms with regard to the obtained performance structures and try to relate our theoretical considerations and intuitions to these results. We also address some general issues of benchmark studies and cover some pitfalls, extensions and improvements.

13.
Fuzzy Sets and Systems, 2004, 141(2): 203–217
In this paper, we introduce a new classification procedure for assigning objects to predefined classes, named PROCFTN. The procedure is based on a fuzzy scoring function for choosing a subset of prototypes that bear the closest resemblance to the object to be assigned; it then applies the majority-voting rule to assign the object to a class. We also present a medical application of this procedure as an aid to the diagnosis of central nervous system tumours. The results are compared with those obtained by other classification methods reported on the same data set, including decision tree, production rules, neural network, k nearest neighbor, multilayer perceptron and logistic regression. Our results are very encouraging and show that the multicriteria decision analysis approach can be successfully used to help medical diagnosis.
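PROCFTN's actual multicriteria scoring function is not given in the abstract, so the sketch below substitutes a simple Gaussian similarity score, purely to illustrate the two-stage scheme the abstract describes: score the prototypes, keep the best few, and assign the object by majority vote among their classes. All parameters and data are made up for the example.

```python
import numpy as np

def assign_by_prototypes(x, prototypes, proto_labels, k=5, gamma=2.0):
    """Score prototypes by a simple fuzzy similarity, keep the k best,
    and assign the object by majority vote among their classes."""
    d2 = ((prototypes - x) ** 2).sum(axis=1)
    scores = np.exp(-gamma * d2)                 # illustrative stand-in scoring function
    best = np.argsort(scores)[::-1][:k]
    classes, counts = np.unique(proto_labels[best], return_counts=True)
    return classes[np.argmax(counts)]

if __name__ == "__main__":
    rng = np.random.default_rng(6)
    prototypes = np.vstack([rng.normal(0, 0.4, (20, 2)), rng.normal(2, 0.4, (20, 2))])
    proto_labels = np.array([0] * 20 + [1] * 20)
    print(assign_by_prototypes(np.array([1.8, 2.1]), prototypes, proto_labels))
```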

14.
Kernel methods and rough sets are two general pursuits in the domain of machine learning and intelligent systems. Kernel methods map data into a higher-dimensional feature space, where the resulting structure of the classification task is linearly separable; rough sets granulate the universe with the use of relations and employ the induced knowledge granules to approximate arbitrary concepts existing in the problem at hand. Although it seems there is no connection between these two methodologies, both kernel methods and rough sets explicitly or implicitly rely on relation matrices to represent the structure of the sample information. Based on this observation, we combine the two methodologies by incorporating the Gaussian kernel into fuzzy rough sets and propose a Gaussian kernel approximation based fuzzy rough set model. Fuzzy T-equivalence relations constitute the fundamentals of most fuzzy rough set models. It is proven that fuzzy relations computed with the Gaussian kernel are reflexive, symmetric and transitive. Gaussian kernels are introduced to acquire fuzzy relations between samples described by fuzzy or numeric attributes in order to carry out fuzzy rough data analysis. Moreover, we discuss information entropy to evaluate the kernel matrix and calculate the uncertainty of the approximation. Several functions are constructed for evaluating the significance of features based on kernel approximation and fuzzy entropy. Algorithms for feature ranking and reduction based on the proposed functions are designed. Results of experimental analysis are included to quantify the effectiveness of the proposed methods.
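As a hedged illustration, using a standard Kleene–Dienes-style formulation rather than necessarily the paper's exact operators, the sketch below builds a Gaussian-kernel fuzzy relation matrix and computes lower and upper approximations of a crisp decision class. The kernel width and the toy data are assumptions.

```python
import numpy as np

def gaussian_relation(X, sigma=0.5):
    """Fuzzy similarity relation induced by the Gaussian kernel (reflexive, symmetric)."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def fuzzy_rough_approximations(R, labels, cls):
    """Fuzzy-rough lower/upper approximation of the crisp decision class `cls`."""
    D = (labels == cls).astype(float)                    # class membership of every sample
    lower = np.min(np.maximum(1.0 - R, D[None, :]), axis=1)
    upper = np.max(np.minimum(R, D[None, :]), axis=1)
    return lower, upper

if __name__ == "__main__":
    rng = np.random.default_rng(5)
    X = np.vstack([rng.normal(0, 0.3, (30, 2)), rng.normal(2, 0.3, (30, 2))])
    y = np.array([0] * 30 + [1] * 30)
    low, up = fuzzy_rough_approximations(gaussian_relation(X), y, cls=1)
    # Samples deep inside class 1 get a lower approximation close to 1.
    print("mean lower membership of class-1 samples:", round(float(low[y == 1].mean()), 3))
```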

15.
A fuzzy random forest
When individual classifiers are combined appropriately, a statistically significant increase in classification accuracy is usually obtained. Multiple classifier systems are the result of combining several individual classifiers. Following Breiman’s methodology, in this paper a multiple classifier system based on a “forest” of fuzzy decision trees, i.e., a fuzzy random forest, is proposed. This approach combines the robustness of multiple classifier systems, the power of randomness to increase the diversity of the trees, and the flexibility of fuzzy logic and fuzzy sets for imperfect data management. Various combination methods to obtain the final decision of the multiple classifier system are proposed and compared. Some of them are weighted combination methods which weight the decisions of the different elements of the multiple classifier system (leaves or trees). A comparative study with several datasets is made to show the efficiency of the proposed multiple classifier system and the various combination methods. The proposed multiple classifier system exhibits good classification accuracy, comparable to that of the best classifiers, when tested with conventional data sets. However, unlike other classifiers, the proposed classifier provides similar accuracy when tested with imperfect datasets (with missing and fuzzy values) and with noisy datasets.

16.
In recent years, several methods have been proposed to deal with functional data classification problems (e.g., one-dimensional curves or two- or three-dimensional images). One popular general approach is the kernel-based method proposed by Ferraty and Vieu (Comput Stat Data Anal 44:161–173, 2003). The performance of this general method depends heavily on the choice of the semi-metric. Motivated by Fan and Lin (J Am Stat Assoc 93:1007–1021, 1998) and our image data, we propose a new semi-metric based on wavelet thresholding for classifying functional data. This wavelet-thresholding semi-metric is able to adapt to the smoothness of the data and provides particularly good classification when data features are localized and/or sparse. We conduct simulation studies to compare our proposed method with several functional classification methods and study the relative performance of the methods for classifying positron emission tomography images.
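The adaptive thresholding rule from the paper is not specified in the abstract; the sketch below, which assumes the PyWavelets package is available, uses a fixed soft threshold on Daubechies-4 coefficients only to illustrate the general shape of a wavelet-domain semi-metric between two curves. Wavelet, decomposition level, and threshold are illustrative choices.

```python
import numpy as np
import pywt  # PyWavelets; assumed available

def thresholded_coeffs(x, wavelet="db4", level=4, thr=0.1):
    """Discrete wavelet transform of a curve followed by soft thresholding."""
    coeffs = pywt.wavedec(x, wavelet, level=level)
    coeffs = [pywt.threshold(c, thr, mode="soft") for c in coeffs]
    return np.concatenate(coeffs)

def wavelet_semi_metric(x1, x2, **kwargs):
    """Distance between two curves in the thresholded wavelet-coefficient domain."""
    return float(np.linalg.norm(thresholded_coeffs(x1, **kwargs) - thresholded_coeffs(x2, **kwargs)))

if __name__ == "__main__":
    t = np.linspace(0.0, 1.0, 256)
    rng = np.random.default_rng(7)
    curve_a = np.sin(8 * np.pi * t) + 0.1 * rng.normal(size=t.size)
    curve_b = np.sin(8 * np.pi * t) + 0.1 * rng.normal(size=t.size)
    curve_c = np.sign(np.sin(8 * np.pi * t)) + 0.1 * rng.normal(size=t.size)
    print("same shape:     ", round(wavelet_semi_metric(curve_a, curve_b), 3))
    print("different shape:", round(wavelet_semi_metric(curve_a, curve_c), 3))
```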

17.
Many learning problems are described by a risk functional which is in turn defined by a loss function, and a straightforward and widely known approach to learning such problems is to minimize a (modified) empirical version of this risk functional. However, in many cases this approach suffers from substantial problems, such as computational requirements in classification or robustness concerns in regression. In order to resolve these issues, many successful learning algorithms instead try to minimize a (modified) empirical risk of a surrogate loss function. Of course, such a surrogate loss must be “reasonably related” to the original loss function, since otherwise this approach cannot work well. For classification, good surrogate loss functions have recently been identified, and the relationship between the excess classification risk and the excess risk of these surrogate loss functions has been described exactly. Beyond the classification problem, however, little is known so far about good surrogate loss functions. In this work we establish a general theory that provides powerful tools for comparing excess risks of different loss functions. We then apply this theory to several learning problems including (cost-sensitive) classification, regression, density estimation, and density level detection.

18.
The proportion exponent is introduced as a measure of the validity of the clustering obtained for a data set using a fuzzy clustering algorithm. It is assumed that the output of such an algorithm includes a fuzzy membership function for each data point. We show how to compute the proportion of possible memberships whose maximum entry exceeds the maximum entry of a given membership function, and use these proportions to define the proportion exponent. Its use as a validity functional is illustrated with four numerical examples, and its effectiveness is compared to other validity functionals, namely classification entropy and partition coefficient.
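The proportion exponent itself is not reproduced here; for context, the sketch below computes the two baseline validity functionals the abstract compares it against, the partition coefficient and the classification entropy, from a fuzzy membership matrix. The example matrices are made up for illustration.

```python
import numpy as np

def partition_coefficient(U):
    """Partition coefficient of an n-by-c fuzzy membership matrix (rows sum to 1);
    values close to 1 indicate a crisp partition."""
    return float((U ** 2).sum() / U.shape[0])

def classification_entropy(U, eps=1e-12):
    """Average entropy of the fuzzy memberships; lower means a crisper partition."""
    return float(-(U * np.log(U + eps)).sum() / U.shape[0])

if __name__ == "__main__":
    crisp = np.array([[0.95, 0.05], [0.02, 0.98], [0.90, 0.10]])
    vague = np.array([[0.55, 0.45], [0.48, 0.52], [0.60, 0.40]])
    for name, U in [("crisp-ish", crisp), ("vague", vague)]:
        print(name, round(partition_coefficient(U), 3), round(classification_entropy(U), 3))
```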

19.
The support vector machine (SVM) is known for its good performance in two-class classification, but its extension to multiclass classification is still an ongoing research issue. In this article, we propose a new approach for classification, called the import vector machine (IVM), which is built on kernel logistic regression (KLR). We show that the IVM not only performs as well as the SVM in two-class classification, but also can naturally be generalized to the multiclass case. Furthermore, the IVM provides an estimate of the underlying probability. Similar to the support points of the SVM, the IVM model uses only a fraction of the training data to index kernel basis functions, typically a much smaller fraction than the SVM. This gives the IVM a potential computational advantage over the SVM.
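The IVM's greedy selection of import points is not reproduced; the sketch below only conveys the underlying idea of kernel logistic regression expanded over a small subset of training points, with a fixed random subset and plain gradient descent standing in for the actual algorithm. All hyperparameters and the toy data are assumptions.

```python
import numpy as np

def rbf(X, Z, gamma=0.5):
    d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def sparse_klr(X, y, n_basis=20, gamma=0.5, lam=1e-3, lr=0.1, n_iter=500, seed=0):
    """Kernel logistic regression expanded over a small subset of training points.
    A fixed random subset stands in for the greedy 'import point' selection."""
    rng = np.random.default_rng(seed)
    basis = X[rng.choice(len(X), size=n_basis, replace=False)]
    K = rbf(X, basis, gamma)
    beta = np.zeros(n_basis)
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-np.clip(K @ beta, -30, 30)))
        grad = K.T @ (p - y) / len(y) + lam * beta
        beta -= lr * grad
    return basis, beta

if __name__ == "__main__":
    rng = np.random.default_rng(8)
    X = rng.normal(size=(300, 2))
    y = (X[:, 0] * X[:, 1] > 0).astype(float)                 # toy nonlinear labels
    basis, beta = sparse_klr(X, y)
    p = 1.0 / (1.0 + np.exp(-rbf(X, basis) @ beta))
    print("training accuracy:", round(float(((p > 0.5) == y).mean()), 3))
```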

20.
Local search methods are widely used to improve the performance of evolutionary computation algorithms in all kinds of domains. Employing advanced and efficient exploration mechanisms becomes crucial in complex problems with very large search spaces, such as when applying evolutionary algorithms to large-scale data mining tasks. Recently, the GAssist Pittsburgh evolutionary learning system was extended with memetic operators for discrete representations that use information from the supervised learning process to heuristically edit classification rules and rule sets. In this paper we first adapt some of these operators to BioHEL, a different evolutionary learning system applying the iterative learning approach, and afterwards propose versions of these operators designed for continuous attributes and for dealing with noise. The performance of all these operators and their combinations is extensively evaluated on a broad range of synthetic large-scale datasets to identify the settings that present the best balance between efficiency and accuracy. Finally, the identified best configurations are compared with other classes of machine learning methods on both synthetic and real-world large-scale datasets and show very competitive performance.
