Similar documents
 20 similar documents found (search time: 31 ms)
1.
Mathematical programming (MP) discriminant analysis models are widely used to generate linear discriminant functions that can be adopted as classification models. Nonlinear classification models may have better classification performance than linear classifiers, but although MP methods can be used to generate nonlinear discriminant functions, functions of specified form must be evaluated separately. Piecewise-linear functions can approximate nonlinear functions, and two new MP methods for generating piecewise-linear discriminant functions are developed in this paper. The first method uses maximization of classification accuracy (MCA) as the objective, while the second uses an approach based on minimization of the sum of deviations (MSD). The use of these new MP models is illustrated in an application to a test problem and the results are compared with those from standard MCA and MSD models.
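The MSD objective lends itself to a compact linear program. The sketch below is illustrative only (toy data and a unit separation gap are arbitrary choices, and it fits a single linear function, not the paper's piecewise-linear models): each training point gets a nonnegative deviation variable d_i measuring how far it falls on the wrong side of its class boundary, and the LP minimizes the total deviation.

```python
import numpy as np
from scipy.optimize import linprog

# Toy two-class data; the fitted rule should give w.x + b >= 1 on class A
# and w.x + b <= -1 on class B, up to the deviations d_i.
A = np.array([[2.0, 2.0], [3.0, 3.0], [2.5, 2.0]])
B = np.array([[0.0, 0.0], [0.5, 1.0], [1.0, 0.5]])

n_a, n_b, p = len(A), len(B), A.shape[1]
n = n_a + n_b
# Decision vector: [w (p entries), b, d_1..d_n]; minimise sum of deviations.
c = np.concatenate([np.zeros(p + 1), np.ones(n)])

# Inequalities G z <= h:
#   class A rows: -(w.x_i + b) - d_i <= -1
#   class B rows:   w.x_j + b  - d_j <= -1
G = np.zeros((n, p + 1 + n))
h = -np.ones(n)
G[:n_a, :p] = -A
G[:n_a, p] = -1.0
G[n_a:, :p] = B
G[n_a:, p] = 1.0
G[np.arange(n), p + 1 + np.arange(n)] = -1.0

bounds = [(None, None)] * (p + 1) + [(0, None)] * n   # w, b free; d >= 0
res = linprog(c, A_ub=G, b_ub=h, bounds=bounds)
w, b = res.x[:p], res.x[p]
```

Since this toy data set is linearly separable, the optimal total deviation is zero and every point lands on the correct side of the gap.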

2.
We propose two methods for tuning the membership functions of a kernel fuzzy classifier based on the idea of SVM (support vector machine) training. We assume that in a kernel fuzzy classifier a fuzzy rule is defined for each class in the feature space. In the first method, we tune the slopes of all the membership functions simultaneously so that the margin between classes is maximized, under the constraint that each data sample's degree of membership in its own class is the maximum among all the classes. This method is similar to a linear all-at-once SVM, and we call it AAO tuning. In the second method, we tune the membership function of one class at a time: for a given class, the slope of the associated membership function is tuned so that the margin between that class and the remaining classes is maximized, under the constraints that the degrees of membership for the data belonging to the class are large and those for the remaining data are small. This method is similar to a linear one-against-all SVM, and we call it OAA tuning. Computer experiments with fuzzy classifiers based on kernel discriminant analysis and with fuzzy classifiers with ellipsoidal regions show that both methods usually improve classification performance, and that AAO tuning performs slightly better than OAA tuning.

3.
The clusterwise regression model is used to perform cluster analysis within a regression framework. While the traditional regression model assumes the regression coefficient (β) to be identical for all subjects in the sample, the clusterwise regression model allows β to vary across clusters of subjects. Since the cluster membership is unknown, estimating a clusterwise regression is a challenging combinatorial optimization problem. In this research, we propose a “Generalized Clusterwise Regression Model” formulated as a mathematical programming (MP) problem. A nonlinear programming procedure (with linear constraints) is proposed to solve the combinatorial problem and to estimate the cluster membership and β simultaneously. Moreover, by integrating cluster analysis with discriminant analysis, a clusterwise discriminant model is developed to incorporate parameter heterogeneity into traditional discriminant analysis. The cluster membership and discriminant parameters are estimated simultaneously by another nonlinear programming model.
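To see why estimating cluster membership and β together is combinatorial, consider the simplest heuristic for the problem. The sketch below is a naive alternating scheme (an illustration only, not the paper's mathematical programming formulation): assign each point to the cluster whose regression line fits it best, then refit each cluster's least-squares coefficients, and repeat.

```python
import numpy as np

def clusterwise_regression(X, y, k=2, iters=20, seed=0):
    """Naive alternating heuristic for clusterwise regression:
    (1) assign each point to the cluster whose current line explains it
    best, (2) refit each cluster's least-squares coefficients."""
    rng = np.random.default_rng(seed)
    labels = rng.integers(k, size=len(y))
    Z = np.column_stack([X, np.ones(len(y))])     # add an intercept column
    betas = []
    for _ in range(iters):
        betas = []
        for c in range(k):
            mask = labels == c
            if mask.sum() < Z.shape[1]:           # keep clusters non-degenerate
                mask = np.ones(len(y), bool)
            beta, *_ = np.linalg.lstsq(Z[mask], y[mask], rcond=None)
            betas.append(beta)
        resid = np.stack([(y - Z @ b) ** 2 for b in betas])
        labels = resid.argmin(axis=0)             # reassign to best-fitting line
    return np.array(betas), labels

# Two planted regression regimes: y = 3x and y = -3x.
x = np.tile(np.linspace(0.0, 5.0, 6), 2)
y = np.concatenate([3.0 * x[:6], -3.0 * x[6:]])
betas, labels = clusterwise_regression(x[:, None], y)
```

Like k-means, this heuristic can stall in poor local optima, which is exactly the difficulty the paper's nonlinear programming formulation is designed to address.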

4.
In developing a classification model for assigning observations of unknown class to one of a number of specified classes using the values of a set of features associated with each observation, it is often desirable to base the classifier on a limited number of features. Mathematical programming discriminant analysis methods for developing classification models can be extended for feature selection. Classification accuracy can be used as the feature selection criterion by using a mixed integer programming (MIP) model in which a binary variable is associated with each training sample observation, but the binary variable requirements limit the size of problems to which this approach can be applied. Heuristic feature selection methods for problems with large numbers of observations are developed in this paper. These heuristic procedures, which are based on the MIP model for maximizing classification accuracy, are then applied to three credit scoring data sets.

5.
Research on mathematical programming approaches to the classification problem has focused almost exclusively on linear discriminant functions with only first-order terms. While many of these first-order models have displayed excellent classificatory performance when compared to Fisher's linear discriminant method, they cannot compete with Smith's quadratic discriminant method on certain data sets. In this paper, we investigate the appropriateness of including second-order terms in mathematical programming models. Various issues are addressed, such as the performance of models with small to moderate sample sizes, the need for cross-product terms, and the loss of power by the mathematical programming models under conditions ideal for the parametric procedures. A simulation study is conducted to assess the relative performance of first-order and second-order mathematical programming models against the parametric procedures. The simulation study indicates that mathematical programming models using polynomial functions may be prone to overfitting on the training samples, which in turn may cause rather poor fits on the validation samples. It also indicates that including cross-product terms may hurt a polynomial model's accuracy on the validation samples, although omitting them means that the model is not invariant to nonsingular transformations of the data.
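Adding second-order terms to an MP model amounts to expanding each observation's feature vector before fitting the (still linear-in-parameters) discriminant function. A minimal sketch of that expansion, including the cross-product terms the paper discusses:

```python
import numpy as np
from itertools import combinations_with_replacement

def quadratic_features(X):
    """Append every second-order term x_i * x_j (squares and
    cross-products) to the original first-order features."""
    X = np.asarray(X, dtype=float)
    second = [X[:, i] * X[:, j]
              for i, j in combinations_with_replacement(range(X.shape[1]), 2)]
    return np.column_stack([X] + second)

# For p = 2 the columns are: x1, x2, x1^2, x1*x2, x2^2.
Z = quadratic_features(np.array([[1.0, 2.0], [3.0, 4.0]]))
```

The expanded matrix Z can then be fed to any first-order MP discriminant model; dropping the `(i, j)` pairs with `i != j` omits the cross-product terms, which is the trade-off the simulation study examines.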

6.
Keiji Tatsumi, Tetsuzo Tanino. TOP, 2014, 22(3): 815–840
Machine learning is an important branch of artificial intelligence. Among many learning models, the support vector machine is a popular model with high classification ability that can be trained by mathematical programming methods. Since the model was originally formulated for binary classification, various extensions have been investigated for multi-class classification. In this paper, we review some existing models and introduce new models that we recently proposed. The models are derived from the viewpoint of multi-objective maximization of geometric margins for a discriminant function, and each model can be trained by solving a second-order cone programming problem. Through numerical experiments, we show that discriminant functions with high generalization ability can be obtained by these models.

7.
In this paper, we study the performance of various state-of-the-art classification algorithms applied to eight real-life credit scoring data sets. Some of the data sets originate from major Benelux and UK financial institutions. Different types of classifiers are evaluated and compared. Besides the well-known classification algorithms (e.g. logistic regression, discriminant analysis, k-nearest neighbour, neural networks and decision trees), this study also investigates the suitability and performance of some recently proposed, advanced kernel-based classification algorithms such as support vector machines and least-squares support vector machines (LS-SVMs). The performance is assessed using the classification accuracy and the area under the receiver operating characteristic curve. Statistically significant performance differences are identified using the appropriate test statistics. It is found that both the LS-SVM and neural network classifiers yield a very good performance, but also that simple classifiers such as logistic regression and linear discriminant analysis perform very well for credit scoring.
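One of the two evaluation metrics used above, the area under the ROC curve, has a simple rank-statistic form: it is the probability that a randomly chosen positive case is scored above a randomly chosen negative case, with ties counted as one half. A minimal sketch (toy scores, not the paper's data):

```python
import numpy as np

def auc(scores, labels):
    """Area under the ROC curve via the Mann-Whitney rank statistic:
    P(score of a random positive > score of a random negative),
    counting ties as one half."""
    scores, labels = np.asarray(scores, float), np.asarray(labels)
    pos, neg = scores[labels == 1], scores[labels == 0]
    diffs = pos[:, None] - neg[None, :]          # all positive/negative pairs
    return ((diffs > 0).sum() + 0.5 * (diffs == 0).sum()) / (len(pos) * len(neg))

perfect = auc([0.9, 0.8, 0.3, 0.2], [1, 1, 0, 0])   # perfect ranking
```

Unlike classification accuracy, this statistic is insensitive to the choice of cut-off score, which is why credit scoring studies routinely report both.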

8.
In this paper, we show that the quantum logic of linear subspaces can be used for the recognition of random signals by a Bayesian energy discriminant classifier. The energy distribution on linear subspaces is described by the correlation matrix of the probability distribution. We show that the correlation matrix corresponds to the von Neumann density matrix in quantum theory. We suggest an interpretation of quantum logic as a fuzzy logic of fuzzy sets. The use of quantum logic for recognition is based on the fact that the probability distribution of each class lies approximately in a lower-dimensional subspace of the feature space. We interpret discriminant functions as membership functions of fuzzy sets, and we also propose a quality functional for the optimal choice of discriminant functions from a given class of discriminant functions.

9.
This paper presents an analysis of credit rating using fuzzy rule-based systems. The disadvantage of the models used in previous studies is that it is difficult to extract understandable knowledge from them. The root of this problem is the use of natural language that is typical for the credit rating process. This problem can be solved using fuzzy logic, which enables users to model the meaning of natural language words. Therefore, the fuzzy rule-based system adapted by a feed-forward neural network is designed to classify US companies (divided into the finance, manufacturing, mining, retail trade, services, and transportation industries) and municipalities into the credit rating classes obtained from rating agencies. Features are selected using a filter combined with a genetic algorithm as a search method. The resulting subsets of features confirm the assumption that the rating process is industry-specific (i.e. specific determinants are used for each industry). The results show that the credit rating classes assigned to bond issuers can be classified with high classification accuracy using low numbers of features, membership functions, and if-then rules. The comparison of selected fuzzy rule-based classifiers indicates that it is possible to increase classification performance by using different classifiers for individual industries.

10.
Classification is a main data mining task, which aims at predicting the class label of new input data on the basis of a set of pre-classified samples. Multiple criteria linear programming (MCLP) is used as a classification method in the data mining area; it can separate two or more classes by finding a discriminant hyperplane. Although MCLP performs well on linearly separable data, it is no longer applicable to nonlinearly separable problems. A kernel-based multiple criteria linear programming (KMCLP) model was developed to solve nonlinearly separable problems. In this method, a kernel function is introduced to project the data into a higher-dimensional space in which the data are more likely to be linearly separable. KMCLP performs well in some real applications. However, like other prevalent data mining classifiers, MCLP and KMCLP learn only from training examples. In the traditional machine learning area, there are also classification tasks in which data are classified only by prior knowledge, i.e. expert systems. Some works combine the above two classification principles to overcome the shortcomings of each approach. In this paper, we present our recent work on combining prior knowledge with the MCLP or KMCLP model to solve the problem when the input consists of not only training examples but also prior knowledge. Specifically, how to deal with linear and nonlinear knowledge in the MCLP and KMCLP models is the main concern of this paper. Numerical tests on the above models indicate that they are effective in classifying data with prior knowledge.

11.
A new non-parametric method for discriminant analysis was recently proposed (Sueyoshi, T., 1999. DEA-discriminant analysis in the view of goal programming. European Journal of Operational Research 115, 564–582). The new approach, referred to as “DEA-Discriminant Analysis (DEA-DA)”, is designed to identify the existence of an overlap between two groups and then to determine the group membership of a newly sampled observation. A unique feature of the technique is that it does not assume any discriminant function for group classification. As an extension of that study, this research proposes a new type of DEA-DA, the “Extended DEA-DA”, that overcomes some methodological drawbacks of the original formulation while maintaining its discriminant capabilities. Using a real data set on Japanese banks and a large simulation study, this research confirms that the Extended DEA-DA outperforms conventional linear and nonlinear discriminant analysis techniques.

12.
We propose a generic model for the “weighted voting” aggregation step performed by several methods in supervised classification. Further, we construct an algorithm to enumerate the number of distinct aggregate classifiers that arise in this model. When there are only two classes in the classification problem, we show that a class of functions that arises from aggregate classifiers coincides with the class of self-dual positive threshold Boolean functions.
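The weighted-voting aggregation step for two classes can be written in a few lines. A minimal sketch (toy votes and weights, not the paper's enumeration algorithm): with classes coded as +1/−1, the aggregate classifier is the sign of the weighted sum of the base classifiers' votes.

```python
import numpy as np

def weighted_vote(votes, weights):
    """Weighted-voting aggregation for two classes coded +1/-1: the
    aggregate classifier is sign(sum_i weight_i * vote_i).  With positive
    weights and no ties this is a positive threshold Boolean function,
    and it is self-dual: flipping every base vote flips the aggregate."""
    return int(np.sign(np.dot(weights, votes)))

label = weighted_vote([1, -1, 1], [0.5, 0.2, 0.4])   # weighted sum 0.7
```

The self-duality noted in the abstract is visible directly: negating all three votes above negates the output.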

13.
This research develops classifiers for binary classification problems with interval data, which are widely recognized as difficult to tackle regardless of the field. The proposed classifiers use the ideas and techniques of both quantiles and data envelopment analysis (DEA), and are thus referred to as quantile–DEA classifiers. The classifiers first use the concept of quantiles to generate a desired number of exact-data sets from a training-data set comprising interval data. Then the classifiers adopt the concept and technique of an intersection-form production possibility set in the DEA framework to construct acceptance domains, each corresponding to an exact-data set and thus a quantile. An intersection-form acceptance domain is represented by a linear inequality system, which enables the quantile–DEA classifiers to efficiently discover the groups to which large volumes of data belong. In addition, the quantile feature enables the proposed classifiers not only to help reveal patterns, but also to tell the user the value or significance of these patterns.

14.
This article introduces a classification tree algorithm that can simultaneously reduce tree size, improve class prediction, and enhance data visualization. We accomplish this by fitting a bivariate linear discriminant model to the data in each node. Standard algorithms can produce fairly large tree structures because they employ a very simple node model, wherein the entire partition associated with a node is assigned to one class. We reduce the size of our trees by letting the discriminant models share part of the data complexity. Being themselves classifiers, the discriminant models can also help to improve prediction accuracy. Finally, because the discriminant models use only two predictor variables at a time, their effects are easily visualized by means of two-dimensional plots. Our algorithm does not simply fit discriminant models to the terminal nodes of a pruned tree, as this does not reduce the size of the tree. Instead, discriminant modeling is carried out in all phases of tree growth and the misclassification costs of the node models are explicitly used to prune the tree. Our algorithm is also distinct from the “linear combination split” algorithms that partition the data space with arbitrarily oriented hyperplanes. We use axis-orthogonal splits to preserve the interpretability of the tree structures. An extensive empirical study with real datasets shows that, in general, our algorithm has better prediction power than many other tree or nontree algorithms.

15.
Canonical Forest
We propose a new classification ensemble method named Canonical Forest. The new method uses canonical linear discriminant analysis (CLDA) and bootstrapping to obtain accurate and diverse classifiers that constitute an ensemble. We note that CLDA serves as a linear transformation tool rather than a dimension-reduction tool. Since CLDA finds the transformed space that separates the classes farthest in distribution, classifiers built in this space will be more accurate than those in the original space. To further facilitate the diversity of the classifiers in an ensemble, CLDA is applied only to a partial feature space for each bootstrapped data set. To compare the performance of Canonical Forest with other widely used ensemble methods, we tested them on 29 real or artificial data sets. Canonical Forest performed significantly better in accuracy than the other ensemble methods on most data sets. According to an investigation of the bias–variance decomposition, the success of Canonical Forest can be attributed to variance reduction.
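The CLDA transformation at the heart of this method can be sketched directly (a minimal two-class illustration with toy data, not the full ensemble): the canonical directions are the eigenvectors of Sw⁻¹Sb, where Sw and Sb are the within-class and between-class scatter matrices, and projecting onto them pulls the classes apart.

```python
import numpy as np

def canonical_directions(X, y):
    """Canonical LDA directions: eigenvectors of Sw^{-1} Sb, ordered by
    decreasing eigenvalue.  Used here, as in Canonical Forest, as a
    class-separating linear transformation rather than for dimension
    reduction."""
    mean = X.mean(axis=0)
    Sw = np.zeros((X.shape[1], X.shape[1]))
    Sb = np.zeros_like(Sw)
    for c in np.unique(y):
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        Sw += (Xc - mc).T @ (Xc - mc)                    # within-class scatter
        Sb += len(Xc) * np.outer(mc - mean, mc - mean)   # between-class scatter
    vals, vecs = np.linalg.eig(np.linalg.solve(Sw, Sb))
    order = np.argsort(vals.real)[::-1]
    return vecs.real[:, order]

X0 = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])   # class 0
X1 = np.array([[5.0, 5.0], [6.0, 5.0], [5.0, 6.0]])   # class 1
X = np.vstack([X0, X1])
y = np.array([0, 0, 0, 1, 1, 1])
v = canonical_directions(X, y)[:, 0]   # leading canonical direction
```

Projecting both classes onto `v` separates them completely on this toy data; in Canonical Forest the same transform is computed per bootstrap sample on a random subset of features to keep the base classifiers diverse.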

16.
17.
Feature selection consists of choosing a subset of available features that captures the relevant properties of the data. In supervised pattern classification, a good choice of features is fundamental for building compact and accurate classifiers. In this paper, we develop an efficient feature selection method using the zero-norm l0 in the context of support vector machines (SVMs). The discontinuity of l0 at the origin makes the corresponding optimization problem difficult to solve. To overcome this drawback, we use a robust DC (difference of convex functions) programming approach, which is a general framework for non-convex continuous optimization. We consider an appropriate continuous approximation to l0 such that the resulting problem can be formulated as a DC program. Our DC algorithm (DCA) has finite convergence and requires solving one linear program at each iteration. Computational experiments on standard datasets, including challenging problems from the NIPS 2003 feature selection challenge and gene selection for cancer classification, show that the proposed method is promising: while it suppresses more than 99% of the features, it can still provide good classification. Moreover, the comparative results illustrate the superiority of the proposed approach over standard methods such as classical SVMs and the feature selection concave (FSV) method.
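A standard continuous surrogate for the zero-norm (a minimal sketch; the exponential form and the parameter `alpha` are common choices in this literature, and the paper's exact approximation may differ) replaces the count of nonzero weights with a smooth concave sum that approaches it as `alpha` grows:

```python
import numpy as np

def zero_norm_surrogate(w, alpha=5.0):
    """Concave continuous approximation of the zero-norm:
    sum_i (1 - exp(-alpha * |w_i|)).  Each term is 0 at w_i = 0 and
    approaches 1 as |w_i| grows, so the sum tends to ||w||_0 (the number
    of nonzero components) as alpha increases, while remaining continuous
    at the origin and amenable to DC programming."""
    return float(np.sum(1.0 - np.exp(-alpha * np.abs(w))))
```

Minimizing this surrogate (plus the SVM loss) drives small weights exactly to zero, which is what produces the aggressive feature suppression reported above.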

18.
In this paper, two new algorithms are presented to solve multi-level multi-objective linear programming (ML-MOLP) problems through the fuzzy goal programming (FGP) approach. Membership functions for the defined fuzzy goals of all objective functions at all levels are developed in the model formulation of the problem, as are membership functions for the vectors of fuzzy goals of the decision variables controlled by decision makers at the upper levels. The fuzzy goal programming approach is then used to achieve the highest degree of each membership goal by minimizing the deviational variables, thereby obtaining the most satisfactory solution for all decision makers.
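The membership functions in FGP are typically piecewise linear. A minimal sketch of the usual form for a maximisation-type goal (the linear shape and the tolerance/aspiration parameters are standard FGP conventions, not necessarily the paper's exact construction):

```python
def goal_membership(f, lower, goal):
    """Linear membership function for a maximisation-type fuzzy goal:
    0 at or below the lower tolerance limit `lower`, 1 at or above the
    aspiration level `goal`, and linear in between."""
    if f >= goal:
        return 1.0
    if f <= lower:
        return 0.0
    return (f - lower) / (goal - lower)
```

The FGP model then maximizes these membership degrees (equivalently, minimizes the deviational variables that pull each membership below 1) across all levels and decision makers.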

19.
Multi-dimensional classification aims at finding a function that assigns a vector of class values to a given vector of features. In this paper, this problem is tackled by a general family of models called multi-dimensional Bayesian network classifiers (MBCs). This probabilistic graphical model organizes class and feature variables into three different subgraphs: a class subgraph, a feature subgraph, and a bridge (from classes to features) subgraph. Under the standard 0-1 loss function, the most probable explanation (MPE) must be computed, for which we provide theoretical results both for general MBCs and for MBCs decomposable into maximal connected components. Moreover, when computing the MPE, the vector of class values is covered by following a special ordering (a Gray code). For other loss functions defined in accordance with a decomposable structure, we derive theoretical results on how to minimize the expected loss. Besides these inference issues, the paper presents flexible algorithms for learning MBC structures from data based on filter, wrapper and hybrid approaches. The cardinality of the search space is also given. New performance evaluation metrics adapted from the single-class setting are introduced. Experimental results with three benchmark data sets are encouraging, and they outperform state-of-the-art algorithms for multi-label classification.

20.
Data envelopment analysis (DEA) is widely used to evaluate the relative efficiency of public or private firms. Most DEA models are established by individually maximizing each firm's efficiency ratio in the way most advantageous to that firm. Some scholars have pointed out the interesting relationship between the multiobjective linear programming (MOLP) problem and the DEA problem, and have introduced the common-weight approach to DEA based on MOLP. This paper proposes a new linear programming problem for computing the efficiency of a decision-making unit (DMU). The proposed model differs from traditional and existing multiobjective DEA models in that its objective function is the difference between inputs and outputs instead of the outputs/inputs ratio. An MOLP problem, based on the introduced linear programming problem, is then formulated for the computation of common weights for all DMUs. To be precise, the modified Chebychev distance and the ideal point of MOLP are used to generate common weights. The dual problem of this model is also investigated. Finally, this study presents a case study analysing the R&D efficiency of 10 TFT-LCD companies in Taiwan to illustrate the new approach. Our model demonstrates better performance than the traditional DEA model as well as some of the most important existing multiobjective DEA models.
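For reference, the classical ratio-based model that the paper's difference-form objective departs from can be sketched as a linear program (this is the standard input-oriented CCR multiplier model with illustrative toy data, not the paper's proposed model): maximize the weighted outputs of the DMU under evaluation, normalizing its weighted inputs to one and requiring that no DMU's weighted outputs exceed its weighted inputs.

```python
import numpy as np
from scipy.optimize import linprog

def ccr_efficiency(X, Y, k):
    """Input-oriented CCR multiplier model for DMU k: maximise u.y_k
    subject to v.x_k = 1 and u.y_j - v.x_j <= 0 for every DMU j,
    with nonnegative weights u (outputs) and v (inputs)."""
    n, m = X.shape          # n DMUs, m inputs
    s = Y.shape[1]          # s outputs
    # Decision vector: [v (m), u (s)]; linprog minimises, so negate u.y_k.
    c = np.concatenate([np.zeros(m), -Y[k]])
    A_ub = np.hstack([-X, Y])                         # u.y_j - v.x_j <= 0
    b_ub = np.zeros(n)
    A_eq = np.concatenate([X[k], np.zeros(s)])[None, :]   # v.x_k = 1
    b_eq = np.array([1.0])
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
                  bounds=[(0, None)] * (m + s))
    return -res.fun

# Toy data: two DMUs, one input, one output.  DMU 0 produces the same
# output with half the input, so it is efficient and DMU 1 is not.
X = np.array([[2.0], [4.0]])
Y = np.array([[2.0], [2.0]])
```

Because each DMU is scored with its own most favourable weights, the scores are not directly comparable across DMUs; that is the motivation for the common-weight MOLP approach the paper builds on.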
