Similar literature
 20 similar articles found (search time: 15 ms)
1.
A bootstrap-based aggregate classifier for model-based clustering   Total citations: 1 (self-citations: 0, citations by others: 1)
In model-based clustering, a setting in which the true class labels are unknown and which is therefore also referred to as unsupervised learning, observations are typically classified by the Bayes modal rule. In this study, we assess whether alternative classifiers from the classification or supervised-learning literature, developed for situations in which class labels are known, can improve on the Bayes rule. More specifically, we investigate the performance of bootstrap-based aggregate (bagging) rules after adapting these to the model-based clustering context. It is argued that specific issues, such as the label-switching problem, have to be carefully addressed when using bootstrap methods in model-based clustering. Our two Monte Carlo studies show that classification based on the Bayes rule is rather stable and difficult to improve by bootstrap-based aggregate rules, even for sparse data. An empirical example illustrates the various approaches described in this paper.
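
The Bayes modal rule referred to in this abstract assigns each observation to the mixture component with the largest posterior probability. A minimal Python sketch, assuming a one-dimensional two-component Gaussian mixture; the weights, means and standard deviations are made-up illustrative values, not parameters from the paper:

```python
import math

def normal_pdf(x, mean, sd):
    """Density of a univariate normal distribution."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def posteriors(x, weights, means, sds):
    """Posterior membership probabilities P(component k | x)."""
    joint = [w * normal_pdf(x, m, s) for w, m, s in zip(weights, means, sds)]
    total = sum(joint)
    return [j / total for j in joint]

def bayes_modal_rule(x, weights, means, sds):
    """Index of the component with the highest posterior probability."""
    post = posteriors(x, weights, means, sds)
    return max(range(len(post)), key=post.__getitem__)

# Hypothetical fitted parameters, for illustration only.
WEIGHTS, MEANS, SDS = [0.5, 0.5], [-2.0, 2.0], [1.0, 1.0]
```

Because component indices are arbitrary, a bagging variant would first have to align the labels across bootstrap replicates before aggregating votes; this is the label-switching issue the abstract mentions.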

2.
Mathematical programming (MP) discriminant analysis models can be used to develop classification models for assigning observations of unknown class membership to one of a number of specified classes using values of a set of features associated with each observation. Since most MP discriminant analysis models generate linear discriminant functions, these MP models are generally used to develop linear classification models. Nonlinear classifiers may, however, have better classification performance than linear classifiers. In this paper, a mixed integer programming model is developed to generate nonlinear discriminant functions composed of monotone piecewise-linear marginal utility functions for each feature and the cut-off value for class membership. It is also shown that this model can be extended for feature selection. The performance of this new MP model for two-group discriminant analysis is compared with statistical discriminant analysis and other MP discriminant analysis models using a real problem and a number of simulated problem sets.

3.
This article concludes the examination of reliability estimates of classification algorithms. It reviews the statistical methods used for interval estimation of the reliability of classifiers in the frequentist and Bayesian approaches. “Hybrid” estimates combining both approaches are also considered. These estimates are particularly important as they are applicable to the small-sample case. Translated from Prikladnaya Matematika i Informatika, No. 17, pp. 112–128, 2004.

4.
In machine learning problems, the availability of several classifiers trained on different data or features makes the combination of pattern classifiers of great interest. To combine distinct sources of information, the outputs of the classifiers must be represented in a common space via a transformation called calibration. The most classical approach uses class membership probabilities. However, a single probability measure may be insufficient to model the uncertainty induced by the calibration step, especially when training data are scarce. In this paper, we extend classical probabilistic calibration methods to the evidential framework. Experimental results from the calibration of SVM classifiers demonstrate the value of using belief functions in classification problems.

5.
Multi-label classification problems require each instance to be assigned a subset of a defined set of labels. This problem is equivalent to finding a multi-valued decision function that predicts a vector of binary classes. In this paper we study the decision boundaries of two widely used approaches for building multi-label classifiers, when Bayesian network-augmented naive Bayes classifiers are used as base models: the binary relevance method and chain classifiers. In particular, extending previous single-label results to multi-label chain classifiers, we find polynomial expressions for the multi-valued decision functions associated with these methods. We prove upper bounds on the expressive power of both methods, and we prove that chain classifiers provide a more expressive model than the binary relevance method.

6.
The support vector machine (SVM) is one of the most popular classification methods in the machine learning literature. Binary SVM methods have been extensively studied, and have achieved many successes in various disciplines. However, generalization to multicategory SVM (MSVM) methods can be very challenging. Many existing methods estimate k functions for k classes with an explicit sum-to-zero constraint. It was shown recently that such a formulation can be suboptimal. Moreover, many existing MSVMs are not Fisher consistent, or do not take into account the effect of outliers. In this paper, we focus on classification in the angle-based framework, which is free of the explicit sum-to-zero constraint, hence more efficient, and propose two robust MSVM methods using truncated hinge loss functions. We show that our new classifiers can enjoy Fisher consistency, and simultaneously alleviate the impact of outliers to achieve more stable classification performance. To implement our proposed classifiers, we employ the difference convex algorithm for efficient computation. Theoretical and numerical results obtained indicate that for problems with potential outliers, our robust angle-based MSVMs can be very competitive among existing methods.
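
To illustrate why truncation limits the influence of outliers and why a difference-of-convex algorithm is a natural fit, here is a sketch of one common form of the truncated hinge loss, T_s(u) = H_1(u) - H_s(u) with H_s(u) = max(0, s - u) and s <= 0. This binary-margin form and the default s = -1 are assumptions made for illustration; the angle-based multicategory losses in the paper may be defined differently.

```python
def hinge(u, s=1.0):
    """Shifted hinge function H_s(u) = max(0, s - u); convex in u."""
    return max(0.0, s - u)

def truncated_hinge(u, s=-1.0):
    """Truncated hinge T_s(u) = H_1(u) - H_s(u), a difference of two convex functions.

    For u >= s it equals the ordinary hinge loss; for u < s it is capped at 1 - s,
    so a badly misclassified point cannot contribute an unbounded loss.
    """
    if s > 0:
        raise ValueError("truncation point s is assumed to be <= 0")
    return hinge(u, 1.0) - hinge(u, s)
```

With the default truncation point, a point with functional margin -100 costs 2 under the truncated loss but 101 under the ordinary hinge loss.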

7.
We consider weighted o-minimal hybrid systems, which extend classical o-minimal hybrid systems with cost functions. These cost functions are “observer variables” which increase while the system evolves but do not constrain the behaviour of the system. In this paper, we prove two main results: (i) optimal o-minimal hybrid games are decidable; (ii) the model-checking of WCTL, an extension of CTL which can constrain the cost variables, is decidable over that model. This has to be compared with the same problems in the framework of timed automata where both problems are undecidable in general, while they are decidable for the restricted class of one-clock timed automata.

8.
The problem of recognition (classification) by precedents is considered. Issues of improving the recognition ability and the training rate of logical correctors, i.e., the recognition procedures based on the construction of correct sets of elementary classifiers, are studied. The concept of a correct set of generic elementary classifiers is introduced and used to construct and investigate a qualitatively new model of the logical corrector. This model uses a wider class of correcting functions than in the earlier constructed models of logical correctors.

9.
Combining multiple classifiers, known as ensemble methods, can give substantial improvement in the prediction performance of learning algorithms, especially in the presence of non-informative features in the data sets. We propose an ensemble of a subset of kNN classifiers, ESkNN, for classification tasks, built in two steps. First, we choose classifiers based upon their individual performance using the out-of-sample accuracy. The selected classifiers are then combined sequentially, starting from the best model, and assessed for collective performance on a validation data set. We use benchmark data sets with their original and some added non-informative features for the evaluation of our method. The results are compared with usual kNN, bagged kNN, random kNN, the multiple feature subset method, random forest and support vector machines. Our experimental comparisons on benchmark classification problems and simulated data sets reveal that the proposed ensemble gives better classification performance than the usual kNN and its ensembles, and performs comparably to random forest and support vector machines.
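
The two-step selection described in this abstract can be sketched roughly as follows. This is not the authors' implementation: the use of random feature subsets as candidate models, the rule "keep a model only if validation accuracy strictly improves", the toy data generator, and all function names are assumptions made for illustration.

```python
import random
from collections import Counter

def knn_predict(train_X, train_y, x, features, k=3):
    """Majority vote among the k nearest training points, using only `features`."""
    nearest = sorted(
        (sum((row[f] - x[f]) ** 2 for f in features), label)
        for row, label in zip(train_X, train_y)
    )[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def vote_accuracy(subsets, train, data, k=3):
    """Accuracy on `data` of the majority vote over kNN models, one per feature subset."""
    (train_X, train_y), (X, y) = train, data
    hits = 0
    for x, target in zip(X, y):
        votes = Counter(knn_predict(train_X, train_y, x, fs, k) for fs in subsets)
        hits += votes.most_common(1)[0][0] == target
    return hits / len(y)

def select_ensemble(train, test, valid, n_models=15, subset_size=2, seed=0):
    """Two-step selection: rank candidates individually, then add them greedily."""
    rng = random.Random(seed)
    n_features = len(train[0][0])
    candidates = [tuple(rng.sample(range(n_features), subset_size))
                  for _ in range(n_models)]
    # Step 1: rank candidate models by individual out-of-sample accuracy.
    ranked = sorted(candidates,
                    key=lambda fs: vote_accuracy([fs], train, test),
                    reverse=True)
    # Step 2: start from the best model; keep a further model only if it
    # improves the ensemble's accuracy on the validation set (assumed rule).
    ensemble = [ranked[0]]
    best = vote_accuracy(ensemble, train, valid)
    for fs in ranked[1:]:
        score = vote_accuracy(ensemble + [fs], train, valid)
        if score > best:
            ensemble.append(fs)
            best = score
    return ensemble, best

def make_data(n, rng, n_noise=4):
    """Toy data: feature 0 carries the class signal, the rest are pure noise."""
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        X.append([rng.gauss(3.0 * label, 1.0)]
                 + [rng.gauss(0.0, 1.0) for _ in range(n_noise)])
        y.append(label)
    return X, y

# Usage on hypothetical toy data: three independent samples play the roles
# of training, out-of-sample test, and validation sets.
rng = random.Random(1)
train, test, valid = (make_data(40, rng) for _ in range(3))
ensemble, valid_acc = select_ensemble(train, test, valid)
```

Ranking first and adding greedily keeps the number of ensemble evaluations linear in the number of candidates, at the cost of possibly missing combinations of individually weak but complementary models.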

10.
In formal scattering theory, the Green functions are obtained as solutions of a distributional equation. In this paper, we use the Sturm–Liouville theory to compute the Green functions within a rigorous mathematical theory. We shall show that both the Sturm–Liouville theory and the formal treatment yield the same Green functions. We shall also show how the analyticity of the Green functions as functions of the energy keeps track of the so-called “incoming” and “outgoing” boundary conditions.

11.
Hidden Markov models are used as tools for pattern recognition in a number of areas, ranging from speech processing to biological sequence analysis. Profile hidden Markov models represent a class of so-called “left–right” models that have an architecture that is specifically relevant to classification of proteins into structural families based on their amino acid sequences. Standard learning methods for such models employ a variety of heuristics applied to the expectation-maximization implementation of the maximum likelihood estimation procedure in order to find the global maximum of the likelihood function. Here, we compare maximum likelihood estimation to fully Bayesian estimation of parameters for profile hidden Markov models with a small number of parameters. We find that, relative to maximum likelihood methods, Bayesian methods assign higher scores to data sequences that are distantly related to the pattern consensus, show better performance in classifying these sequences correctly, and continue to perform robustly with regard to misspecification of the number of model parameters. Though our study is limited in scope, we expect our results to remain relevant for models with a large number of parameters and other types of left–right hidden Markov models.

12.
A unified presentation of classical clustering algorithms is proposed for both the hard and the fuzzy pattern classification problems. Based on two types of objective functions, a new method is presented and compared with the procedures of Dunn and Ruspini. In order to determine the best, or most natural, number of fuzzy clusters, two coefficients that measure the “degree of non-fuzziness” of the partition are proposed. Numerous computational results are shown.

13.
A model is developed for multivariate distributions which have nearly the same marginals, up to shift and scale. This model, based on “interpolation” of characteristic functions, gives a new notion of “correlation”. It allows straightforward nonparametric estimation of the common marginal distribution, which avoids the “curse of dimensionality” present when nonparametrically estimating the full multivariate distribution. The method is illustrated with environmental monitoring network data, where multivariate modelling with common marginals is often appropriate.

14.
Multi-dimensional classification aims at finding a function that assigns a vector of class values to a given vector of features. In this paper, this problem is tackled by a general family of models, called multi-dimensional Bayesian network classifiers (MBCs). This probabilistic graphical model organizes class and feature variables as three different subgraphs: class subgraph, feature subgraph, and bridge (from class to features) subgraph. Under the standard 0-1 loss function, the most probable explanation (MPE) must be computed, for which we provide theoretical results both in general MBCs and in MBCs decomposable into maximal connected components. Moreover, when computing the MPE, the vector of class values is covered by following a special ordering (Gray code). Under other loss functions defined in accordance with a decomposable structure, we derive theoretical results on how to minimize the expected loss. Besides these inference issues, the paper presents flexible algorithms for learning MBC structures from data based on filter, wrapper and hybrid approaches. The cardinality of the search space is also given. New performance evaluation metrics adapted from the single-class setting are introduced. Experimental results with three benchmark data sets are encouraging: the proposed approach outperforms state-of-the-art algorithms for multi-label classification.
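
A Gray-code ordering visits every joint configuration of the class variables while changing a single variable between consecutive configurations, presumably so that only the terms involving that variable need to be recomputed during the MPE search. A minimal sketch for binary class variables; the binary restriction and the function name are assumptions made here, and the paper's class variables may take more than two values:

```python
def gray_code_vectors(d):
    """Yield all 2**d binary class vectors in binary-reflected Gray-code order."""
    for i in range(2 ** d):
        g = i ^ (i >> 1)  # standard binary-reflected Gray code of i
        # Unpack the bits of g, most significant first, into a class vector.
        yield tuple((g >> bit) & 1 for bit in reversed(range(d)))
```

For two class variables the order is (0, 0), (0, 1), (1, 1), (1, 0): each step flips exactly one class value.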

15.
We extend some of the classical connections between automata and logic due to Büchi (1960) [5] and McNaughton and Papert (1971) [12] to languages of finitely varying functions or “signals”. In particular, we introduce a natural class of automata for generating finitely varying functions called ’s, and show that it coincides in terms of language definability with a natural monadic second-order logic interpreted over finitely varying functions Rabinovich (2002) [15]. We also identify a “counter-free” subclass of ’s which characterise the first-order definable languages of finitely varying functions. Our proofs mainly factor through the classical results for word languages. These results have applications in automata characterisations for continuously interpreted real-time logics like Metric Temporal Logic (MTL) Chevalier et al. (2006, 2007) [6] and [7].

16.
We propose two methods for tuning membership functions of a kernel fuzzy classifier based on the idea of SVM (support vector machine) training. We assume that in a kernel fuzzy classifier a fuzzy rule is defined for each class in the feature space. In the first method, we tune the slopes of the membership functions at the same time so that the margin between classes is maximized under the constraints that the degree of membership to which a data sample belongs is the maximum among all the classes. This method is similar to a linear all-at-once SVM. We call this AAO tuning. In the second method, we tune the membership function of a class one at a time. Namely, for a class the slope of the associated membership function is tuned so that the margin between the class and the remaining classes is maximized under the constraints that the degrees of membership for the data belonging to the class are large and those for the remaining data are small. This method is similar to a linear one-against-all SVM. This is called OAA tuning. According to computer experiments with fuzzy classifiers based on kernel discriminant analysis and those with ellipsoidal regions, both methods usually improve classification performance by tuning membership functions, and classification performance with AAO tuning is slightly better than with OAA tuning.

17.
A functional distance \({\mathbb H}\), based on the Hausdorff metric between the function hypographs, is proposed for the space \({\mathcal E}\) of non-negative real upper semicontinuous functions on a compact interval. The main goal of the paper is to show that the space \(({\mathcal E},{\mathbb H})\) is particularly suitable in some statistical problems with functional data which involve functions with very wiggly graphs and narrow, sharp peaks. A typical example is given by spectrograms, either obtained by magnetic resonance or by mass spectrometry. On the theoretical side, we show that \(({\mathcal E},{\mathbb H})\) is a complete, separable locally compact space and that the \({\mathbb H}\)-convergence of a sequence of functions implies the convergence of the respective maximum values of these functions. The probabilistic and statistical implications of these results are discussed, in particular regarding the consistency of k-NN classifiers for supervised classification problems with functional data in \({\mathbb H}\). On the practical side, we provide the results of a small simulation study and check also the performance of our method in two real data problems of supervised classification involving mass spectra.

18.
In Bayesian analysis it is usual to assume that the risk profiles Θ1 and Θ2 associated with the random variables “number of claims” and “amount of a single claim”, respectively, are independent. A few studies have addressed a model of this nature assuming some degree of dependence between the two random variables (and most of these studies include copulas). In this paper, we focus on the collective and Bayes net premiums for the aggregate amount of claims under a compound model assuming some degree of dependence between the random variables Θ1 and Θ2. The degree of dependence is modelled using the Sarmanov–Lee family of distributions [Sarmanov, O.V., 1966. Generalized normal correlation and two-dimensional Frechet classes. Doklady (Soviet Mathematics) 168, 596–599 and Ting-Lee, M.L., 1996. Properties and applications of the Sarmanov family of bivariate distributions. Communications Statistics: Theory and Methods 25 (6) 1207–1222], which allows us to study the impact of this assumption on the collective and Bayes net premiums. The results obtained show that a low degree of correlation produces Bayes premiums that are highly sensitive.

19.
Large “O” and small “o” approximations of the expected value of a class of smooth functions (\(f \in C^r(\mathbb{R})\)) of the normalized partial sums of dependent random variables by the expectation of the corresponding functions of normal random variables are established. The same types of approximations are also obtained for dependent random vectors. The technique used is the Lindeberg–Lévy method, generalized by Dvoretzky to dependent random variables.

20.
In this paper, we study the performance of various state-of-the-art classification algorithms applied to eight real-life credit scoring data sets. Some of the data sets originate from major Benelux and UK financial institutions. Different types of classifiers are evaluated and compared. Besides the well-known classification algorithms (e.g., logistic regression, discriminant analysis, k-nearest neighbour, neural networks and decision trees), this study also investigates the suitability and performance of some recently proposed, advanced kernel-based classification algorithms such as support vector machines and least-squares support vector machines (LS-SVMs). The performance is assessed using the classification accuracy and the area under the receiver operating characteristic curve. Statistically significant performance differences are identified using the appropriate test statistics. It is found that both the LS-SVM and neural network classifiers yield very good performance, but that simple classifiers such as logistic regression and linear discriminant analysis also perform very well for credit scoring.
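
The two performance measures named in this abstract can be computed as in the following sketch. It assumes binary labels coded 0/1 and computes the area under the ROC curve through its rank interpretation (the probability that a randomly chosen positive case scores higher than a randomly chosen negative one, counting ties as one half); the function names and the toy scores in the usage note are illustrative, not taken from the study.

```python
def accuracy(y_true, y_pred):
    """Fraction of cases whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def auc(y_true, scores):
    """Area under the ROC curve via pairwise comparison of positive and negative scores."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    # Each (positive, negative) pair counts 1 if ranked correctly, 0.5 if tied.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

For labels [0, 0, 1, 1] with scores [0.1, 0.4, 0.35, 0.8], three of the four positive-negative pairs are ranked correctly, so the AUC is 0.75. Unlike accuracy, the AUC does not depend on a classification threshold, which matters in credit scoring where the cut-off is often a business decision.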
