Similar Documents
20 similar documents found.
1.
The $k$-Nearest Neighbour classifier is widely used and popular due to its inherent simplicity and the avoidance of model assumptions. Although the approach has been shown to yield a near-optimal classification performance for an infinite number of samples, a selection of the most decisive data points can improve the classification accuracy considerably in real settings with a limited number of samples. At the same time, a selection of a subset of representative training samples reduces the required amount of storage and computational resources. We devised a new approach that selects a representative training subset on the basis of an evolutionary optimization procedure. This method chooses those training samples that have a strong influence on the correct prediction of other training samples, in particular those that have uncertain labels. The performance of the algorithm is evaluated on different data sets. Additionally, we provide graphical examples of the selection procedure.
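The core loop of such an approach is easy to sketch. Below is a minimal, illustrative Python sketch of evolutionary instance selection for $k$-NN, assuming a simplified fitness (accuracy of a classifier fit on the selected subset when predicting the full training set); all names and GA settings are hypothetical, not the authors' implementation.

```python
# Sketch: evolutionary selection of a representative k-NN training subset.
# Hypothetical simplification: fitness = accuracy of a k-NN classifier
# fit on the selected subset when predicting the whole training set.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def fitness(mask, X, y, k=3):
    if mask.sum() <= k:                  # need at least k prototypes
        return 0.0
    knn = KNeighborsClassifier(n_neighbors=k).fit(X[mask], y[mask])
    return knn.score(X, y)               # how well the subset predicts everyone

def evolve_subset(X, y, pop_size=30, generations=50, p_mut=0.02, seed=0):
    rng = np.random.default_rng(seed)
    n = len(y)
    pop = rng.random((pop_size, n)) < 0.5            # random binary masks
    for _ in range(generations):
        scores = np.array([fitness(m, X, y) for m in pop])
        # tournament selection: keep winners of random pairings
        i, j = rng.integers(pop_size, size=(2, pop_size))
        parents = np.where((scores[i] >= scores[j])[:, None], pop[i], pop[j])
        # uniform crossover + bit-flip mutation
        cut = rng.random((pop_size, n)) < 0.5
        children = np.where(cut, parents, parents[rng.permutation(pop_size)])
        children ^= rng.random((pop_size, n)) < p_mut
        pop = children
    scores = np.array([fitness(m, X, y) for m in pop])
    return pop[scores.argmax()]                       # best mask found
```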

2.
Credal nets are probabilistic graphical models which extend Bayesian nets to cope with sets of distributions. An algorithm for approximate credal network updating is presented. The problem in its general formulation is a multilinear optimization task, which can be linearized by an appropriate rule for fixing all the local models apart from those of a single variable. This simple idea can be iterated and quickly leads to accurate inferences. A transformation is also derived to reduce decision making in credal networks based on the maximality criterion to updating. The decision task is proved to have the same complexity as standard inference, being NP$^{\text{PP}}$-complete for general credal nets and NP-complete for polytrees. Similar results are derived for the E-admissibility criterion. Numerical experiments confirm the good performance of the method.
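To make the maximality criterion concrete: given a finite set of extreme points of the credal set, an act is maximal if no other act has strictly higher expected utility under every distribution. The following is an illustrative numpy sketch of that definition (not the paper's reduction algorithm); the matrices are hypothetical.

```python
# Sketch: maximal acts under a credal set approximated by its extreme points.
# U[a, s] = utility of act a in state s; P[p, s] = probability of state s
# under extreme point p. Illustrative only.
import numpy as np

def maximal_acts(U, P):
    E = U @ P.T                     # E[a, p] = expected utility of act a under p
    n_acts = U.shape[0]
    keep = []
    for a in range(n_acts):
        # b strictly dominates a iff E[b, p] > E[a, p] for every p
        dominated = any((E[b] > E[a]).all() for b in range(n_acts) if b != a)
        if not dominated:
            keep.append(a)
    return keep

# Example: two extreme points over two states, three acts.
U = np.array([[1.0, 0.0], [0.0, 1.0], [0.4, 0.4]])
P = np.array([[0.8, 0.2], [0.3, 0.7]])
print(maximal_acts(U, P))
# prints [0, 1, 2]: act 2 is maximal (never uniformly beaten), yet it is not
# E-admissible, since it is optimal for no single distribution in the set --
# exactly the gap between the two criteria mentioned in the abstract.
```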

3.
Supervised classification learning can be considered an important tool for decision support. In this paper, we present a method for supervised classification learning which ensembles decision trees obtained via convex sets of probability distributions (also called credal sets) and uncertainty measures. Our method forces the use of different decision trees and has mainly the following characteristics: it obtains a good percentage of correct classifications and an improvement in processing time compared with known classification methods; it does not need to fix the number of decision trees to be used; and it can be parallelized for application to very large data sets.

4.
Bayesian networks (BNs) provide a powerful graphical model for encoding the probabilistic relationships among a set of variables, and hence can naturally be used for classification. However, Bayesian network classifiers (BNCs) learned in the common way using likelihood scores usually tend to achieve only mediocre classification accuracy because these scores are less specific to classification, but rather suit a general inference problem. We propose risk minimization by cross validation (RMCV) using the 0/1 loss function, which is a classification-oriented score for unrestricted BNCs. RMCV is an extension of classification-oriented scores commonly used in learning restricted BNCs and non-BN classifiers. Using small real and synthetic problems that allow learning all possible graphs, we empirically demonstrate the superiority of RMCV to marginal and class-conditional likelihood-based scores with respect to classification accuracy. Experiments using twenty-two real-world datasets show that BNCs learned using an RMCV-based algorithm significantly outperform the naive Bayesian classifier (NBC), tree-augmented NBC (TAN), and other BNCs learned using marginal or conditional likelihood scores, and are on par with state-of-the-art non-BN classifiers such as the support vector machine, neural network, and classification tree. These experiments also show that an optimized version of RMCV is faster than all unrestricted BNCs and comparable with the neural network with respect to run-time. The main conclusion from our experiments is that unrestricted BNCs, when learned properly, can be a good alternative to restricted BNCs and traditional machine-learning classifiers with respect to both accuracy and efficiency.
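The scoring idea itself is generic: rank candidate models by their cross-validated 0/1 risk rather than by likelihood. A minimal sketch, with a generic sklearn estimator standing in for a candidate network structure (an assumption for illustration; the paper scores BN graphs):

```python
# Sketch: cross-validated 0/1 risk as a classification-oriented score.
# In RMCV this would rank candidate BN structures; here any sklearn
# classifier plays the role of a candidate model.
import numpy as np
from sklearn.base import clone
from sklearn.model_selection import StratifiedKFold

def rmcv_score(model, X, y, folds=5, seed=0):
    cv = StratifiedKFold(n_splits=folds, shuffle=True, random_state=seed)
    losses = []
    for train, test in cv.split(X, y):
        fitted = clone(model).fit(X[train], y[train])
        losses.append(np.mean(fitted.predict(X[test]) != y[test]))  # 0/1 loss
    return np.mean(losses)   # lower is better: pick the candidate minimizing this
```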

5.
An investigation of how influential observations affect the metrics and predictivity of multiple linear regressions on a set of phenolic compounds with toxicity on Tetrahymena pyriformis is presented. The investigation of influential observations was conducted using standardized residuals (ri-model) and Cook's distance (Di-model) approaches. The applied approaches led to improvements in the models' metrics, robustness, and accuracy on the investigated sample. Overall, the ri-model showed higher accuracy and robustness in terms of sensitivity, while the Di-model showed robustness in terms of specificity. Characterization of the withdrawn compounds is essential for advances in developing models for the toxicity of phenols.
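Both diagnostics are standard and easy to compute. A plain-numpy sketch follows; the cutoffs in the final comment are the usual rules of thumb, not values taken from the paper.

```python
# Sketch: flagging influential observations in OLS via standardized
# residuals (ri-model) and Cook's distance (Di-model).
import numpy as np

def influence_measures(X, y):
    X1 = np.column_stack([np.ones(len(y)), X])      # add intercept column
    H = X1 @ np.linalg.pinv(X1.T @ X1) @ X1.T       # hat (projection) matrix
    h = np.diag(H)                                  # leverages
    e = y - H @ y                                   # residuals
    p = X1.shape[1]
    s2 = e @ e / (len(y) - p)                       # residual variance
    r = e / np.sqrt(s2 * (1 - h))                   # standardized residuals
    D = r**2 * h / (p * (1 - h))                    # Cook's distance
    return r, D

# Usual flags: |r_i| > 2 for the ri-model, D_i > 4/n for the Di-model.
```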

6.
Plane autonomous state classifiers are defined and characterized. The nonempty class of C-systems is proved to be contained in the class of plane autonomous state classifiers. Plane autonomous state classifiers are considered as the generalization, to nonlinear systems, of the concept of saddle point.

7.
A family of classification algorithms generated from Tikhonov regularization schemes is considered. They involve multi-kernel spaces and general convex loss functions. Our main purpose is to provide satisfactory estimates for the excess misclassification error of these multi-kernel regularized classifiers when the loss functions achieve the zero value. The error analysis consists of two parts: regularization error and sample error. Allowing multi-kernels in the algorithm improves the regularization error and approximation error, which is one advantage of the multi-kernel setting. For a general loss function, we show how to bound the regularization error by the approximation in some weighted $L^q$ spaces. For the sample error, we use a projection operator. The projection, in connection with the decay of the regularization error, enables us to improve the convergence rates in the literature even for the one-kernel schemes and special loss functions: the least-squares loss and the hinge loss for support vector machine soft margin classifiers. Existence of the optimization problem for the regularization scheme associated with multi-kernels is verified when the kernel functions are continuous with respect to the index set. Concrete examples, including Gaussian kernels with flexible variances and probability distributions with some noise conditions, are used to illustrate the general theory.

8.
In this research, a robust optimization approach applied to multiclass support vector machines (SVMs) is investigated. Two new kernel-based methods are developed to address data with input uncertainty, where each data point lies inside a sphere of uncertainty. The models are called the robust SVM (Robust-SVM) and the robust feasibility approach model (Robust-FA), respectively. The two models are compared in terms of robustness and generalization error. The models are also compared to the robust Minimax Probability Machine (MPM) in terms of generalization behavior on several data sets. It is shown that the Robust-SVM performs better than the robust MPM.
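A common way to formalize spherical input uncertainty in the binary case is to require the margin constraint to hold in the worst case over each ball, which shrinks the margin by $r\lVert w\rVert$. The cvxpy sketch below shows that standard formulation under stated assumptions; the paper's multiclass models differ in detail.

```python
# Sketch: binary soft-margin SVM robust to spherical input uncertainty.
# Each training point may move within a ball of radius r, so we require
# the worst case: y_i (w.x_i + b) - r * ||w||_2 >= 1 - xi_i.
import cvxpy as cp
import numpy as np

def robust_svm(X, y, r=0.1, C=1.0):
    # X: (n, d) array; y: (n,) array of +1/-1 labels
    n, d = X.shape
    w, b = cp.Variable(d), cp.Variable()
    xi = cp.Variable(n, nonneg=True)                # slack variables
    margins = cp.multiply(y, X @ w + b)             # y_i (w.x_i + b)
    constraints = [margins - r * cp.norm(w, 2) >= 1 - xi]
    objective = cp.Minimize(0.5 * cp.sum_squares(w) + C * cp.sum(xi))
    cp.Problem(objective, constraints).solve()
    return w.value, b.value
```

Setting r = 0 recovers the ordinary soft-margin SVM, which makes the role of the uncertainty radius easy to study empirically.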

9.
This paper proposes two new algorithms for inference in credal networks. These algorithms enable probability intervals to be obtained for the states of a given query variable. The first algorithm is approximate and uses the hill-climbing technique in the Shenoy–Shafer architecture to propagate in join trees; the second is exact and is a modification of Rocha and Cozman’s branch-and-bound algorithm, but applied to general directed acyclic graphs.

10.
We focus on credal nets, which are graphical models that generalise Bayesian nets to imprecise probability. We replace the notion of strong independence commonly used in credal nets with the weaker notion of epistemic irrelevance, which is arguably more suited for a behavioural theory of probability. Focusing on directed trees, we show how to combine the given local uncertainty models in the nodes of the graph into a global model, and we use this to construct and justify an exact message-passing algorithm that computes updated beliefs for a variable in the tree. The algorithm, which is linear in the number of nodes, is formulated entirely in terms of coherent lower previsions, and is shown to satisfy a number of rationality requirements. We supply examples of the algorithm’s operation, and report an application to on-line character recognition that illustrates the advantages of our approach for prediction. We comment on the perspectives opened by the availability, for the first time, of a truly efficient algorithm based on epistemic irrelevance.

11.
We propose two methods for tuning the membership functions of a kernel fuzzy classifier based on the idea of SVM (support vector machine) training. We assume that in a kernel fuzzy classifier a fuzzy rule is defined for each class in the feature space. In the first method, we tune the slopes of all the membership functions at the same time so that the margin between classes is maximized, under the constraints that each data sample's degree of membership to its own class is the maximum among all the classes. This method is similar to a linear all-at-once SVM; we call this AAO tuning. In the second method, we tune the membership function of one class at a time. Namely, for a given class the slope of the associated membership function is tuned so that the margin between that class and the remaining classes is maximized, under the constraints that the degrees of membership for the data belonging to the class are large and those for the remaining data are small. This method is similar to a linear one-against-all SVM; it is called OAA tuning. In computer experiments with fuzzy classifiers based on kernel discriminant analysis and with those based on ellipsoidal regions, both methods usually improve classification performance by tuning the membership functions, and the performance achieved by AAO tuning is slightly better than that achieved by OAA tuning.

12.
Fuzzy rough sets, generalized from Pawlak's rough sets, were introduced for dealing with continuous or fuzzy data. This model has been widely discussed and applied in recent years. It is shown that the model of fuzzy rough sets is sensitive to noisy samples, especially to mislabeled samples. As data are usually contaminated with noise in practice, a robust model is desirable. We introduce a new fuzzy rough set model, called soft fuzzy rough sets, and design a robust classification algorithm based on the model. Experimental results show the effectiveness of the proposed algorithm.
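The noise sensitivity is visible directly in the classical approximation operators: the lower approximation takes a hard infimum over all samples, so a single mislabeled neighbour can collapse it. A numpy sketch of the classical operators, assuming a Gaussian similarity relation (the paper's soft variant replaces the hard min/max with noise-tolerant statistics):

```python
# Sketch: classical fuzzy rough lower/upper approximations.
# lower(x) = min_y max(1 - R(x, y), A(y));  upper(x) = max_y min(R(x, y), A(y)),
# where R is a fuzzy similarity relation and A the fuzzy class membership.
import numpy as np

def fuzzy_rough_approx(X, member, sigma=1.0):
    # R[x, y]: Gaussian similarity between samples x and y (an assumption here)
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    R = np.exp(-d2 / (2 * sigma**2))
    lower = np.maximum(1 - R, member[None, :]).min(axis=1)
    upper = np.minimum(R, member[None, :]).max(axis=1)
    return lower, upper

# One mislabeled sample y close to x (R(x,y) near 1, A(y) = 0) drags
# lower(x) toward 0 -- the sensitivity the soft model is designed to fix.
```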

13.
Advances in Data Analysis and Classification - We obtain a decomposition of any quadratic classifier in terms of products of hyperplanes. These hyperplanes can be viewed as relevant linear...

14.
Credal networks generalize Bayesian networks by relaxing the requirement of precision of probabilities. Credal networks are considerably more expressive than Bayesian networks, but this makes belief updating NP-hard even on polytrees. We develop a new efficient algorithm for approximate belief updating in credal networks. The algorithm is based on an important representation result we prove for general credal networks: that any credal network can be equivalently reformulated as a credal network with binary variables; moreover, the transformation, which is considerably more complex than in the Bayesian case, can be implemented in polynomial time. The equivalent binary credal network is then updated by L2U, a loopy approximate algorithm for binary credal networks. Overall, we generalize L2U to non-binary credal networks, obtaining a scalable algorithm for the general case, which is approximate only because of its loopy nature. The accuracy of the inferences with respect to other state-of-the-art algorithms is evaluated by extensive numerical tests.

15.
Streaming data are relevant to finance, computer science, and engineering, and they are becoming increasingly important to medicine and biology. Continuous time Bayesian network classifiers are designed for analyzing multivariate streaming data in which the duration of events matters. Structural and parametric learning for the class of continuous time Bayesian network classifiers is considered in the case where complete data are available. Conditional log-likelihood scoring is developed for structural learning of continuous time Bayesian network classifiers. The performance of continuous time Bayesian network classifiers learned by combining conditional log-likelihood scoring and Bayesian parameter estimation is compared with that achieved by continuous time Bayesian network classifiers whose learning is based on marginal log-likelihood scoring, and with that achieved by dynamic Bayesian network classifiers. The classifiers are compared in terms of accuracy and computation time. The comparison is based on numerical experiments using synthetic and real data. Results show that conditional log-likelihood scoring combined with Bayesian parameter estimation outperforms marginal log-likelihood scoring. Conditional log-likelihood scoring becomes even more effective when the amount of available data is limited. Continuous time Bayesian network classifiers outperform dynamic Bayesian network classifiers in terms of computation time and accuracy on synthetic and real data sets.

16.
In machine learning problems, the availability of several classifiers trained on different data or features makes the combination of pattern classifiers of great interest. To combine distinct sources of information, it is necessary to represent the outputs of the classifiers in a common space via a transformation called calibration. The most classical way is to use class membership probabilities. However, using a single probability measure may be insufficient to model the uncertainty induced by the calibration step, especially when there are few training data. In this paper, we extend classical probabilistic calibration methods to the evidential framework. Experimental results from the calibration of SVM classifiers show the value of using belief functions in classification problems.
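The classical probabilistic calibration that such an evidential approach generalizes is typified by Platt scaling: fit a sigmoid mapping raw SVM scores to class probabilities on held-out data. A minimal sketch (the evidential extension itself is out of scope here; function names are illustrative):

```python
# Sketch: Platt scaling, i.e. sigmoid calibration of SVM decision scores.
# Fitting the sigmoid on a held-out split avoids optimistically biased
# probabilities.
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import LinearSVC

def platt_calibrate(X, y, seed=0):
    X_fit, X_cal, y_fit, y_cal = train_test_split(X, y, random_state=seed)
    svm = LinearSVC().fit(X_fit, y_fit)
    scores = svm.decision_function(X_cal).reshape(-1, 1)
    sigmoid = LogisticRegression().fit(scores, y_cal)   # P(y=1 | score)
    def predict_proba(X_new):
        s = svm.decision_function(X_new).reshape(-1, 1)
        return sigmoid.predict_proba(s)[:, 1]
    return predict_proba
```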

17.
Learning from imbalanced data, where the number of observations in one class is significantly larger than that in the other class, has gained considerable attention in the machine learning community. Assuming the difficulty in predicting each class is similar, most standard classifiers will tend to predict the majority class well. This study uses tornado data, which are highly imbalanced because tornadoes are rare events: the severe weather data used herein contain thunderstorm circulations (mesocyclones) that produce tornadoes in approximately 6.7% of the total number of observations. However, since tornadoes are high-impact weather events, it is important to predict the minority class with high accuracy. In this study, we apply support vector machines (SVMs) and logistic regression with and without a midpoint threshold adjustment on the probabilistic outputs, random forest, and rotation forest for tornado prediction. Feature selection with SVM-recursive feature elimination was also performed to identify the most important features or variables for predicting tornadoes. The results showed that the threshold adjustment on SVMs provided better performance than the other classifiers.
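The threshold-adjustment idea is simply to move the decision cutoff off the default 0.5 so the rare class is not drowned out. An illustrative sketch that picks the threshold maximizing F1 on validation data (the paper tunes a "midpoint" threshold; F1 is one stand-in criterion, an assumption here):

```python
# Sketch: tuning the probability threshold for a rare positive class.
import numpy as np
from sklearn.metrics import f1_score

def tune_threshold(p_val, y_val):
    thresholds = np.linspace(0.01, 0.99, 99)
    scores = [f1_score(y_val, (p_val >= t).astype(int)) for t in thresholds]
    return thresholds[int(np.argmax(scores))]

# Usage: p_val = clf.predict_proba(X_val)[:, 1]; t = tune_threshold(p_val, y_val)
# Then predict the minority class whenever predict_proba(X)[:, 1] >= t.
```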

18.
This paper considers the problem of learning multinomial distributions from a sample of independent observations. The Bayesian approach usually assumes a prior Dirichlet distribution over the probabilities of the different possible values. However, there is no consensus on the parameters of this Dirichlet distribution. Here, it will be shown that this is not a simple problem, by providing examples in which different selection criteria are reasonable. To address this, the Imprecise Dirichlet Model (IDM) was introduced. But this model has important drawbacks, such as the problems associated with learning from indirect observations. As an alternative approach, the Imprecise Sample Size Dirichlet Model (ISSDM) is introduced and its properties are studied. The prior distribution over the parameters of a multinomial distribution is the basis for learning Bayesian networks using Bayesian scores. Here, we show that the ISSDM can be used to learn imprecise Bayesian networks, also called credal networks when all the distributions share a common graphical structure. Some experiments are reported on the use of the ISSDM to learn the structure of a graphical model and to build supervised classifiers.
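For reference, the IDM's interval estimates have a simple closed form: with category counts $n_i$, total $N$, and hyperparameter $s$, the probability of category $i$ lies in $[n_i/(N+s),\,(n_i+s)/(N+s)]$. A tiny sketch:

```python
# Sketch: Imprecise Dirichlet Model interval estimates for a multinomial.
import numpy as np

def idm_intervals(counts, s=2.0):
    counts = np.asarray(counts, dtype=float)
    N = counts.sum()
    return counts / (N + s), (counts + s) / (N + s)

lo, hi = idm_intervals([8, 1, 1])   # 10 observations over 3 categories, s = 2
# lo = [0.667, 0.083, 0.083], hi = [0.833, 0.25, 0.25]
```

Note how the interval width $s/(N+s)$ shrinks as data accumulate, which is the model's way of expressing prior ignorance without committing to specific Dirichlet parameters.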

19.
This paper proposes a variant of the generalized learning vector quantizer (GLVQ) that explicitly optimizes the area under the receiver operating characteristic (ROC) curve for binary classification problems instead of the classification accuracy, which is frequently not appropriate for classifier evaluation. This is particularly important in the case of overlapping class distributions, when the user has to decide about the trade-off between true-positive and false-positive performance. The model retains the idea of prototype-based learning vector quantization trained by stochastic gradient descent. For this purpose, a GLVQ-based cost function is presented which describes the area under the ROC curve in terms of a sum of local discriminant functions. This cost function reflects the rank statistics underlying ROC analysis, which are incorporated into the design of the prototype-based discriminant function. The resulting learning scheme for the prototype vectors uses structured inputs, i.e., ordered pairs of data vectors from both classes.
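The rank statistic in question is the Wilcoxon-Mann-Whitney form of the AUC: the fraction of positive/negative pairs the discriminant function orders correctly. A short sketch of that identity (this is the quantity being optimized, not the paper's learning rule):

```python
# Sketch: AUC as the Wilcoxon-Mann-Whitney pairwise rank statistic.
import numpy as np

def auc_rank(scores, labels):
    pos = scores[labels == 1][:, None]   # scores of positive samples
    neg = scores[labels == 0][None, :]   # scores of negative samples
    correct = (pos > neg).mean()         # correctly ordered pairs
    ties = (pos == neg).mean()           # ties count one half
    return correct + 0.5 * ties
```

The pairwise structure of this statistic is exactly why the learning scheme consumes ordered pairs of data vectors, one from each class.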

20.
This paper presents an error analysis for classification algorithms generated by regularization schemes with polynomial kernels. Explicit convergence rates are provided for support vector machine (SVM) soft margin classifiers. The misclassification error can be estimated by the sum of the sample error and the regularization error. The main difficulty in studying algorithms with polynomial kernels is the regularization error, which deeply involves the degrees of the kernel polynomials. Here we overcome this difficulty by bounding the reproducing kernel Hilbert space norm of Durrmeyer operators and estimating the rate of approximation by Durrmeyer operators in a weighted $L^1$ space (the weight is a probability distribution). Our study shows that the regularization parameter should decrease exponentially fast with the sample size, which is a special feature of polynomial kernels. Dedicated to Charlie Micchelli on the occasion of his 60th birthday. Mathematics subject classifications (2000): 68T05, 62J02. The first author (Ding-Xuan Zhou) is supported partially by the Research Grants Council of Hong Kong (Project No. CityU 103704).
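The sample/regularization split referred to above is standard in this literature and can be written as follows, with notation assumed for illustration ($f_z$ the learned classifier, $f_\lambda$ the regularized target, $f_c$ the Bayes rule, $\mathcal{E}$ the generalization error); this is the generic form, not a quotation from the paper:

```latex
% Generic decomposition; the inequality is immediate since
% \lambda \|f_\lambda\|_K^2 \ge 0.
\mathcal{E}(f_z) - \mathcal{E}(f_c)
  \le \underbrace{\mathcal{E}(f_z) - \mathcal{E}(f_\lambda)}_{\text{sample error}}
    + \underbrace{\mathcal{E}(f_\lambda) - \mathcal{E}(f_c)
        + \lambda \|f_\lambda\|_K^2}_{\text{regularization error}}
```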
