期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

Aggregating multiple classification results using fuzzy integration and stochastic feature selection 总被引：1，自引：0，他引：1

Nick J. Pizzi Witold Pedrycz 《International Journal of Approximate Reasoning》2010,51(8):883-894

Classifying magnetic resonance spectra is often difficult due to the curse of dimensionality; scenarios in which a high-dimensional feature space is coupled with a small sample size. We present an aggregation strategy that combines predicted disease states from multiple classifiers using several fuzzy integration variants. Rather than using all input features for each classifier, these multiple classifiers are presented with different, randomly selected, subsets of the spectral features. Results from a set of detailed experiments using this strategy are carefully compared against classification performance benchmarks. We empirically demonstrate that the aggregated predictions are consistently superior to the corresponding prediction from the best individual classifier. 相似文献

2.

Construction of classifier ensembles by means of artificial immune systems 总被引：2，自引：0，他引：2

Nicolás García-Pedrajas Colin Fyfe 《Journal of Heuristics》2008,14(3):285-310

This paper presents the application of Artificial Immune Systems to the design of classifier ensembles. Ensembles of classifiers are a very interesting alternative to single classifiers when facing difficult problems. In general, ensembles are able to achieve better performance in terms of learning and generalisation errors. Several papers have shown that the processes of classifier design and combination must be related in order to obtain better ensembles. Artificial Immune Systems are a recent paradigm based on the immune systems of animals. The features of this new paradigm make it very appropriate for the design of systems where many components must cooperate to solve a given task. The design of classifier ensembles can be considered within such a group of systems, as the cooperation of the individual classifiers is able to improve the performance of the overall system. This paper studies the viability of Artificial Immune Systems when dealing with ensemble design. We construct a population of classifiers that is evolved using an Artificial Immune algorithm. From this population of classifiers several different ensembles can be extracted. These ensembles are favourably compared with ensembles obtained using standard methods in 35 real-world classification problems from the UCI Machine Learning Repository. 相似文献

3.

A single interval based classifier

Heeyoung Kim Xiaoming Huo Jianjun Shi 《Annals of Operations Research》2014,216(1):307-325

In many applications, it is desirable to build a classifier that is bounded within an interval. Our motivating example is rooted in monitoring in a stamping process. A novel approach is proposed and examined in this paper. Our method consists of three stages: (1) A baseline of each class is estimated via convex optimization; (2) An “optimal interval” that maximizes the difference among the baselines is identified; (3) A classifier that is based on the “optimal interval” is constructed. We analyze the implementation strategy and properties of the derived algorithm. The derived classifier is named an interval based classifier (IBC) and can be computed via a low-order-of-complexity algorithm. Comparing to existing state-of-the-art classifiers, we illustrate the advantages of our approach. To showcase its usage in applications, we apply the IBC to a set of tonnage curves from stamping processes, and observed superior performance. This method can help identifying faulty situations in manufacturing. The computational steps of IBC take advantage of operations-research methodology. IBC can serve as a general data mining tool, when the features are based on single intervals. 相似文献

4.

Classification Modulo Invariance,With Application to Face Recognition

《Journal of computational and graphical statistics》2013,22(4):829-852

This article presents techniques for constructing classifiers that combine statistical information from training data with tangent approximations to known transformations; it demonstrates the techniques by applying them to a face recognition task. Our approach is to build Bayes classifiers with approximate class-conditional probability densities for measured data. The high dimension of the measurements in modern classification problems such as speech or image recognition makes inferring probability densities from feasibly sized training datasets difficult. We address the difficulty by imposing severely simplifying assumptions and exploiting a priori information about transformations to which classification should be invariant. For the face recognition task, we used a five-parameter group of such transformations consisting of rotation, shifts, and scalings. On the face recognition task, a classifier based on our techniques has an error rate that is 20% lower than that of the best algorithm in a reference software distribution. 相似文献

5.

A Bayesian framework for the combination of classifier outputs

H Zhu P A Beling G A Overstreet 《The Journal of the Operational Research Society》2002,53(7):719-727

We explore a Bayesian framework for constructing combinations of classifier outputs, as a means to improving overall classification results. We propose a sequential Bayesian framework to estimate the posterior probability of being in a certain class given multiple classifiers. This framework, which employs meta-Gaussian modelling but makes no assumptions about the distribution of classifier outputs, allows us to capture nonlinear dependencies between the combined classifiers and individuals. An important property of our method is that it produces a combined classifier that dominates the individuals upon which it is based in terms of Bayes risk, error rate, and receiver operating characteristic (ROC) curve. To illustrate the method, we show empirical results from the combination of credit scores generated from four different scoring models. 相似文献

6.

The impact of diversity on the accuracy of evidential classifier ensembles

Yaxin Bi 《International Journal of Approximate Reasoning》2012,53(4):584-607

Diversity being inherent in classifiers is widely acknowledged as an important issue in constructing successful classifier ensembles. Although many statistics have been employed in measuring diversity among classifiers to ascertain whether it correlates with ensemble performance in the literature, most of these measures are incorporated and explained in a non-evidential context. In this paper, we provide a modelling for formulating classifier outputs as triplet mass functions and a uniform notation for defining diversity measures. We then assess the relationship between diversity obtained by four pairwise and non-pairwise diversity measures and the improvement in accuracy of classifiers combined in different orders by Demspter’s rule of combination, Smets’ conjunctive rule, the Proportion and Yager’s rules in the framework of belief functions. Our experimental results demonstrate that the accuracy of classifiers combined by Dempster’s rule is not strongly correlated with the diversity obtained by the four measures, and the correlation between the diversity and the ensemble accuracy made by Proportion and Yager’s rules is negative, which is not in favor of the claim that increasing diversity could lead to reduction of generalization error of classifier ensembles. 相似文献

7.

Marginal and simultaneous predictive classification using stratified graphical models

Henrik Nyman Jie Xiong Johan Pensar Jukka Corander 《Advances in Data Analysis and Classification》2016,10(3):305-326

An inductive probabilistic classification rule must generally obey the principles of Bayesian predictive inference, such that all observed and unobserved stochastic quantities are jointly modeled and the parameter uncertainty is fully acknowledged through the posterior predictive distribution. Several such rules have been recently considered and their asymptotic behavior has been characterized under the assumption that the observed features or variables used for building a classifier are conditionally independent given a simultaneous labeling of both the training samples and those from an unknown origin. Here we extend the theoretical results to predictive classifiers acknowledging feature dependencies either through graphical models or sparser alternatives defined as stratified graphical models. We show through experimentation with both synthetic and real data that the predictive classifiers encoding dependencies have the potential to substantially improve classification accuracy compared with both standard discriminative classifiers and the predictive classifiers based on solely conditionally independent features. In most of our experiments stratified graphical models show an advantage over ordinary graphical models. 相似文献

8.

Classification of underwater signals using wavelet transforms and neural networks

《Mathematical and Computer Modelling》1998,27(2):47-60

Neural network classifiers have been widely used in classification due to its adaptive and parallel processing ability. This paper concerns classification of underwater passive sonar signals radiated by ships using neural networks. Classification process can be divided into two stages: one is the signal preprocessing and feature extraction, the other is the recognition process. In the preprocessing and feature extraction stage, the wavelet transform (WT) is used to extract tonal features from the average power spectral density (APSD) of the input data. In the classification stage, two kinds of neural network classifiers are used to evaluate the classification results, inclusive of the hyperplane-based classifier—Multilayer Perceptron (MLP)—and the kernel-based classifier—Adaptive Kernel Classifier (AKC). The experimental results obtained from MLP with different configurations and algorithms show that the bipolar continuous function possesses a wider range and a higher value of the learning rate than the unipolar continuous function. Besides, AKC with fixed radius (modified AKC) sometimes gives better performance than AKC, but the former takes more training time in selecting the width of the receptive field. More important, networks trained with tonal features extracted by WT has 96% or 94% correction rate, but the training with original APSDs only have 80% correction rate. 相似文献

9.

Direction-Projection-Permutation for High-Dimensional Hypothesis Tests

Susan Wei Chihoon Lee Lindsay Wichers J. S. Marron 《Journal of computational and graphical statistics》2016,25(2):549-569

High-dimensional low sample size (HDLSS) data are becoming increasingly common in statistical applications. When the data can be partitioned into two classes, a basic task is to construct a classifier that can assign objects to the correct class. Binary linear classifiers have been shown to be especially useful in HDLSS settings and preferable to more complicated classifiers because of their ease of interpretability. We propose a computational tool called direction-projection-permutation (DiProPerm), which rigorously assesses whether a binary linear classifier is detecting statistically significant differences between two high-dimensional distributions. The basic idea behind DiProPerm involves working directly with the one-dimensional projections of the data induced by binary linear classifier. Theoretical properties of DiProPerm are studied under the HDLSS asymptotic regime whereby dimension diverges to infinity while sample size remains fixed. We show that certain variations of DiProPerm are consistent and that consistency is a nontrivial property of tests in the HDLSS asymptotic regime. The practical utility of DiProPerm is demonstrated on HDLSS gene expression microarray datasets. Finally, an empirical power study is conducted comparing DiProPerm to several alternative two-sample HDLSS tests to understand the advantages and disadvantages of each method. 相似文献

10.

On the convergence of formally diverging neural net-based classifiers

Leonid Berlyand Pierre-Emmanuel Jabin 《Comptes Rendus Mathematique》2018,356(4):395-405

We present an analytical study of gradient descent algorithms applied to a classification problem in machine learning based on artificial neural networks. Our approach is based on entropy–entropy dissipation estimates that yield explicit rates. Specifically, as long as the neural nets remain within a set of “good classifiers”, we establish a striking feature of the algorithm: it mathematically diverges as the number of gradient descent iterations (“time”) goes to infinity but this divergence is only logarithmic, while the loss function vanishes polynomially. As a consequence, this algorithm still yields a classifier that exhibits good numerical performance and may even appear to converge. 相似文献

11.

A Feature Selection Newton Method for Support Vector Machine Classification 总被引：4，自引：1，他引：3

Glenn M. Fung O.L. Mangasarian 《Computational Optimization and Applications》2004,28(2):185-202

A fast Newton method, that suppresses input space features, is proposed for a linear programming formulation of support vector machine classifiers. The proposed stand-alone method can handle classification problems in very high dimensional spaces, such as 28,032 dimensions, and generates a classifier that depends on very few input features, such as 7 out of the original 28,032. The method can also handle problems with a large number of data points and requires no specialized linear programming packages but merely a linear equation solver. For nonlinear kernel classifiers, the method utilizes a minimal number of kernel functions in the classifier that it generates. 相似文献

12.

Multi-class classification using a signomial function

Kyoungmi Hwang Kyungsik Lee Chungmok Lee Sungsoo Park 《The Journal of the Operational Research Society》2015,66(3):434-449

We propose two multi-class classification methods using a signomial function. Each of these methods directly constructs a multi-class classifier by solving a single optimization problem. Since the number of possible signomial terms is extremely large, we propose a column generation method that iteratively generates good signomial terms. Both of these methods obtain better or comparable classification accuracies than existing methods and also provide more sparse classifiers. 相似文献

13.

Evidential calibration of binary SVM classifiers

《International Journal of Approximate Reasoning》2016

In machine learning problems, the availability of several classifiers trained on different data or features makes the combination of pattern classifiers of great interest. To combine distinct sources of information, it is necessary to represent the outputs of classifiers in a common space via a transformation called calibration. The most classical way is to use class membership probabilities. However, using a single probability measure may be insufficient to model the uncertainty induced by the calibration step, especially in the case of few training data. In this paper, we extend classical probabilistic calibration methods to the evidential framework. Experimental results from the calibration of SVM classifiers show the interest of using belief functions in classification problems. 相似文献

14.

Facial expression classification: An approach based on the fusion of facial deformations using the transferable belief model

《International Journal of Approximate Reasoning》2008,47(3):542-567

A method for the classification of facial expressions from the analysis of facial deformations is presented. This classification process is based on the transferable belief model (TBM) framework. Facial expressions are related to the six universal emotions, namely Joy, Surprise, Disgust, Sadness, Anger, Fear, as well as Neutral. The proposed classifier relies on data coming from a contour segmentation technique, which extracts an expression skeleton of facial features (mouth, eyes and eyebrows) and derives simple distance coefficients from every face image of a video sequence. The characteristic distances are fed to a rule-based decision system that relies on the TBM and data fusion in order to assign a facial expression to every face image. In the proposed work, we first demonstrate the feasibility of facial expression classification with simple data (only five facial distances are considered). We also demonstrate the efficiency of TBM for the purpose of emotion classification. The TBM based classifier was compared with a Bayesian classifier working on the same data. Both classifiers were tested on three different databases. 相似文献

15.

Quality-augmented fusion of level-2 and level-3 fingerprint information using DSm theory

Mayank Vatsa Richa Singh Afzel Noore Max M. Houck 《International Journal of Approximate Reasoning》2009,50(1):51-61

Existing algorithms that fuse level-2 and level-3 fingerprint match scores perform well when the number of features are adequate and the quality of images are acceptable. In practice, fingerprints collected under unconstrained environment neither guarantee the requisite image quality nor the minimum number of features required. This paper presents a novel fusion algorithm that combines fingerprint match scores to provide high accuracy under non-ideal conditions. The match scores obtained from level-2 and level-3 classifiers are first augmented with a quality score that is quantitatively determined by applying redundant discrete wavelet transform to a fingerprint image. We next apply the generalized belief functions of Dezert–Smarandache theory to effectively fuse the quality-augmented match scores obtained from level-2 and level-3 classifiers. Unlike statistical and learning based fusion techniques, the proposed plausible and paradoxical reasoning approach effectively mitigates conflicting decisions obtained from classifiers especially when the evidences are imprecise due to poor image quality or limited fingerprint features. The proposed quality-augmented fusion algorithm is validated using a comprehensive database which comprises of rolled and partial fingerprint images of varying quality with arbitrary number of features. The performance is compared with existing fusion approaches for different challenging realistic scenarios. 相似文献

16.

Flexible scan statistic test to detect disease clusters in hierarchical trees

Marcos O. Prates Renato M. Assun??o Marcelo A. Costa 《Computational Statistics》2012,27(4):715-737

相似文献

17.

Sparse optimization in feature selection: application in neuroimaging

K. Kampa S. Mehta C. A. Chou W. A. Chaovalitwongse T. J. Grabowski 《Journal of Global Optimization》2014,59(2-3):439-457

Feature selection plays an important role in the successful application of machine learning techniques to large real-world datasets. Avoiding model overfitting, especially when the number of features far exceeds the number of observations, requires selecting informative features and/or eliminating irrelevant ones. Searching for an optimal subset of features can be computationally expensive. Functional magnetic resonance imaging (fMRI) produces datasets with such characteristics creating challenges for applying machine learning techniques to classify cognitive states based on fMRI data. In this study, we present an embedded feature selection framework that integrates sparse optimization for regularization (or sparse regularization) and classification. This optimization approach attempts to maximize training accuracy while simultaneously enforcing sparsity by penalizing the objective function for the coefficients of the features. This process allows many coefficients to become zero, which effectively eliminates their corresponding features from the classification model. To demonstrate the utility of the approach, we apply our framework to three different real-world fMRI datasets. The results show that regularized classifiers yield better classification accuracy, especially when the number of initial features is large. The results further show that sparse regularization is key to achieving scientifically-relevant generalizability and functional localization of classifier features. The approach is thus highly suited for analysis of fMRI data. 相似文献

18.

A memetic approach to construct transductive discrete support vector machines

Hubertus Brandner Stefan Lessmann Stefan Voß 《European Journal of Operational Research》2013

Transductive learning involves the construction and application of prediction models to classify a fixed set of decision objects into discrete groups. It is a special case of classification analysis with important applications in web-mining, corporate planning and other areas. This paper proposes a novel transductive classifier that is based on the philosophy of discrete support vector machines. We formalize the task to estimate the class labels of decision objects as a mixed integer program. A memetic algorithm is developed to solve the mathematical program and to construct a transductive support vector machine classifier, respectively. Empirical experiments on synthetic and real-world data evidence the effectiveness of the new approach and demonstrate that it identifies high quality solutions in short time. Furthermore, the results suggest that the class predictions following from the memetic algorithm are significantly more accurate than the predictions of a CPLEX-based reference classifier. Comparisons to other transductive and inductive classifiers provide further support for our approach and suggest that it performs competitive with respect to several benchmarks. 相似文献

19.

Diverse reduct subspaces based co-training for partially labeled data 总被引：1，自引：0，他引：1

Duoqian Miao Can Gao Nan Zhang Zhifei Zhang 《International Journal of Approximate Reasoning》2011,52(8):1103-1117

Rough set theory is an effective supervised learning model for labeled data. However, it is often the case that practical problems involve both labeled and unlabeled data, which is outside the realm of traditional rough set theory. In this paper, the problem of attribute reduction for partially labeled data is first studied. With a new definition of discernibility matrix, a Markov blanket based heuristic algorithm is put forward to compute the optimal reduct of partially labeled data. A novel rough co-training model is then proposed, which could capitalize on the unlabeled data to improve the performance of rough classifier learned only from few labeled data. The model employs two diverse reducts of partially labeled data to train its base classifiers on the labeled data, and then makes the base classifiers learn from each other on the unlabeled data iteratively. The classifiers constructed in different reduct subspaces could benefit from their diversity on the unlabeled data and significantly improve the performance of the rough co-training model. Finally, the rough co-training model is theoretically analyzed, and the upper bound on its performance improvement is given. The experimental results show that the proposed model outperforms other representative models in terms of accuracy and even compares favorably with rough classifier trained on all training data labeled. 相似文献

20.

Interclass analysis in symbolic pattern classification problems

Manabu Ichino Shinya Ishikawa 《Computational Statistics》2006,21(2):309-323

相似文献