Similar articles (20 found)
1.
In this paper, we consider a scale adjusted-type distance-based classifier for high-dimensional data. We first give such a classifier that can ensure high accuracy in misclassification rates for two-class classification. We show that the classifier is not only consistent but also asymptotically normal for high-dimensional data. We provide sample size determination so that misclassification rates are no more than a prespecified value. We propose a classification procedure called the misclassification rate adjusted classifier. We further extend the classifier to multiclass classification. We show that the classifier still enjoys these asymptotic properties and ensures high accuracy in misclassification rates for multiclass classification. Finally, we demonstrate the proposed classifier in actual data analyses by using a microarray data set.

2.
Neural network classifiers have been widely used in classification because of their adaptive and parallel processing abilities. This paper concerns the classification of underwater passive sonar signals radiated by ships using neural networks. The classification process can be divided into two stages: signal preprocessing and feature extraction, followed by recognition. In the preprocessing and feature extraction stage, the wavelet transform (WT) is used to extract tonal features from the average power spectral density (APSD) of the input data. In the classification stage, two kinds of neural network classifiers are used to evaluate the classification results: a hyperplane-based classifier, the Multilayer Perceptron (MLP), and a kernel-based classifier, the Adaptive Kernel Classifier (AKC). The experimental results obtained from MLP with different configurations and algorithms show that the bipolar continuous function possesses a wider range and a higher value of the learning rate than the unipolar continuous function. In addition, AKC with a fixed radius (modified AKC) sometimes performs better than AKC, but it takes more training time to select the width of the receptive field. More importantly, networks trained with tonal features extracted by WT achieve a 96% or 94% correct classification rate, whereas networks trained with the original APSDs achieve only 80%.

3.
Terrorism with weapons of mass destruction (WMDs) is an urgent threat to homeland security. The process of counter-WMD terrorism often involves multiple government and terrorist group players, which is under-studied in the literature. In this paper, we first consider two subgames: a proliferation game between two terrorist groups or cells (in which one group, which runs the black market for profit, proliferates to the other, which carries out the attack; this is modelled as a terrorism supply chain) and a subsidization game between two governments (in which one potential WMD victim government subsidizes the other, host government, which can interfere with terrorist activities). We then integrate these two subgames to study how the victim government can use subsidization to induce the host government to disrupt the terrorism supply chain. To our knowledge, this is the first game-theoretic study for modelling and optimally disrupting a terrorism supply chain in a complex four-player scenario. We find that in the integrated game, when proliferation payment is high or low, the practical terrorist group will proliferate and not proliferate, respectively, regardless of government decisions. In contrast, in the subsidization subgame between the two governments, the decision of subsidization depends on its cost. When proliferation payment is medium, the decision of subsidization depends not only on its cost but also on the preparation cost and the attacking cost. Our findings could assist in government policymaking.

4.
In this study, we present Incremental Learning and Decremented Characterization of Regularized Generalized Eigenvalue Classification (ILDC-ReGEC), a novel algorithm to train a generalized eigenvalue classifier with a substantially smaller subset of points and features of the original data. The proposed method provides a constructive way to understand the influence of new training data on an existing classification model and the grouping of features that determine the class of samples. We show through numerical experiments that this technique has comparable accuracy with respect to other methods. Furthermore, experiments show that it is possible to obtain a classification model with about 30% of the training samples and less than 5% of the initial features. A MATLAB implementation of the ILDC-ReGEC algorithm is freely available from the authors. Research partially supported by NSF and Air Force grants.

5.
Classifying magnetic resonance spectra is often difficult due to the curse of dimensionality: a high-dimensional feature space is coupled with a small sample size. We present an aggregation strategy that combines predicted disease states from multiple classifiers using several fuzzy integration variants. Rather than using all input features for each classifier, these multiple classifiers are presented with different, randomly selected subsets of the spectral features. Results from a set of detailed experiments using this strategy are carefully compared against classification performance benchmarks. We empirically demonstrate that the aggregated predictions are consistently superior to the corresponding prediction from the best individual classifier.

6.
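Identifying the type of a combat group is an important problem in force aggregation. To address the uncertainty of the intelligence processed and the knowledge used in type identification, and in particular the ambiguity of viewpoint in both intelligence and knowledge, this paper proposes a method for representing and combining intelligence based on evidence theory and gives a template representation of group types. On this basis, it proposes a combat group type identification method based on fuzzy matching between combined intelligence and templates. The method can be applied to force aggregation at every level to support military decision-making at each command level and to improve the speed and efficiency of decisions.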
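
The abstract does not say which combination rule is used, so the following is only a minimal sketch that assumes Dempster's rule, the standard combination rule of evidence theory. The group types (`armor`, `infantry`) and the mass values are hypothetical and chosen purely for illustration.

```python
from itertools import product


def dempster_combine(m1, m2):
    """Combine two mass functions (dict: frozenset -> mass) with Dempster's rule."""
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        common = a & b
        if common:
            # Agreeing evidence: the product of masses supports the intersection.
            combined[common] = combined.get(common, 0.0) + mass_a * mass_b
        else:
            # Disjoint focal sets contribute to the conflict mass.
            conflict += mass_a * mass_b
    if conflict >= 1.0:
        raise ValueError("the two bodies of evidence are totally conflicting")
    # Renormalize by the non-conflicting mass.
    return {k: v / (1.0 - conflict) for k, v in combined.items()}


# Hypothetical frame of discernment and two hypothetical intelligence reports.
ARMOR, INFANTRY = "armor", "infantry"
THETA = frozenset({ARMOR, INFANTRY})  # "unknown": mass not committed to a type

report_1 = {frozenset({ARMOR}): 0.6, THETA: 0.4}
report_2 = {frozenset({ARMOR}): 0.5, frozenset({INFANTRY}): 0.3, THETA: 0.2}

fused = dempster_combine(report_1, report_2)
for focal_set, mass in sorted(fused.items(), key=lambda kv: -kv[1]):
    print(sorted(focal_set), round(mass, 4))
```

Mass assigned to the whole frame `THETA` expresses uncommitted belief, which is how evidence theory represents the kind of ambiguity the abstract mentions.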

7.
Feature selection plays an important role in the successful application of machine learning techniques to large real-world datasets. Avoiding model overfitting, especially when the number of features far exceeds the number of observations, requires selecting informative features and/or eliminating irrelevant ones. Searching for an optimal subset of features can be computationally expensive. Functional magnetic resonance imaging (fMRI) produces datasets with such characteristics creating challenges for applying machine learning techniques to classify cognitive states based on fMRI data. In this study, we present an embedded feature selection framework that integrates sparse optimization for regularization (or sparse regularization) and classification. This optimization approach attempts to maximize training accuracy while simultaneously enforcing sparsity by penalizing the objective function for the coefficients of the features. This process allows many coefficients to become zero, which effectively eliminates their corresponding features from the classification model. To demonstrate the utility of the approach, we apply our framework to three different real-world fMRI datasets. The results show that regularized classifiers yield better classification accuracy, especially when the number of initial features is large. The results further show that sparse regularization is key to achieving scientifically-relevant generalizability and functional localization of classifier features. The approach is thus highly suited for analysis of fMRI data.
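
To illustrate how a sparsity penalty can set coefficients exactly to zero, here is a minimal sketch of soft-thresholding, the proximal step associated with an L1 penalty. The abstract does not name its solver or penalty form, so this is an assumption made for illustration; the coefficient values and the threshold `lam` are hypothetical.

```python
def soft_threshold(w, lam):
    """Shrink w toward zero by lam; values with |w| <= lam become exactly zero."""
    if w > lam:
        return w - lam
    if w < -lam:
        return w + lam
    return 0.0


# Hypothetical coefficients of five features and a hypothetical penalty strength.
coeffs = [2.5, -0.3, 0.05, -1.2, 0.4]
lam = 0.5

shrunk = [soft_threshold(w, lam) for w in coeffs]
# Features whose coefficient survived the shrinkage stay in the model.
selected = [i for i, w in enumerate(shrunk) if w != 0.0]

print(shrunk)
print(selected)
```

Features whose coefficients fall inside the threshold are dropped from the model, which is the embedded feature selection effect the abstract describes.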

8.
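Violent terrorists deliberately avoid the tightly defended perimeters of key facilities such as government offices and airports, and instead attack crowded places that are not yet effectively defended, such as morning markets and railway station ticket windows. This paper considers the topological characteristics of government counter-terrorism defense, namely whether the area covered by government counter-terrorism forces and the area attacked by terrorists are disjoint, tangent, intersecting, or nested. It builds an evolutionary game model of violent terrorist incidents, analyzes equilibrium stability under multiple scenarios, and runs social simulations of the theoretical results for these scenarios on the NetLogo platform. The results show that the evolutionary equilibrium strategies of the government and the terrorists depend on several factors, including the government's coverage area, the terrorists' attack area, and the government's defense costs and benefits. As the area under effective government control grows, the probability that terrorists choose to attack keeps falling until they adopt a no-attack strategy.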

9.
10.
The problem addressed in this paper arose as a by-product of efforts to solve problems involving multiple goals by linking linear and goal programming models. The critical issue was that some forms of interdependence among the goals could not be handled in the programming models. Here we deal with a set of goals (with realistic counterparts in a Finnish plywood industry) in which one subset of the goals is (i) conflicting, another subset is (ii) unilaterally supporting and a third subset is (iii) mutually supporting. It is furthermore observed that the elements of a studied set of goals may be partly independent and partly interdependent, which makes the context a full-fledged MCDM problem. It is tackled with a technique based on the theory of fuzzy sets, the conceptual framework for fuzzy decisions and the algorithms developed for fuzzy mathematical programming. The resulting fuzzy multiobjective programming model is simplified and tested with the help of a fairly complex numerical example.

11.
The aim of this paper is to develop a Parallel Scatter Search metaheuristic for solving the Feature Subset Selection Problem in classification. Given a set of instances characterized by several features, the classification problem consists of assigning a class to each instance. The Feature Subset Selection Problem consists of selecting a relevant subset of features from the initial set in order to classify future instances. We propose two methods for combining solutions in the Scatter Search metaheuristic. These methods provide two sequential algorithms that are compared with a recent Genetic Algorithm and with a parallelization of the Scatter Search. This parallelization is obtained by running the two combination methods simultaneously. Parallel Scatter Search performs better than the sequential algorithms.

12.
One issue in data classification problems is to find an optimal subset of instances to train a classifier. Training sets that represent the characteristics of each class well have better chances of building a successful predictor. There are cases where data are redundant or take large amounts of computing time in the learning process. To overcome this issue, instance selection techniques have been proposed. These techniques remove examples from the data set so that classifiers are built faster and, in some cases, with better accuracy. Some of these techniques are based on nearest neighbors, ordered removal, random sampling and evolutionary methods. The weaknesses of these methods generally involve lack of accuracy, overfitting, lack of robustness when the data set size increases and high complexity. This work proposes a simple and fast immune-inspired suppressive algorithm for instance selection, called SeleSup. According to self-regulation mechanisms, those cells unable to neutralize danger tend to disappear from the organism. Therefore, by analogy, data not relevant to the learning of a classifier are eliminated from the training process. The proposed method was compared with three important instance selection algorithms on a number of data sets. The experiments showed that our mechanism substantially reduces the data set size and is accurate and robust, especially on larger data sets.

13.
In developing a classification model for assigning observations of unknown class to one of a number of specified classes using the values of a set of features associated with each observation, it is often desirable to base the classifier on a limited number of features. Mathematical programming discriminant analysis methods for developing classification models can be extended for feature selection. Classification accuracy can be used as the feature selection criterion by using a mixed integer programming (MIP) model in which a binary variable is associated with each training sample observation, but the binary variable requirements limit the size of problems to which this approach can be applied. Heuristic feature selection methods for problems with large numbers of observations are developed in this paper. These heuristic procedures, which are based on the MIP model for maximizing classification accuracy, are then applied to three credit scoring data sets.

14.
Supervised classification is one of the most used methods in machine learning. For data characterized by a large number of features, a critical issue is dealing with redundant or irrelevant information. To this end, an effective algorithm needs to identify a suitable subset of features, as small as possible, for the classification. In this work we present ReGEC_L1, a classifier with embedded feature selection based on the Regularized Generalized Eigenvalue Classifier (ReGEC) and equipped with an L1-norm regularization term. We detail the mathematical formulation and the numerical algorithm. Numerical results, obtained on some de facto standard benchmark data sets, show that the approach we propose produces a remarkable selection of the features, without losing accuracy in the classification. In that respect, our algorithm seems to compare favorably with the SVM_L1 method. A MATLAB implementation of ReGEC_L1 is available at http://www.na.icar.cnr.it/~mariog/regec_l1.html.

15.
In this paper, the classification power of the eigenvalues of six graph-associated matrices is investigated. Each matrix contains a certain type of geometric/spatial information, which may be important for the classification process. The performance of the different feature types is evaluated on two data sets. The first is a benchmark data set for optical character recognition, where the extracted eigenvalues were used as feature vectors for multi-class classification with support vector machines. Classification results are presented for all six feature types, as well as for classifier combinations at decision level. For the decision level combination, probabilistic output support vector machines have been applied, with a performance of up to 92.4%. To also investigate the power of the spectra for time-dependent tasks, a second data set was investigated, consisting of human activities in video streams. To model the time dependency, hidden Markov models were used and the classification rate reached 98.3%.

16.
A knowledge-based linear Tikhonov regularization classification model for tornado discrimination is presented. Twenty-three attributes, based on the National Severe Storms Laboratory’s Mesoscale Detection Algorithm, are used as prior knowledge. Threshold values for these attributes are employed to discriminate the data into two classes (tornado, non-tornado). The Weather Surveillance Radar 1988 Doppler (WSR-88D) is used as a source of data streaming every 6 min. The combination of data and prior knowledge is used in the development of a least squares problem that can be solved using matrix or iterative methods. Advantages of this formulation include explicit expressions for the classification weights of the classifier and its ability to incorporate and handle prior knowledge directly in the classifier. Comparison of the present approach to that of Fung et al. [in Proceedings of Neural Information Processing Systems (NIPS 2002), Vancouver, BC, December 10–12, 2002], over a suite of forecast evaluation indices, demonstrates that the Tikhonov regularization model is superior for discriminating tornadic from non-tornadic storms.
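
For orientation, the generic Tikhonov-regularized least squares problem and its closed-form solution can be written as below. This is the textbook form only, not the paper's knowledge-based formulation: the symbols $A$ (data matrix), $y$ (class labels), $w$ (classification weights) and $\lambda > 0$ (regularization parameter) are assumptions introduced here.

```latex
\min_{w} \; \|Aw - y\|_2^2 + \lambda \|w\|_2^2
\qquad\Longrightarrow\qquad
w^{*} = \left(A^{\top}A + \lambda I\right)^{-1} A^{\top} y
```

Setting the gradient $2A^{\top}(Aw - y) + 2\lambda w$ to zero gives the expression on the right, which is one way to obtain explicit weights by matrix methods.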

17.
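As gene-splicing technology matures, humanity may face a new global genetic weapons race. Scientists have warned that genetic weapons capable of wiping an ethnic group off the earth could become a reality within five years, because the "window of opportunity" for halting genetic weapons, the top priority among biotechnological weapons, is narrowing. Research on the development of genetic weapons therefore has both theoretical and practical significance. This paper discusses the threat that genetic weapons development poses to China's security and the reasons it is hard to control, and builds a genetic weapon lethality model.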

18.
陶朝杰  杨进 《经济数学》2020,37(3):214-220
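Fake reviews are an unavoidable problem in the development of e-commerce. To handle class imbalance in online review data, this paper proposes a fake review detection method based on the BalanceCascade-GBDT algorithm. BalanceCascade shrinks the majority-class sample space step by step by setting the false positive rate of each classifier, and then ensembles all base classifiers into the final classifier. GBDT is widely used for classification because of its high accuracy and interpretability and, being an unstable learner that is sensitive to sample perturbation, is well suited as the base classifier. The model is evaluated on the Yelp review dataset with AUC as the metric and compared with logistic regression, random forest, and neural network algorithms; the experiments demonstrate the effectiveness of the method.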
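
Since AUC is the evaluation metric, here is a minimal sketch of how it can be computed directly from its pairwise definition: the fraction of (positive, negative) pairs in which the positive sample gets the higher score, with ties counted as one half. The labels and scores are hypothetical and unrelated to the Yelp data; the function name `auc` is likewise introduced here for illustration.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Hypothetical imbalanced sample: 2 fake reviews (label 1) among 8 reviews.
labels = [1, 0, 0, 0, 0, 1, 0, 0]
scores = [0.9, 0.2, 0.4, 0.1, 0.8, 0.7, 0.3, 0.7]

print(auc(labels, scores))
```

Because it depends only on how positives rank against negatives, AUC does not change with the class ratio the way plain accuracy does, which makes it a common choice for imbalanced data.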

19.
The feature selection problem is an interesting and important topic which is relevant for a variety of database applications. This paper uses the Tabu Search metaheuristic algorithm to implement a feature subset selection procedure, while the nearest neighbor classification method is used for the classification task. Tabu Search is a general metaheuristic procedure that guides the search to obtain good solutions in complex solution spaces. Several metrics are used in the nearest neighbor classification method, such as the Euclidean distance, the standardized Euclidean distance, the Mahalanobis distance, the city block metric, the cosine distance and the correlation distance, in order to identify the most significant metric for the nearest neighbor classifier. The performance of the proposed algorithms is tested using various benchmark datasets from the UCI Machine Learning Repository.
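
The choice of metric can change a nearest neighbor decision, as this minimal sketch shows for three of the metrics listed above. It is not the paper's implementation: the function names, the two training points and the query are hypothetical, and the Tabu Search feature selection step is omitted.

```python
import math


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def city_block(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))


def cosine_distance(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def nearest_neighbor(train, query, metric):
    """Return the label of the training point closest to query under metric."""
    closest = min(train, key=lambda item: metric(item[0], query))
    return closest[1]


# Hypothetical training set: (feature vector, class label).
train = [((1.0, 1.0), "A"), ((4.0, 0.5), "B")]
query = (3.0, 3.0)

for metric in (euclidean, city_block, cosine_distance):
    print(metric.__name__, nearest_neighbor(train, query, metric))
```

Here the query is closer to "B" in Euclidean and city block distance but points in the same direction as "A", so the cosine distance picks "A".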

20.
In this paper, we propose a genetic programming (GP) based approach to evolve fuzzy rule based classifiers. For a c-class problem, a classifier consists of c trees. Each tree, T_i, of the multi-tree classifier represents a set of rules for class i. During the evolutionary process, the inaccurate/inactive rules of the initial set of rules are removed by a cleaning scheme. This allows good rules to survive and eventually determines the number of rules. In the beginning, our GP scheme uses a randomly selected subset of features and then evolves the features to be used in each rule. The initial rules are constructed using prototypes, which are generated randomly as well as by the fuzzy k-means (FKM) algorithm. Experiments are conducted in three different ways: using only randomly generated rules, using a mixture of randomly generated rules and FKM prototype based rules, and using exclusively FKM prototype based rules. The performance of the classifiers is comparable irrespective of the type of initial rules. This emphasizes the novelty of the proposed evolutionary scheme. In this context, we propose a new mutation operation to alter the rule parameters. The GP scheme optimizes the structure of rules as well as the parameters involved. The method is validated on six benchmark data sets and the performance of the proposed scheme is found to be satisfactory.
