Similar Documents
20 similar documents found (search time: 42 ms)
1.
Feature selection of noise sources is important for noise source detection and classification. In this paper, a new rough-set-based feature selection method is presented. Based on this method, a noise source automatic classification system (NSACS) has been designed and validated. The key idea of the method is that the most effective features are those that, when used for classification, distinguish the largest number of samples belonging to different noise-source classes. The approach was applied within the NSACS system to select relevant features for artificial and real-world datasets, and the results show that it correctly selects all the relevant features of the artificial datasets while drastically reducing the number of features. Across all five datasets, the number of classification features after selection drops to 35% of the original and the classification accuracy increases by about 14%. For the underwater noise source dataset, the number of features drops to one fifth and the classification accuracy increases by about 6% after feature selection.
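As a rough illustration of the idea in this abstract, the sketch below greedily keeps the feature that separates the largest number of not-yet-distinguished sample pairs from different classes; the thresholding rule, toy data, and stopping criterion are illustrative assumptions, not the authors' exact rough-set algorithm.

```python
import numpy as np

def greedy_discernibility_selection(X, y, n_features, tol=1e-6):
    """Greedily pick features that separate the most pairs of samples
    from different classes (a rough-set-style discernibility heuristic)."""
    n = len(y)
    # All pairs of samples that belong to different classes.
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n) if y[i] != y[j]]
    undiscerned = set(range(len(pairs)))
    selected = []
    for _ in range(n_features):
        best_f, best_gain = None, 0
        for f in range(X.shape[1]):
            if f in selected:
                continue
            # Pairs this feature separates (feature values differ by more than tol).
            gain = sum(1 for k in undiscerned
                       if abs(X[pairs[k][0], f] - X[pairs[k][1], f]) > tol)
            if gain > best_gain:
                best_f, best_gain = f, gain
        if best_f is None:          # nothing left to discern
            break
        selected.append(best_f)
        undiscerned = {k for k in undiscerned
                       if abs(X[pairs[k][0], best_f] - X[pairs[k][1], best_f]) <= tol}
    return selected

# Toy example: one fully informative feature plus irrelevant binary noise.
rng = np.random.default_rng(0)
y = np.repeat([0, 1], 20)
X = np.c_[y, rng.integers(0, 2, size=(40, 4))]
print(greedy_discernibility_selection(X, y, n_features=3))   # -> [0]
```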

2.
Raman spectroscopy has the potential to significantly aid in the research and diagnosis of cancer. The information-dense, complex spectra generate massive datasets in which subtle correlations may provide critical clues for biological analysis and pathological classification. Therefore, implementing advanced data mining techniques is imperative for complete, rapid and accurate spectral processing. Numerous recent studies have applied various data mining methods to Raman spectra for classification and biochemical analysis. However, because Raman datasets from biological specimens are often characterized by high dimensionality and low sample numbers, many of these classification models are subject to overfitting. Furthermore, attempts to reduce dimensionality result in transformed feature spaces, making the biological evaluation of significant and discriminative spectral features problematic. We have developed a novel data mining framework optimized for Raman datasets, called Fisher-based Feature Selection Support Vector Machines (FFS-SVM). This framework provides simultaneous supervised classification and user-defined Fisher criterion-based feature selection, reducing overfitting and directly yielding significant wavenumbers from the original feature space. Herein, we investigate five cancerous and non-cancerous breast cell lines using Raman microspectroscopy and our FFS-SVM framework. The framework's classification performance is then compared to several other frequently employed classification methods on four classification tasks. The four tasks were constructed by an unsupervised clustering method that yielded the four categories of cell line groupings (e.g. cancer vs non-cancer) studied. FFS-SVM achieves both high classification accuracies and the extraction of biologically significant features. The top ten most discriminative features are discussed in terms of cell-type-specific biological relevance. Our framework provides comprehensive cellular-level characterization and could potentially lead to the discovery of cancer biomarker-type information, which we have informally termed 'Raman-based spectral biomarkers'. The FFS-SVM framework along with Raman spectroscopy will be used in future studies to investigate in-situ dynamic biological phenomena. Copyright © 2013 John Wiley & Sons, Ltd.
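The exact FFS-SVM formulation is not reproduced in this listing; the sketch below only illustrates the general recipe of ranking wavenumbers by a Fisher criterion and training an SVM on the top-ranked ones, with synthetic data standing in for Raman spectra.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split

def fisher_scores(X, y):
    """Fisher criterion per feature: between-class variance of the class
    means divided by the sum of within-class variances."""
    classes = np.unique(y)
    overall_mean = X.mean(axis=0)
    num = np.zeros(X.shape[1])
    den = np.zeros(X.shape[1])
    for c in classes:
        Xc = X[y == c]
        num += len(Xc) * (Xc.mean(axis=0) - overall_mean) ** 2
        den += len(Xc) * Xc.var(axis=0)
    return num / (den + 1e-12)

# Synthetic stand-in for a high-dimensional, low-sample Raman dataset.
rng = np.random.default_rng(1)
X = rng.standard_normal((60, 500))
y = rng.integers(0, 2, size=60)
X[y == 1, 100:105] += 1.5                     # a few discriminative "wavenumbers"

scores = fisher_scores(X, y)
top = np.argsort(scores)[::-1][:10]           # keep the 10 best-ranked features
Xtr, Xte, ytr, yte = train_test_split(X[:, top], y, random_state=0)
clf = SVC(kernel="linear").fit(Xtr, ytr)
print("selected feature indices:", sorted(top))
print("test accuracy:", clf.score(Xte, yte))
```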

3.
In recent years, drawing lessons from traditional neural network models, increasing attention has been paid to the design of neural network architectures for processing graph-structured data, called graph neural networks (GNN). Graph convolutional networks (GCN) are one family of neural network models within GNN. GCN extend the convolution operation from traditional data (such as images) to graph data; they are essentially feature extractors that aggregate the features of neighborhood nodes into those of the target nodes. In the aggregation step, GCN use the Laplacian matrix to assign different importance to the nodes in the neighborhood of the target nodes. Since graph-structured data are inherently non-Euclidean, we use a non-Euclidean mathematical tool, Riemannian geometry, to analyze graphs (networks). In this paper, we present a novel model for semi-supervised learning called the Ricci curvature-based graph convolutional neural network (RCGCN). The aggregation pattern of RCGCN is inspired by that of GCN: we regard the network as a discrete manifold and use Ricci curvature to assign different importance to the nodes within the neighborhood of the target nodes. Ricci curvature is related to the optimal transport distance and reflects the geometric structure of the underlying space of the network well, so the node importance given by Ricci curvature better reflects the relationships between a target node and its neighbors. The proposed model scales linearly with the number of edges in the network. Experiments demonstrate that RCGCN achieves a significant performance gain over baseline methods on benchmark datasets.
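A minimal sketch of curvature-weighted neighborhood aggregation, assuming the per-edge Ricci curvatures are already available (the placeholder values below are invented) and using an exponential weighting that is only one plausible choice; it is not the full RCGCN model.

```python
import numpy as np

def curvature_weighted_aggregation(H, edges, curvature):
    """One GCN-style propagation step in which each neighbor's contribution
    is weighted by the (shifted) Ricci curvature of the connecting edge,
    instead of the usual Laplacian-based weights."""
    n = H.shape[0]
    W = np.zeros((n, n))
    for (i, j), kappa in zip(edges, curvature):
        w = np.exp(kappa)            # monotone map so weights stay positive
        W[i, j] = W[j, i] = w
    W += np.eye(n)                   # self-loops, as in a standard GCN layer
    D_inv = 1.0 / W.sum(axis=1, keepdims=True)
    return D_inv * (W @ H)           # row-normalized weighted average of neighbors

# Tiny example: 4 nodes, 2-dimensional features, precomputed edge curvatures.
H = np.array([[1., 0.], [0., 1.], [1., 1.], [0., 0.]])
edges = [(0, 1), (1, 2), (2, 3), (0, 2)]
curvature = [0.3, -0.1, 0.5, 0.0]    # placeholder Ollivier-Ricci values
print(curvature_weighted_aggregation(H, edges, curvature))
```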

4.
The domain adaptation problem in transfer learning has received extensive attention in recent years. Existing transfer models for domain alignment typically assume that the label space is completely shared between domains. However, this assumption rarely holds in real industrial settings and limits the application scope of such models. Therefore, a universal domain method is proposed, which not only effectively reduces network failures caused by unknown fault types in the target domain but also removes the premise of a shared label space. The proposed framework accounts for the discrepancy of the fault features exhibited by different fault types and forms a feature center for fault diagnosis by extracting the features of samples of each fault type. Three optimization functions are added to mitigate the negative transfer problem when the model encounters samples of unknown fault types. This study verifies the performance advantages of the framework under variable speed through experiments on multiple datasets. The experimental results show that the proposed method delivers better fault diagnosis performance than related transfer methods for handling unknown mechanical faults.

5.
A multi-model fusion algorithm for near-infrared wavelength selection
Considering the characteristics of near-infrared spectral data, and building on an analysis of single-model wavelength selection methods, a multi-model fusion variable selection method is proposed. It fuses the regression coefficients of multiple models to improve the accuracy and stability of wavelength selection. The method was validated on three industry-standard near-infrared spectral datasets and compared with the UVE-PLS and GA-PLS algorithms. The experimental results show that variable selection with this method improves the predictive ability of the model and reduces its complexity, matching or even outperforming UVE-PLS and GA-PLS, while remaining simple and computationally efficient, and is therefore of broad practical value.
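A small sketch of the multi-model fusion idea, assuming that the fused quantity is the mean absolute regression coefficient of PLS models fitted on bootstrap resamples; the original algorithm's exact fusion rule and datasets are not reproduced here.

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression

def fused_coefficient_selection(X, y, n_models=50, n_keep=20, n_components=5, seed=0):
    """Fuse regression coefficients from many PLS models fitted on bootstrap
    resamples and keep the wavelengths with the largest fused magnitude."""
    rng = np.random.default_rng(seed)
    coefs = np.zeros((n_models, X.shape[1]))
    for m in range(n_models):
        idx = rng.integers(0, len(y), size=len(y))        # bootstrap resample
        pls = PLSRegression(n_components=n_components).fit(X[idx], y[idx])
        coefs[m] = np.abs(pls.coef_).ravel()
    fused = coefs.mean(axis=0)                            # fusion rule: mean |coefficient|
    return np.argsort(fused)[::-1][:n_keep]

# Synthetic "spectra": only wavelengths 40-44 carry the signal.
rng = np.random.default_rng(2)
X = rng.standard_normal((80, 200))
y = X[:, 40:45].sum(axis=1) + 0.1 * rng.standard_normal(80)
print(sorted(fused_coefficient_selection(X, y)))
```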

6.
Secure user access to devices and datasets is widely enabled by fingerprint or face recognition. Organizing the necessarily large secure digital object datasets, whose objects may contain images, text, video or audio, involves efficient classification and feature retrieval processing. This usually requires multidimensional methods applicable to data represented through a family of probability distributions, and information geometry is an appropriate context for such analytic work, whether with maximum likelihood fitted distributions or empirical frequency distributions. Its important contribution is a natural geometric measure structure on families of probability distributions, obtained by representing them as Riemannian manifolds. The distributions are then points lying in this geometrical manifold; different features can be identified and dissimilarities computed, so that neighbourhoods of objects near a given example object can be constructed. This can reveal clustering and projections onto smaller eigen-subspaces, which can make comparisons easier to interpret. Geodesic distances can be used as a natural dissimilarity metric over data described by probability distributions. Exploiting this property, we propose a new face recognition method that scores dissimilarities between face images by multiplying geodesic distance approximations between 3-variate RGB Gaussians representative of colour face images, and also by obtaining joint probabilities. The experimental results show that this new method achieves higher recognition rates than published comparative state-of-the-art methods.

7.
Feature selection and feature extraction are the most important steps in classification and regression systems. Feature selection is commonly used to reduce the dimensionality of datasets with tens or hundreds of thousands of features, which would otherwise be impossible to process further. A recent example is a quantitative structure-activity relationship (QSAR) dataset comprising 1226 features. A major problem in QSAR is the high dimensionality of the feature space; therefore, feature selection is the most important step in this study. This paper presents a novel feature selection algorithm based on entropy. The performance of the proposed algorithm is compared with that of a genetic algorithm method and a stepwise regression method. In a QSAR study using a multiple linear regression model, the root-mean-square errors of prediction on the training and test sets were 0.3433 and 0.3591 for the entropy method, 0.5500 and 0.4326 for the genetic algorithm, and 0.6373 and 0.6672 for stepwise regression, respectively.
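The paper's entropy algorithm is not spelled out in this abstract; the sketch below uses mutual information (an entropy-based relevance measure) to rank descriptors and evaluates a multiple linear regression on the top-ranked subset of a synthetic QSAR-like table, so the printed RMSEP has no relation to the figures quoted above.

```python
import numpy as np
from sklearn.feature_selection import mutual_info_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

# Synthetic stand-in for a QSAR table with many molecular descriptors.
rng = np.random.default_rng(3)
X = rng.standard_normal((200, 300))
y = 2 * X[:, 0] - X[:, 1] + 0.5 * X[:, 2] + 0.2 * rng.standard_normal(200)

Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)

# Entropy-based relevance: mutual information between each descriptor and y.
mi = mutual_info_regression(Xtr, ytr, random_state=0)
top = np.argsort(mi)[::-1][:10]                  # keep the 10 most informative

mlr = LinearRegression().fit(Xtr[:, top], ytr)
rmsep = np.sqrt(mean_squared_error(yte, mlr.predict(Xte[:, top])))
print("selected descriptors:", sorted(top), " RMSEP:", round(rmsep, 4))
```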

8.
In time-varying scientific datasets from simulations or experimental observations, scientists need to understand when and where interesting events occur. An event is a complex spatial and temporal pattern that unfolds over a course of timesteps and includes the involved features and interactions. Event detection allows scientists to query a time-varying dataset from a much smaller set of possible choices. However, with many events detected from a dataset, each spanning different time intervals, querying and visualizing these events pose a challenge. In this work, we propose a framework for the visualization of events in time-varying scientific datasets. Our method extracts features from the data, tracks features over time, and saves the evolution process of features in an event database, where a set of database operations is provided to model an event by defining the stages or individual steps that make up the event. Using the feature metadata and the event database, three types of event visualizations can be created to give unique insight into the dynamics of the data from temporal, spatial, and physical perspectives and to summarize multiple events or even the whole dataset. Three case studies demonstrate the usability and effectiveness of the proposed approach.

9.
Selection of biologically relevant genes from high-dimensional expression data is a key research problem in gene expression genomics. Most of the available gene selection methods are based on either a relevancy or a redundancy measure, which is usually judged through post-selection classification accuracy. In these methods, genes are ranked on a single high-dimensional expression dataset, which leads to the selection of spuriously associated and redundant genes. Hence, we developed a statistical approach that combines a support vector machine with Maximum Relevance and Minimum Redundancy under a sound statistical setup for the selection of biologically relevant genes. Here, genes are selected through statistical significance values computed using a nonparametric test statistic under a bootstrap-based subject sampling model. Further, a systematic and rigorous evaluation of the proposed approach against nine existing competitive methods was carried out on six different real crop gene expression datasets. This performance analysis was conducted under three comparison settings: subject classification, biological relevance criteria based on quantitative trait loci, and gene ontology. Our analytical results show that the proposed approach selects genes that are more biologically relevant than those selected by the existing methods, and that it compares favorably with the competing methods overall. The proposed statistical approach provides a framework for combining filter and wrapper methods of gene selection.

10.
Existing kernel-based correlation analysis methods mainly adopt a single kernel in each view. However, a single kernel is usually insufficient to characterize the nonlinear distribution information of a view. To solve this problem, we transform each original feature vector into a 2-dimensional feature matrix by means of kernel alignment, and then propose a novel kernel-aligned multi-view canonical correlation analysis (KAMCCA) method on the basis of these feature matrices. The proposed method can simultaneously employ multiple kernels to better capture the nonlinear distribution information of each view, so that the correlation features learned by KAMCCA have good discriminating power in real-world image recognition. Extensive experiments are designed on five real-world image datasets, including NIR face images, thermal face images, visible face images, handwritten digit images, and object images. Promising experimental results on these datasets demonstrate the effectiveness of the proposed method.

11.
In this paper, we propose a novel classification framework based on a single feature kernel matrix. Unlike traditional kernel matrices, which use all the features of the samples, this work builds a sub-kernel matrix from a single feature dimension of any two samples and sums all the sub-kernel matrices to obtain the single feature kernel matrix. We also use the single feature kernel matrix to build a new SVM classifier and adapt the SMO (Sequential Minimal Optimization) algorithm to train it. Experiments on several artificial datasets and some challenging public cancer datasets demonstrate the classification performance of the algorithm. Comparisons between our algorithm and the L2-norm SVM on the cancer datasets show that our algorithm achieves higher accuracy with fewer support vectors, indicating that the proposed framework is a more practical approach.
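A sketch of the single feature kernel matrix, assuming an RBF sub-kernel per dimension and using scikit-learn's SVC with a precomputed kernel in place of the authors' adapted SMO solver.

```python
import numpy as np
from sklearn.svm import SVC

def single_feature_kernel(XA, XB, gamma=1.0):
    """Sum of per-dimension sub-kernels: for every feature d, an RBF kernel is
    computed from that dimension alone, and all sub-kernel matrices are added."""
    K = np.zeros((XA.shape[0], XB.shape[0]))
    for d in range(XA.shape[1]):
        diff = XA[:, d:d + 1] - XB[:, d:d + 1].T   # pairwise differences in dim d
        K += np.exp(-gamma * diff ** 2)            # sub-kernel for dimension d
    return K

# Toy two-class problem.
rng = np.random.default_rng(4)
Xtr = np.r_[rng.normal(0, 1, (30, 8)), rng.normal(2, 1, (30, 8))]
ytr = np.r_[np.zeros(30), np.ones(30)]
Xte = np.r_[rng.normal(0, 1, (10, 8)), rng.normal(2, 1, (10, 8))]
yte = np.r_[np.zeros(10), np.ones(10)]

clf = SVC(kernel="precomputed").fit(single_feature_kernel(Xtr, Xtr), ytr)
acc = clf.score(single_feature_kernel(Xte, Xtr), yte)
print("test accuracy:", acc, " support vectors per class:", clf.n_support_)
```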

12.
The noise that is often unavoidable in synthetic aperture radar (SAR) images, such as speckle noise, negatively impacts their subsequent processing. Further, it is not easy to find appropriate applications for SAR images, given that the human visual system is sensitive to color and SAR images are grayscale. As a result, a noisy SAR image fusion method based on nonlocal matching and generative adversarial networks is presented in this paper. A nonlocal matching method is applied to group the source images into similar block groups in the pre-processing step. Then, adversarial networks are employed to generate the final noise-free fused SAR image blocks: the generator aims to generate a noise-free SAR image block with color information, while the discriminator tries to increase the spatial resolution of the generated block. This step ensures that each fused image block contains both high resolution and color information. Finally, the fused image is obtained by aggregating all the image blocks. Extensive comparative experiments on the SEN1-2 datasets and source images show that the proposed method not only produces better fusion results but is also robust to image noise, indicating the superiority of the proposed noisy SAR image fusion method over state-of-the-art methods.

13.
Yan Ouyang, Nong Sang, Rui Huang 《Optik》2013, 124(24): 6827-6833
Recently, sparse representation-based classification (SRC) has been successfully used to automatically recognize facial expressions, and it is well known for its ability to handle occlusion and corruption. Methods that use different features in conjunction with the SRC framework show state-of-the-art performance on clean and noisy facial expression images, so the choice of feature extraction greatly affects the success of facial expression recognition (FER) with SRC. In this paper, we adopt a new feature called the LBP map, generated using the local binary pattern (LBP) operator. It is not only robust to gray-scale variation but also extracts sufficient texture information for SRC to handle the FER problem. We then propose a new method that uses the LBP map in conjunction with the SRC framework. We first compare our method with state-of-the-art published work. Experiments on the Cohn-Kanade database show that LBP map + SRC reaches the highest accuracy with the lowest computation time on clean face images compared with methods that combine SRC with other features such as raw images, downsampled images, Eigenfaces, Laplacianfaces and Gabor features. We also evaluate LBP map + SRC on partially occluded and corrupted face images; the results show that this method is more robust to occlusion and corruption than existing SRC-based methods.
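A toy sketch of the LBP map + SRC pipeline, assuming scikit-image's local_binary_pattern for the LBP map and a Lasso solver as a stand-in for the l1 sparse-coding step of SRC; the synthetic "faces" and all parameter values are illustrative only.

```python
import numpy as np
from skimage.feature import local_binary_pattern
from sklearn.linear_model import Lasso

def lbp_map(img, P=8, R=1):
    """LBP map: every pixel is replaced by its local binary pattern code,
    keeping the spatial layout (no histogramming), then flattened."""
    return local_binary_pattern(img, P, R, method="default").ravel()

def src_predict(D, labels, x, alpha=1e-4):
    """Sparse representation-based classification: code x over the training
    dictionary D with an l1 penalty, then assign the class whose atoms give
    the smallest reconstruction residual."""
    coef = Lasso(alpha=alpha, max_iter=10000).fit(D.T, x).coef_
    best, best_res = None, np.inf
    for c in np.unique(labels):
        part = np.zeros_like(coef)
        part[labels == c] = coef[labels == c]       # keep only class-c coefficients
        res = np.linalg.norm(x - D.T @ part)
        if res < best_res:
            best, best_res = c, res
    return best

# Toy "faces": two classes of 16x16 noise images with different bright patches.
rng = np.random.default_rng(5)
def make_face(c):
    img = rng.random((16, 16))
    img[2:6, 2:6] += (1.0 if c == 0 else 0.0)
    img[10:14, 10:14] += (0.0 if c == 0 else 1.0)
    return (img * 127).astype(np.uint8)

train_labels = np.repeat([0, 1], 10)
D = np.array([lbp_map(make_face(c)) for c in train_labels])   # dictionary rows
D = D / np.linalg.norm(D, axis=1, keepdims=True)
x = lbp_map(make_face(1)); x = x / np.linalg.norm(x)
print("predicted class:", src_predict(D, train_labels, x))
```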

14.
Imbalanced ensemble classification is one of the most essential and practical strategies for improving decision performance in data analysis. A growing body of literature on ensemble techniques for imbalance learning has appeared in recent years, with various extensions of imbalanced classification methods established from different points of view. The present study reviews state-of-the-art ensemble classification algorithms for dealing with imbalanced datasets, offering a comprehensive analysis of incorporating dynamic selection of base classifiers into classification. By running 14 existing ensemble algorithms with dynamic selection on 56 datasets, the experimental results reveal that classical algorithms combined with a dynamic selection strategy deliver a practical way to improve classification performance for both binary and multi-class imbalanced datasets. In addition, by combining patch learning with dynamic-selection ensemble classification, a patch-ensemble classification method is designed, which uses the misclassified samples to train patch classifiers and thereby increases the diversity of the base classifiers. The experimental results indicate that the designed method is promising for multi-class imbalanced classification.
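A generic sketch of dynamic classifier selection in its simplest overall-local-accuracy form (for each test sample, pick the base classifier that is most accurate on its k nearest validation neighbours); this illustrates the dynamic selection strategy discussed above, not the paper's patch-ensemble method.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split
from sklearn.neighbors import NearestNeighbors

# Train a pool of base classifiers on an imbalanced dataset.
X, y = make_classification(n_samples=600, weights=[0.9, 0.1], random_state=0)
Xtr, Xtmp, ytr, ytmp = train_test_split(X, y, test_size=0.5, stratify=y, random_state=0)
Xval, Xte, yval, yte = train_test_split(Xtmp, ytmp, test_size=0.5, stratify=ytmp, random_state=0)
pool = BaggingClassifier(DecisionTreeClassifier(), n_estimators=20, random_state=0).fit(Xtr, ytr)

# Dynamic selection: for each test point, use the base classifier that performs
# best on its k nearest validation neighbours (overall local accuracy).
nn = NearestNeighbors(n_neighbors=7).fit(Xval)
neigh = nn.kneighbors(Xte, return_distance=False)
val_preds = np.array([est.predict(Xval) for est in pool.estimators_])   # (n_est, n_val)
test_preds = np.array([est.predict(Xte) for est in pool.estimators_])   # (n_est, n_test)

y_hat = np.empty(len(Xte), dtype=int)
for i, idx in enumerate(neigh):
    local_acc = (val_preds[:, idx] == yval[idx]).mean(axis=1)   # accuracy per base classifier
    y_hat[i] = test_preds[np.argmax(local_acc), i]
print("dynamic-selection accuracy:", (y_hat == yte).mean())
```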

15.
This paper studies an intelligent reflecting surface (IRS) aided mobile edge computing (MEC) network, in which a direct link can assist task transmission for computing with the help of multiple reflecting elements in the IRS. We evaluate performance by investigating the impact of the direct link on the outage probability. Specifically, we first analyze the system outage probability (SOP) for different numbers of reflecting elements and energy consumption constraints. Moreover, we propose two selection methods for the case of multiple reflecting elements: Method I maximizes the first-hop reflecting channel, while Method II maximizes the dual-hop product channel. For these two methods, we then estimate the outage probability of the system using the reflecting channel information and provide analytical expressions for the outage probability. Finally, numerical results verify the correctness of our analysis and show that increasing the number of reflecting elements can effectively reduce the SOP.
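A Monte Carlo sketch of the two element-selection rules described above, under an illustrative Rayleigh-fading model with a coherently added direct link; the channel model, SNR threshold and normalizations are assumptions, not the paper's system model.

```python
import numpy as np

def outage_probability(n_elements, snr_db=10.0, rate=1.0, n_trials=200_000, seed=0):
    """Monte Carlo outage probability for an IRS-aided link with a direct path.
    Method I picks the element with the strongest first-hop channel;
    Method II picks the element with the strongest dual-hop product channel."""
    rng = np.random.default_rng(seed)
    snr = 10 ** (snr_db / 10)
    threshold = 2 ** rate - 1                      # outage if received SNR < threshold
    # Rayleigh-fading channel amplitudes (illustrative model).
    h_d = rng.rayleigh(scale=1 / np.sqrt(2), size=n_trials)                  # direct link
    h_1 = rng.rayleigh(scale=1 / np.sqrt(2), size=(n_trials, n_elements))    # source -> IRS
    h_2 = rng.rayleigh(scale=1 / np.sqrt(2), size=(n_trials, n_elements))    # IRS -> destination

    sel_1 = np.argmax(h_1, axis=1)                 # Method I: best first hop
    sel_2 = np.argmax(h_1 * h_2, axis=1)           # Method II: best product channel
    rows = np.arange(n_trials)
    gain_1 = (h_d + h_1[rows, sel_1] * h_2[rows, sel_1]) ** 2
    gain_2 = (h_d + h_1[rows, sel_2] * h_2[rows, sel_2]) ** 2
    return (snr * gain_1 < threshold).mean(), (snr * gain_2 < threshold).mean()

for n in (1, 2, 4, 8):
    p1, p2 = outage_probability(n)
    print(f"N={n}: Method I outage={p1:.4f}, Method II outage={p2:.4f}")
```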

16.
With the widespread use of intelligent information systems, massive amounts of data containing many irrelevant, noisy, and redundant features are collected, and these features must be handled. Therefore, introducing an efficient feature selection (FS) approach is a challenging aim. In the recent decade, various artificial methods and swarm models inspired by biological and social systems have been proposed to solve different problems, including FS. In this paper, an innovative approach is proposed based on a hybrid integration of two intelligent algorithms, electric fish optimization (EFO) and the arithmetic optimization algorithm (AOA), which boosts the exploration stage of EFO so that high-dimensional FS problems can be processed with remarkable convergence speed. The proposed EFOAOA is examined on eighteen datasets from different real-life applications. The EFOAOA results are compared with a set of recent state-of-the-art optimizers using a set of statistical metrics and the Friedman test. The comparisons show the positive impact of integrating the AOA operator into EFO, as the proposed EFOAOA can identify the most important features with high accuracy and efficiency: compared with the other FS methods, it obtained the lowest number of features and the highest accuracy on 50% and 67% of the datasets, respectively.

17.
何群  王煜文  杜硕  陈晓玲  谢平 《物理学报》2018,67(11):118701-118701
Improving the recognition rate of motor imagery patterns is of great significance for the application of brain-computer interface (BCI) technology. In this paper, an adaptive parameterless empirical wavelet transform (APEWT) is combined with a selective ensemble classification model to improve the classification accuracy of electroencephalogram (EEG) signals. First, the EEG signal is decomposed into different modes by APEWT; then, energy spectrum (ES) features are computed from the signal reconstructed from the optimal modes, and marginal spectrum (MS) features are computed from the optimal mode components; finally, the ES features of different time segments and the MS features of different frequency bands are fed into the constructed selective ensemble classification model to obtain the classification results, and the method is compared with four other combined methods. The experimental results show that the proposed method achieves good classification accuracy and real-time performance: its average classification accuracy is higher than that of the other four methods, and it also compares favorably with recent studies using the same data. This work provides a new method and perspective for online motor-imagery BCI applications.

18.
In order to increase classification accuracy, a new feature selection method, RFFIM-PCA, based on the random forest feature importance measure (RFFIM) and principal component analysis (PCA), is presented in this paper for analyzing near-infrared (NIR) spectra of tobacco. We applied the method to the classification task of qualitative cigarette evaluation and compared it with other methods. The results show that RFFIM-PCA discriminates high-dimensional data effectively and can be used to identify cigarette quality. The feature selection step filters out noise, while PCA eliminates redundant features and reduces the dimensionality. The experimental results show that RFFIM-PCA successfully eliminates the noise and redundant features in high-dimensional data, leading to a promising improvement in feature selection and classification accuracy.
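A sketch of an RFFIM-PCA style pipeline, assuming scikit-learn's random forest importance as the RFFIM step, followed by PCA and an SVM on the retained variables; synthetic data stands in for the tobacco NIR spectra.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.decomposition import PCA
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

# Synthetic stand-in for NIR spectra: 300 samples x 400 wavelengths, 3 grades.
rng = np.random.default_rng(6)
X = rng.standard_normal((300, 400))
y = rng.integers(0, 3, size=300)
for c in range(3):
    X[y == c, 50 + 10 * c: 60 + 10 * c] += 1.0      # class-specific bands

Xtr, Xte, ytr, yte = train_test_split(X, y, stratify=y, random_state=0)

# Step 1 (RFFIM): keep the wavelengths with the highest random-forest importance.
rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(Xtr, ytr)
keep = np.argsort(rf.feature_importances_)[::-1][:40]

# Step 2 (PCA) plus a classifier on the reduced, de-noised representation.
model = make_pipeline(PCA(n_components=10), SVC()).fit(Xtr[:, keep], ytr)
print("retained wavelengths (first 10):", np.sort(keep)[:10])
print("test accuracy:", model.score(Xte[:, keep], yte))
```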

19.
A study on the selection of spectral preprocessing methods
Spectral signals of complex samples are often disturbed by stray light, noise, baseline drift and other factors, which affect the final qualitative and quantitative results, so the raw spectra usually need to be preprocessed before modeling. Many spectral preprocessing methods exist, and finding a suitable one is a thorny problem. One approach is to choose a preprocessing method by inspecting the characteristics of the spectral signal (visual inspection); another is to choose it according to the resulting modeling performance (trial-and-error strategy). The former requires no modeling and is more interpretable, but the analyst's subjectivity can sometimes lead to wrong choices; the latter requires no inspection of the spectra, but a large number of preprocessing methods must be examined, which is time-consuming for large datasets. It is therefore worth examining which selection strategy is more scientific and reasonable. In this study, nine datasets and 120 combinations of ten preprocessing methods are used to investigate whether preprocessing is necessary and how to select the preprocessing method. First, the number of partial least squares (PLS) factors, the window parameters of the first derivative, second derivative and Savitzky-Golay (SG) smoothing, and the wavelet function and decomposition scale of the continuous wavelet transform (CWT) are optimized. Then, no preprocessing and ten preprocessing methods (first derivative, second derivative, CWT, multiplicative scatter correction (MSC), standard normal variate (SNV), SG smoothing, mean centering, Pareto scaling, min-max normalization and autoscaling) are combined in the order of background correction, scatter correction, smoothing and scaling, yielding 120 preprocessing methods and combinations. Finally, the 120 preprocessing schemes are applied to the different datasets and to different components of the same dataset, and the characteristics of the spectral signals and the root-mean-square error of prediction (RMSEP) of the PLS models built after preprocessing are analyzed. The results show that the best preprocessing method can be chosen more reliably from the modeling performance between the spectra and the predicted component than from visual inspection of the spectral signal. For most datasets, a suitable preprocessing method improves the modeling performance; for different datasets, the best preprocessing method differs because the information content and complexity of the data differ; and for the same dataset, even though the spectra are identical, the best preprocessing methods for different components are not the same. Therefore, there is no universally optimal preprocessing method: the best choice depends not only on the spectra but also on the component to be predicted. Grouping the existing preprocessing methods by purpose and then combining them by permutation is an effective way to select the best preprocessing method.
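A much-reduced sketch of the trial-and-error strategy described above: a few preprocessing options (a small subset of the ten used in the study) are enumerated, a PLS model is fitted for each combination, and the RMSEP on a held-out set is compared; the synthetic spectra and parameter choices are illustrative assumptions.

```python
import numpy as np
from itertools import product
from scipy.signal import savgol_filter
from sklearn.cross_decomposition import PLSRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

# A few representative preprocessing steps (subset of the ten in the study).
def snv(X):        # standard normal variate: row-wise centering and scaling
    return (X - X.mean(axis=1, keepdims=True)) / X.std(axis=1, keepdims=True)
def first_deriv(X):
    return savgol_filter(X, window_length=11, polyorder=2, deriv=1, axis=1)
identity = lambda X: X

scatter_opts = {"none": identity, "SNV": snv}
deriv_opts = {"none": identity, "1st derivative": first_deriv}

# Synthetic spectra with baseline drift and multiplicative scatter.
rng = np.random.default_rng(7)
wl = np.linspace(0, 1, 200)
conc = rng.random(120)
X = (conc[:, None] * np.exp(-(wl - 0.5) ** 2 / 0.01)     # analyte band
     + rng.random((120, 1)) * wl                         # baseline drift
     ) * (1 + 0.2 * rng.random((120, 1)))                # multiplicative scatter
Xtr, Xte, ytr, yte = train_test_split(X, conc, random_state=0)

# Trial-and-error: evaluate every preprocessing combination by its RMSEP.
for (s_name, s), (d_name, d) in product(scatter_opts.items(), deriv_opts.items()):
    pls = PLSRegression(n_components=4).fit(d(s(Xtr)), ytr)
    rmsep = np.sqrt(mean_squared_error(yte, pls.predict(d(s(Xte))).ravel()))
    print(f"{s_name:>4} + {d_name:>14}: RMSEP = {rmsep:.4f}")
```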

20.
Anisotropic diffusion (AD) has proven to be very effective for denoising magnetic resonance (MR) images. The result of AD filtering is highly dependent on several parameters, especially the conductance parameter, yet there is no automatic method to select the optimal parameter values. This paper presents a general strategy for AD filtering of MR images using an automatic parameter selection method. The basic idea is to estimate the parameters through an optimization step on a synthetic image model, which differs from traditional analytical methods. This approach can easily be applied to more sophisticated diffusion models for better denoising results. We conducted a systematic study of parameter selection for the AD filter, including the dynamic parameter decreasing rate, the parameter selection range for different noise levels, and the influence of image contrast on parameter selection. The proposed approach was validated using both simulated and real MR images. The model image generated using our approach was shown to be highly suitable for parameter optimization. The results confirm that our method outperforms most state-of-the-art methods in both quantitative measurement and visual evaluation. Testing on real images with different noise levels demonstrates that our method is sufficiently general to be applied to a variety of MR images.
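A minimal Perona-Malik anisotropic diffusion sketch showing the role of the conductance parameter; the automatic parameter-optimization step that is the paper's contribution is not reproduced here.

```python
import numpy as np

def anisotropic_diffusion(img, n_iter=20, kappa=30.0, gamma=0.2):
    """Perona-Malik anisotropic diffusion. The conductance parameter `kappa`
    controls how strongly edges (large gradients) block the smoothing."""
    u = img.astype(float).copy()
    for _ in range(n_iter):
        # Finite-difference gradients toward the four neighbours.
        dn = np.roll(u, -1, axis=0) - u
        ds = np.roll(u, 1, axis=0) - u
        de = np.roll(u, -1, axis=1) - u
        dw = np.roll(u, 1, axis=1) - u
        # Exponential conductance: small near edges, close to 1 in flat regions.
        c = lambda d: np.exp(-(d / kappa) ** 2)
        u += gamma * (c(dn) * dn + c(ds) * ds + c(de) * de + c(dw) * dw)
    return u

# Noisy synthetic "MR slice": a bright square on a dark background.
rng = np.random.default_rng(8)
img = np.zeros((64, 64)); img[16:48, 16:48] = 100.0
noisy = img + 10.0 * rng.standard_normal(img.shape)
denoised = anisotropic_diffusion(noisy, kappa=30.0)
print("flat-region noise std before:", noisy[:10, :10].std().round(2),
      "after:", denoised[:10, :10].std().round(2))
```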
