Similar literature
20 similar documents found (search time: 15 ms)
1.
In this paper, a novel feature selection algorithm for inference from high-dimensional data (FASTENER) is presented. With its multi-objective approach, the algorithm tries to maximize the accuracy of a machine learning algorithm with as few features as possible. The algorithm exploits entropy-based measures, such as mutual information, in the crossover phase of the iterative genetic approach. FASTENER converges to a (near) optimal subset of features faster than other multi-objective wrapper methods, such as POSS, DT-forward and FS-SDS, and achieves better classification accuracy than the similarity-based and information-theory-based methods currently used in earth observation scenarios. The approach was primarily evaluated using the earth observation data set for land-cover classification from ESA's Sentinel-2 mission, the digital elevation model and the ground truth data of the Land Parcel Identification System from Slovenia. For land-cover classification, the algorithm gives state-of-the-art results. Additionally, FASTENER was tested on open feature selection data sets and compared with state-of-the-art methods. With fewer model evaluations, the algorithm yields results comparable to DT-forward and is superior to FS-SDS. FASTENER can be used in any supervised machine learning scenario.
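The abstract names mutual information as one of the entropy-based measures FASTENER uses. The following is a minimal sketch of how mutual information between a discrete feature and a class label can be estimated from counts; it is not FASTENER's implementation, and the function names `mutual_information` and `rank_by_relevance` are hypothetical helpers introduced only for illustration.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Estimate I(X;Y) in bits from two equal-length discrete sequences."""
    if len(xs) != len(ys) or len(xs) == 0:
        raise ValueError("xs and ys must be non-empty and of equal length")
    n = len(xs)
    count_x = Counter(xs)
    count_y = Counter(ys)
    count_xy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in count_xy.items():
        # p(x,y) * log2(p(x,y) / (p(x) * p(y))), written with raw counts
        mi += (c / n) * math.log2(c * n / (count_x[x] * count_y[y]))
    return mi


def rank_by_relevance(columns, labels):
    """Return feature indices sorted from most to least informative.

    `columns` is a list of feature columns (hypothetical layout),
    each a sequence of discrete values aligned with `labels`.
    """
    scores = [mutual_information(col, labels) for col in columns]
    return sorted(range(len(columns)), key=lambda i: -scores[i])
```

A feature that determines a balanced binary label carries 1 bit about it, while a statistically independent feature carries 0 bits; continuous features would need to be discretized first, which this sketch does not do.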

2.
With the rapid growth of the Internet, the curse of dimensionality caused by massive multi-label data has attracted extensive attention. Feature selection plays an indispensable role in dimensionality reduction, and many researchers have studied it from the perspective of information theory. Here, to evaluate feature relevance, a novel feature relevance term (FR) is designed that employs three incremental information terms to consider three key aspects comprehensively: candidate features, selected features, and label correlations. Examining all three aspects together makes it easier to capture the optimal features. Moreover, label-related feature redundancy is employed as the label-related feature redundancy term (LR) to reduce unnecessary redundancy. A multi-label feature selection method that integrates FR with LR is then proposed, namely Feature Selection combining three types of Conditional Relevance (TCRFS). Numerous experiments indicate that TCRFS outperforms six other state-of-the-art multi-label approaches on 13 multi-label benchmark data sets from 4 domains.

3.
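Spectral matching algorithm based on support vector machine (SVM) feature weighting/selection   Cited 2 times in total (1 self-citation, 1 citation by others)
Hyperspectral data contain many bands with high redundancy, so dimensionality reduction is a key step for improving the efficiency and accuracy of data analysis. Building on the work in Ref. [18], this paper introduces an iterative SVM feature selection/weighting algorithm that gives multi-objective genetic optimization a low-dimensional space containing effective classification information in which to obtain the optimal reference spectra. Experiments on the Indiana-AVIRIS hyperspectral data show that introducing feature weighting/selection improves spectral matching classification accuracy by 13% relative to the case without feature selection. The paper also defines and computes local weights according to the distance of each spectral sample from the SVM decision surface. This not only characterizes in detail how same-class spectral samples are distributed in the local feature space but also makes the spectral similarity computation more flexible, raising accuracy by 17% relative to the case without feature selection. The proposed method extends the depth and breadth of SVM applications in spectral data analysis.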

4.
The world is still facing COVID-19 (coronavirus disease 2019), which is classified as a highly infectious disease because of its rapid spread. A shortage of X-ray machines may lead to critical situations and delay diagnosis, increasing the number of deaths. Deep learning (DL) and optimization algorithms can therefore be advantageous for early diagnosis and COVID-19 detection. In this paper, we propose a framework for COVID-19 image classification that hybridizes DL and swarm-based algorithms. MobileNetV3 serves as the DL backbone feature extractor that learns and extracts relevant image representations. The Aquila Optimizer (Aqu), a swarm-based algorithm, is used as a feature selector to reduce the dimensionality of the image representations and improve classification accuracy using only the most essential features. To validate the proposed framework, two datasets of X-ray and CT COVID-19 images are used. The experimental results show good performance in terms of classification accuracy and dimensionality reduction during the feature extraction and selection phases. The Aqu feature selection algorithm achieves better accuracy than the other methods compared.

5.
逄岩  许枫  刘佳  李益丞  赵越 《声学学报》2023,48(1):83-92
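To address the classification of seabed sediment data of complex types acquired by side-scan sonar, a data self-driven classification method that combines feature selection with an improved Stacking model is proposed. The method first applies the ReliefF algorithm to the multi-domain features of the seabed scattering data to extract effective low-dimensional features, then combines the artificial fish swarm algorithm with the Stacking model to form an improved ensemble learning classifier that performs the seabed sediment classification. Results on data collected at sea show that the method can classify multiple seabed sediment types, with classification accuracy, Kappa coefficient, and F1-score reaching 85.55%, 0.857, and 0.887, respectively, which demonstrates the effectiveness of the method.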
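The abstract reports classification accuracy together with a Kappa coefficient. As a hedged illustration of how these two metrics relate, the sketch below computes overall accuracy and Cohen's kappa from a confusion matrix; the function name and the matrix layout (rows are true classes, columns are predicted classes) are assumptions made for this example, and the paper's own evaluation code is not shown in the source.

```python
def accuracy_and_kappa(confusion):
    """Return (overall accuracy, Cohen's kappa) for a square confusion matrix.

    confusion[i][j] is assumed to hold the number of samples of true
    class i that were predicted as class j.
    """
    k = len(confusion)
    n = sum(sum(row) for row in confusion)
    if n == 0:
        raise ValueError("confusion matrix is empty")
    # Observed agreement: fraction of samples on the diagonal.
    observed = sum(confusion[i][i] for i in range(k)) / n
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(k)) for j in range(k)]
    # Agreement expected by chance from the marginal totals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa
```

Kappa discounts the agreement that would be expected by chance, so it is a stricter summary than raw accuracy when class frequencies are unbalanced.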

6.
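To solve the problem of source-printer authentication for printed documents, a source-printer authentication method based on statistical texture feature selection is proposed. Taking into account both the spatial-domain and the time-frequency-domain characteristics of printed character images, GLCM and DWT statistical texture features are combined; the ReliefF algorithm performs the initial selection of the combined features, and the SVM-RFE algorithm performs the second-stage selection. The experimental results show classification accuracies of 95.20% on an English sample set with repeated identical characters and 75.00% on a Chinese sample set of distinct, non-repeated characters; feature combination and feature selection help improve the classification performance of source-printer authentication.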

7.
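Hyperspectral images have hundreds of contiguous, narrow spectral bands spanning the visible to the infrared and provide fine spectral attributes of ground objects, which is of great value for identifying and classifying object materials and properties. Selecting a limited number of spectral bands for transmission and processing according to the targets of interest is important both for improving the timeliness of hyperspectral data processing and for designing practical, application-specific spectrometers. Selecting the optimal bands according to target features is therefore necessary to improve processing efficiency while preserving target recognition or classification accuracy, and choosing a band subset with good discriminative power from hyperspectral images with hundreds of dimensions is an urgent problem. A hyperspectral band selection method based on an improved particle swarm optimization (PSO) algorithm is proposed. Unlike conventional PSO, the method introduces a "probabilistic jump" property and an elimination mechanism that discards "stagnant" new solutions, which improves the global search performance of the algorithm. An objective function for optimal band selection is then formulated from the target spectral features, the improved PSO algorithm solves it, and the selected band subset is fed to a support vector machine (SVM) for classification. Classification tests on band subsets selected from two standard hyperspectral data sets (Indian Pines, Salinas) show that the method achieves higher classification accuracy than existing methods. Among the methods compared, the bands selected by conventional PSO perform worst and those selected by the proposed algorithm perform best, with classification accuracies of 98.1414% and 99.0848% on the two data sets, respectively.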

8.
The fully completed BDS-3 short-message communication system, known as the short-message satellite communication system (SMSCS), will be widely used in traditional communication blind spots in the future. However, short-message processing resources on short-message satellites are relatively scarce. To improve the resource utilization of satellite systems and ensure adequate service quality for short-message terminals, it is necessary to allocate and schedule short-message satellite processing resources in multi-satellite coverage areas. To solve these problems, a short-message satellite resource allocation algorithm based on deep reinforcement learning (DRL-SRA) is proposed. First, using the characteristics of the SMSCS, a multi-objective joint optimization model for satellite resource allocation is established to reduce the path transmission loss of short-message terminals and to achieve satellite load balancing and an adequate quality of service. Then, the dimensionality of the input data is reduced using a region division strategy and a feature extraction network. The continuous state space is parameterized with a deep reinforcement learning algorithm based on the deep deterministic policy gradient (DDPG) framework. The simulation results show that the proposed algorithm can reduce the path transmission loss of short-message terminals, improve the quality of service, and increase the resource utilization efficiency of the short-message satellite system while ensuring an appropriate satellite load balance.

9.
陈涵瀛  高璞珍  谭思超  付学宽 《物理学报》2014,63(20):200505-200505
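The extreme learning machine (ELM) is a recently proposed training algorithm for single-hidden-layer feedforward neural networks. It trains quickly and does not become trapped in local optima, but its performance is affected by the randomly chosen input weights and thresholds. To address this, an improved ELM based on multi-objective optimization is proposed: the training error and the mean square of the output-layer weights are minimized simultaneously as optimization objectives, and the fast elitist non-dominated sorting genetic algorithm is used to optimize the weights and thresholds from the input layer to the hidden layer. The algorithm is applied to multi-step rolling prediction of irregular complex flow oscillations in a natural circulation system under rolling motion, and the influence of the training error and the output-layer weights on prediction at different step lengths is analyzed. Simulation results show that the optimized ELM achieves good generalization with a relatively small network. This provides a reasonably accurate approach to real-time prediction of flow instability, and the predictions can serve as a reference for operators of nuclear power systems.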

10.
马宗方  程咏梅  潘泉  王慧琴 《光子学报》2014,40(8):1220-1224
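Common image-based flame detection algorithms extract a single feature of the flame in the image, or an effective combination of such features, as the basis for recognition. They require many training samples for learning and parameter optimization, and their recognition rate depends heavily on feature selection. Considering the overall characteristics of flames, this paper proposes an image-based fire detection method that combines a color model with a sparse representation model. First, a color model built in the HIS space preprocesses the fire image to extract suspect regions. A sparse representation model is then established, and principal component analysis is used to construct feature dictionaries of flames and flame-like objects. Finally, ℓ1-minimization computes the minimum approximation residual between test and training samples to classify flames and interfering objects. Experimental results show that the method improves the classification accuracy and recognition speed for fire images while maintaining a high accuracy rate.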

11.
The university curriculum is a systematic, organic whole in which courses build directly on one another; how well a student learns each semester's courses at the outset is crucial and significantly affects later courses and further study. However, the low teacher–student ratio makes it difficult for teachers to consistently follow the detailed learning situation of individual students. A learning early-warning system aims to detect automatically, before a course starts, whether students are likely to have difficulties or even risk failing. Previous related research has three problems: first, it focused mainly on e-learning platforms and relied on online activity data, which is unsuitable for traditional teaching scenarios; second, most current methods can offer predictions only while the course is in progress or even nearing its end; third, few studies have addressed feature redundancy in these learning data. Targeting the traditional classroom scenario, this paper formulates pre-class student performance prediction as a multi-label learning problem and uses attribute reduction to streamline the feature information of the courses already taken and to explore the relationship between the characteristics of previously taken courses and the attributes of the courses to be taken, in order to detect high-risk students in each course before it begins. Extensive experiments on 10 real-world datasets show that the proposed approach outperforms most other advanced methods on multi-label classification evaluation metrics.

12.
周阳  周炎  周桃  任卉  石玲玲 《应用声学》2017,25(7):294-297
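With the rapid development of information technology, the ways of describing information features and their scope keep expanding, and high-dimensional features are emerging in large numbers. These features may include many irrelevant and redundant ones, causing the curse of dimensionality and placing higher demands on classification and recognition algorithms; feature selection is needed to reduce the dimensionality of the feature vector and suppress noise in the data. To address the curse of dimensionality introduced by high-dimensional feature vectors in target classification and recognition, the standard sequential floating forward selection (SFFS) algorithm is studied, and an improved SFFS algorithm is proposed that optimizes the upper confidence bound on the number of correctly classified samples and the number of cross-validation repetitions. Simulation experiments show that, when a Bayesian classifier is used for recognition, the improved algorithm effectively speeds up feature selection while preserving recognition accuracy, and that as the number of selection steps grows it maintains a more convergent and stable confidence interval, indicating good accuracy.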
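For readers unfamiliar with the baseline, the standard sequential floating forward selection procedure that the abstract starts from can be sketched as follows. This is the textbook algorithm only, not the improved variant proposed in the paper: the confidence-bound and cross-validation refinements are not reproduced, and `score` is a hypothetical user-supplied function (for example, cross-validated accuracy of some classifier) that maps a list of feature indices to a number where higher is better.

```python
def sffs(n_features, score, target_size):
    """Standard SFFS sketch: returns (best subset of target_size, its score)."""
    if not 1 <= target_size <= n_features:
        raise ValueError("target_size must be between 1 and n_features")
    selected = []
    best_by_size = {}  # subset size -> (score, subset) best seen so far
    while len(selected) < target_size:
        # Forward step: add the single feature that helps the most.
        candidates = [f for f in range(n_features) if f not in selected]
        best_f = max(candidates, key=lambda f: score(selected + [f]))
        selected = selected + [best_f]
        s = score(selected)
        k = len(selected)
        if k not in best_by_size or s > best_by_size[k][0]:
            best_by_size[k] = (s, list(selected))
        # Floating step: drop features while that beats the best known
        # subset of the smaller size (strict improvement guarantees
        # termination).
        while len(selected) > 2:
            drop = max(
                selected,
                key=lambda f: score([g for g in selected if g != f]),
            )
            reduced = [g for g in selected if g != drop]
            s_reduced = score(reduced)
            if s_reduced > best_by_size[len(reduced)][0]:
                selected = reduced
                best_by_size[len(reduced)] = (s_reduced, list(reduced))
            else:
                break
    best_score, best_subset = best_by_size[target_size]
    return best_subset, best_score
```

The floating step is what distinguishes SFFS from plain greedy forward selection: a feature that looked best early on can be discarded later if a smaller subset without it scores higher.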

13.
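Hyperspectral images have contiguous bands, high dimensionality, large data volume, and strong correlation between adjacent bands, and can provide richer detail for land-cover classification. However, the data contain a great deal of redundant information and noise. Using all band features directly in image classification without effective analysis and selection leads to low computational efficiency and high computational complexity, and classification accuracy may first rise and then fall as the band dimensionality grows (the Hughes phenomenon). To quickly extract a feature subset with good discriminative power from hyperspectral images with tens or even hundreds of bands, and thereby avoid the curse of dimensionality, the filter-type ReliefF algorithm and the wrapper-type recursive feature elimination (RFE) algorithm are combined into the ReliefF-RFE feature selection algorithm for hyperspectral image classification. Using a weight threshold, the algorithm first applies ReliefF to quickly discard a large number of irrelevant features, narrowing and refining the candidate feature subset. It then applies RFE to search further for the optimal subset, recursively screening out features in the narrowed subset that are redundant or weakly related to the classifier, to obtain the feature subset with the best classification performance. Three standard data sets (Indian Pines, Salinas-A, and KSC) are used as experimental data, and ReliefF-RFE is compared with ReliefF and RFE. Across the three data sets, hyperspectral image classification with ReliefF-RFE reaches an average overall accuracy (OA) of 92.94%, an F-measure of 92.81%, and a Kappa coefficient of 91.94%; the average feature dimensionality of ReliefF-RFE is 37% of that of ReliefF, and its average run time is 75% of that of RFE. This shows that ReliefF-RFE preserves classification accuracy while overcoming both the inability of filter-type ReliefF to reduce redundancy among features and the high time complexity of wrapper-type RFE. It offers a more balanced overall performance and is suitable for feature selection in hyperspectral image classification.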
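The filter stage described above (weight the features, then drop those below a threshold) can be illustrated with a simplified sketch. Note the assumptions: this is basic Relief with a single nearest hit and nearest miss per sample and a deterministic pass over every sample, not the ReliefF variant used in the paper, and the RFE stage is not shown. The function names and the Manhattan distance on range-normalized features are choices made for this example.

```python
def relief_weights(X, y):
    """Basic Relief feature weights (simplified; not ReliefF).

    X is a list of equal-length numeric feature tuples, y the class labels.
    Every class needs at least two samples, and at least two classes
    must be present.
    """
    n = len(X)
    d = len(X[0])
    # Normalize per-feature differences by the feature's value range.
    spans = []
    for f in range(d):
        column = [row[f] for row in X]
        span = max(column) - min(column)
        spans.append(span if span > 0 else 1.0)

    def diff(a, b, f):
        return abs(a[f] - b[f]) / spans[f]

    def distance(a, b):
        return sum(diff(a, b, f) for f in range(d))

    weights = [0.0] * d
    for i in range(n):
        hits = [j for j in range(n) if j != i and y[j] == y[i]]
        misses = [j for j in range(n) if y[j] != y[i]]
        if not hits or not misses:
            raise ValueError("each sample needs a same-class and an other-class neighbor")
        hit = min(hits, key=lambda j: distance(X[i], X[j]))
        miss = min(misses, key=lambda j: distance(X[i], X[j]))
        for f in range(d):
            # Reward features that separate classes, penalize features
            # that vary within a class.
            weights[f] += (diff(X[i], X[miss], f) - diff(X[i], X[hit], f)) / n
    return weights


def filter_by_threshold(weights, threshold):
    """Keep the indices of features whose weight exceeds the threshold."""
    return [f for f, w in enumerate(weights) if w > threshold]
```

In a two-stage scheme like the one in the abstract, the indices returned by `filter_by_threshold` would then be handed to a wrapper method such as RFE, which is more expensive per feature and therefore benefits from the reduced candidate set.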

14.
赵春晖  李彤  冯收 《光子学报》2021,50(3):148-158
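To address the inability of conventional hyperspectral image classification algorithms to handle spectral shift between different images, a hyperspectral image classification algorithm based on dense convolution and domain adaptation is proposed. Deep features are first learned in the source domain with dense convolution, and domain adaptation is then applied to transfer them to the target domain. Current domain-adaptive hyperspectral image classification frameworks commonly use convolutional neural networks for feature learning, but as depth increases, vanishing gradients can reduce classification accuracy...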

15.
The time complexity of the adaptive mean shift depends on the dimension of the data and the number of iterations, and the amount of computation becomes prohibitive as the data dimension increases. An approximate neighborhood query method is presented for high-dimensional data, in which locality-sensitive hashing (LSH) is used to reduce the computational complexity of the adaptive mean shift algorithm. Data-driven bandwidth selection for multivariate data is used in the mean shift procedure, and an adaptive mean shift algorithm based on LSH with bandwidth estimation (LSH-PE-AMS) is proposed. Experimental results show that the proposed algorithm reduces the complexity of the adaptive mean shift algorithm and produces a more accurate classification than the fixed-bandwidth mean shift algorithm.

16.
牛丽红  倪国强 《光学技术》2005,31(3):420-423
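Target features from multiple sensors are often high-dimensional and contain more redundant information and noise. To reduce the cost of data acquisition and improve the performance and efficiency of the target recognizer, a feature optimization method based on a genetic algorithm (GA) is proposed for multi-sensor target recognition systems. The GA is coupled with a neural network target classifier, and feedback from the recognition results steers the direction of genetic evolution to optimize the features. To overcome premature convergence of the GA, an improved GA that combines correlation-based selection with adaptive genetic operators is proposed. Simulation results verify the effectiveness of the method.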

17.
Feature selection and feature extraction are the most important steps in classification and regression systems. Feature selection is commonly used to reduce the dimensionality of datasets with tens or hundreds of thousands of features, which would otherwise be impossible to process further. A recent example is a quantitative structure–activity relationship (QSAR) dataset with 1226 features. A major problem in QSAR is the high dimensionality of the feature space; therefore, feature selection is the most important step in this study. This paper presents a novel feature selection algorithm based on entropy. The performance of the proposed algorithm is compared with that of a genetic algorithm method and a stepwise regression method. With a multiple linear regression model, the root mean square errors of prediction for the training set and test set in the QSAR study were 0.3433 and 0.3591 using entropy, 0.5500 and 0.4326 using the genetic algorithm, and 0.6373 and 0.6672 using stepwise regression, respectively.

18.
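As an important nutrient in milk, fat is a key indicator of milk quality. Hyperspectral imaging provides data at tens to thousands of wavelengths and can capture subtle spectral differences among the components of milk. However, adjacent bands are often strongly correlated, which increases computation and can cause the curse of dimensionality, so band selection for hyperspectral data is very important. This work proposes the PLS-ACO feature band selection method and combines it with a genetic algorithm to form a new method, PLS-ACO-GA. Both methods build on the ant colony algorithm and use the absolute values of the regression coefficients of a PLS regression model as the main measure of wavelength importance, which serves as the heuristic information for the ant colony search. Combining the search with a genetic algorithm generates more good band combinations and keeps PLS-ACO from returning only a local optimum, so that the optimal band combination better reflects the fat content of milk. The optimal band combination is selected by computing wavelength contribution rates and compared with the spectral feature selection results of the genetic algorithm (GA), the CARS algorithm, and the basic ant colony algorithm (ACO); the prediction performance of PLS regression models built under the different feature selection methods is then compared. PLS-ACO, PLS-ACO-GA, CARS, GA, and ACO selected 18, 16, 40, 43, and 42 feature bands from the milk sample spectra, respectively. The PLS prediction model built on the PLS-ACO-GA bands performed best, with prediction-set R²p and RMSEP of 0.9976 and 0.0622, followed by PLS-ACO with 0.9970 and 0.0778. PLS-ACO and PLS-ACO-GA not only reduce the number of feature bands but also improve model accuracy. MLR, RFR, and PLS regression models were then built on the data after PLS-ACO-GA band selection: R²p and RMSEP were 0.9976 and 0.0623 for MLR, 0.9999 and 0.0030 for RFR, and 0.9976 and 0.0622 for PLS, so RFR performed best of the three. The results show that PLS-ACO and PLS-ACO-GA can select feature bands from spectral data and that hyperspectral technology can measure the fat content of milk, providing a new, rapid, and non-destructive detection method.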

19.
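Many terahertz spectral identification methods recognize a substance by the distinct features its spectrum shows in the terahertz band. Absorption peak extraction is a common spectral feature extraction approach, but when a spectrum has no distinct characteristic absorption peaks, or when peak positions and peak values are similar or hard to identify, absorption peaks cannot distinguish substances. Applying machine learning and statistical learning to terahertz spectral identification reduces the interference from absorption peaks but often requires hand-defined features, which introduces classification errors. Deep learning extracts features automatically, but it usually needs complex preprocessing before identification and tends to lose some features during extraction, again causing classification errors. To address these problems, a terahertz spectral identification method based on wavelet coefficient maps and a convolutional neural network is proposed. When a terahertz spectral signal is wavelet transformed, each row of the wavelet coefficient matrix corresponds to the original spectral signal, so expanding the terahertz absorption coefficients in the frequency domain by the wavelet transform yields distinct two-dimensional frequency-scale maps, also called wavelet coefficient maps. A convolutional neural network (CNN) is then constructed to classify the wavelet coefficient maps, giving the classification of the substances. To verify the effectiveness of the algorithm, three sets of wavelet coefficient map data and the original spectral data were each fed to three classifiers, CNN, support vector machine (SVM), and multilayer perceptron (MLP), for comparison. The proposed algorithm reached a recognition rate of 100% on all three data sets, showing that, compared with conventional methods, it can accurately classify spectra without distinct characteristic absorption peaks and confirming the effectiveness of using a CNN to recognize wavelet coefficient maps. To show its advantages, the algorithm was also compared with a wavelet ridge peak-finding identification algorithm. The results show that the proposed algorithm is almost unaffected by peak frequency, peak position, and peak value: whether identifying starch, which has no absorption peak, or the highly similar sucrose and glucose, it achieves a high recognition rate, with a classification accuracy of 97.62%, demonstrating its superiority. The algorithm offers a new approach to terahertz spectral data identification and can also be extended to identifying substances from other types of spectra.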

20.
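Accurate real-time control of the end point in converter steelmaking can effectively improve the quality of the steel produced. The flame spectrum at the converter mouth changes markedly over the different stages of steelmaking, and analyzing it in combination with machine learning methods can be used for real-time end-point control. To address the large volume of converter-mouth flame spectral data and the shortcomings of existing spectral feature extraction methods in reliability and real-time performance, a spectral feature wavelength selection method based on window competitive adaptive reweighted sampling (WCARS) combined with an iterative successive projections algorithm (ISPA) is proposed; the method effectively addresses model overfitting while reducing the computational complexity of high-dimensional data. The flame spectral data are divided into windows along the wavelength axis, CARS selects the feature window bands, and iterative selection combined with the conventional successive projections algorithm then refines the feature wavelengths through repeated iterations. On this basis, support vector regression (SVR) is used to build a prediction model for the end-point carbon content. In the experiments, 363 sets of converter-mouth flame spectra from the late stage of steelmaking were collected as samples and preprocessed with Savitzky-Golay smoothing. WCARS-ISPA selected 10 feature wavelengths from the full spectrum as inputs to the SVR model, with carbon content as the output; the Kennard-Stone algorithm split the data into training and test sets; and the mean prediction error of the carbon content, the hit rate for prediction errors within ±2%, and the mean run time over 30 runs were chosen as evaluation metrics. The experimental results show that the model...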
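The abstract mentions the Kennard-Stone algorithm for splitting samples into training and test sets. A minimal sketch of the classical procedure is given below, assuming Euclidean distance and small data that fit in a full distance matrix; the function name and return layout are hypothetical, and the paper's own settings (for example, the training set size) are not stated in the source.

```python
import math


def kennard_stone(samples, n_select):
    """Split sample indices into (selected, remaining) by Kennard-Stone.

    `samples` is a list of equal-length numeric tuples. The selected
    indices are typically used as the training set.
    """
    n = len(samples)
    if not 2 <= n_select <= n:
        raise ValueError("n_select must be between 2 and the number of samples")
    dist = [[math.dist(a, b) for b in samples] for a in samples]
    # Start from the two samples that are farthest apart.
    first, second = max(
        ((i, j) for i in range(n) for j in range(i + 1, n)),
        key=lambda pair: dist[pair[0]][pair[1]],
    )
    selected = [first, second]
    remaining = [i for i in range(n) if i not in (first, second)]
    while len(selected) < n_select:
        # Add the sample whose nearest selected neighbor is farthest away,
        # so the selected set covers the feature space evenly.
        nxt = max(remaining, key=lambda i: min(dist[i][s] for s in selected))
        selected.append(nxt)
        remaining.remove(nxt)
    return selected, remaining
```

Because the procedure is deterministic and favors samples at the edges of the data cloud, the resulting training set spans the measured range, which is one common reason it is preferred over random splitting for spectral calibration data.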
