Similar Literature
 Found 10 similar articles (search time: 250 ms)
1.
Han Guosheng, Yu Zuguo, Anh Vo. Chinese Physics B, 2011, 20(10): 100504
Apoptosis proteins play an important role in the development and homeostasis of an organism. Elucidating the subcellular locations and functions of these proteins helps in understanding the mechanism of programmed cell death. In this paper, recurrence quantification analysis, the Hilbert-Huang transform, the maximum relevance and minimum redundancy method, and a support vector machine are used to predict the subcellular location of apoptosis proteins. Validation by the jackknife test suggests that the proposed method can improve the prediction accuracy of the subcellular location of apoptosis proteins, and that it may prove useful in other fields.

2.
Mycobacterium tuberculosis is the primary pathogen causing tuberculosis, one of the most prevalent infectious diseases. The subcellular location of mycobacterial proteins can provide essential clues for protein function research and drug discovery. It is therefore highly desirable to develop a computational method for fast and reliable prediction of the subcellular location of mycobacterial proteins. In this study, we developed a support vector machine (SVM) based method for this task. A total of 444 non-redundant mycobacterial proteins were used to train and test the proposed model using jackknife cross-validation. With traditional pseudo amino acid composition (PseAAC) as the feature set, an overall accuracy of 83.3% was achieved. A feature selection technique was then developed to find an optimal number of PseAAC features, which improved the overall accuracy from 83.3% to 87.2%. In addition, reduced amino acid representations of the N-terminal and non-N-terminal regions of the proteins were combined in the models to further improve the prediction success rate. As a result, a maximum overall accuracy of 91.2% was achieved, with an average accuracy of 79.7%. The proposed model provides useful information for further experimental research. The prediction model is available online free of charge.
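The jackknife (leave-one-out) test described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes plain 20-dimensional amino acid composition rather than PseAAC, swaps the SVM for a 1-nearest-neighbor classifier to stay dependency-free, and uses made-up toy sequences and labels.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def aac(seq):
    """Amino acid composition: frequency of each of the 20 standard residues."""
    counts = Counter(seq)
    return [counts[a] / len(seq) for a in AMINO_ACIDS]

def nearest_label(x, train):
    """1-NN stand-in classifier (an assumption; the study itself uses an SVM)."""
    def dist2(u, v):
        return sum((a - b) ** 2 for a, b in zip(u, v))
    return min(train, key=lambda item: dist2(x, item[0]))[1]

def jackknife_accuracy(samples):
    """Hold out each sample once, predict it from the rest, return fraction correct."""
    correct = 0
    for i, (x, y) in enumerate(samples):
        rest = samples[:i] + samples[i + 1:]
        if nearest_label(x, rest) == y:
            correct += 1
    return correct / len(samples)

# Hypothetical toy sequences and location labels, for illustration only.
toy = [
    ("AAAAGGGAAL", "membrane"),
    ("AAGGAAALAA", "membrane"),
    ("KKKEEEDDKK", "cytoplasm"),
    ("KEKEDDKKEE", "cytoplasm"),
]
samples = [(aac(s), label) for s, label in toy]
accuracy = jackknife_accuracy(samples)
```

Because every sample is held out exactly once, the jackknife estimate is deterministic for a given dataset, which is one reason it is commonly reported in this literature.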

3.
One of the central problems in computational biology is the automated identification of protein function. A key step toward this is predicting the subcellular location to which a protein belongs, since protein localization correlates closely with function. A wide variety of methods for protein subcellular localization prediction have been proposed in recent years. Linear dimensionality reduction (DR) methods have been introduced to address the high-dimensionality problem by transforming the representation of protein sequences. However, this approach is not suitable for some complex biological systems that have nonlinear characteristics. Herein, we use nonlinear DR methods such as the kernel DR method to capture the nonlinear characteristics of a high-dimensional space. The K-nearest-neighbor (K-NN) classifier is then employed to identify the subcellular localization of Gram-negative bacterial proteins based on their reduced low-dimensional features. The experimental results are encouraging, indicating that the applied nonlinear DR method is effective in dealing with the complicated problem of predicting the subcellular localization of Gram-negative bacterial proteins. An online web server for predicting the subcellular location of Gram-negative bacterial proteins is available.

4.
A protein's subcellular location, which indicates where the protein resides in a cell, is one of its important characteristics. Correctly assigning proteins to their subcellular locations would greatly help protein function prediction, genome annotation, and drug design. Yet, despite great technical advances in the past decades, it is still time-consuming and laborious to determine protein subcellular locations experimentally on a high-throughput scale. Hence, four integrated-algorithm methods were developed in this article to enable such high-throughput prediction. Two data sets taken from the literature (Chou and Elrod, Protein Eng 12:107–118, 1999), consisting of 2,391 and 2,598 proteins, were used as the training set and test set, respectively. Amino acid composition was used to represent the protein sequences. Jackknife cross-validation was used to evaluate the training set. The best final integrated-algorithm predictor was constructed by integrating 10 algorithms in Weka (a software tool for data mining tasks), selected with an mRMR (Minimum Redundancy Maximum Relevance) method. It achieves correct rates of 77.83% and 80.56% for the training set and test set, respectively, which is better than all 60 of the algorithms collected in Weka. The prediction software is available upon request.

5.
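Recognition of epileptic EEG signals based on the AdaBoost algorithm   Cited by: 1 (self-citations: 0, citations by others: 1)
Zhang Tao, Chen Wanzhong, Li Mingyang. Acta Physica Sinica, 2015, 64(12): 128701
AdaBoost, one of the classic boosting algorithms, has been widely applied in fields such as face detection and object tracking, but it has a drawback: the degradation problem. To address this, the algorithm is optimized through three measures: screening the weak classifiers, introducing a smoothing factor, and introducing a weight-correction function. The optimized algorithm is then combined with wavelet packet decomposition and applied to the recognition of epileptic EEG signals. The results show that the proposed algorithm achieves a recognition rate of 96.11% for epileptic EEG signals and 99.51% for normal EEG signals, offering a potentially effective approach to the correct diagnosis of epilepsy.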
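For readers unfamiliar with the baseline being optimized here, the following is a minimal sketch of standard discrete AdaBoost with one-dimensional decision stumps. It does not include the weak-classifier screening, smoothing factor, or weight-correction function described in the abstract, whose details are not given; the toy data and the names `train_stump`, `adaboost`, and `predict` are illustrative assumptions.

```python
import math

def train_stump(xs, ys, w):
    """Pick the threshold and polarity with the lowest weighted error on 1-D data."""
    best = None
    for thr in sorted(set(xs)):
        for pol in (1, -1):
            preds = [pol if x >= thr else -pol for x in xs]
            err = sum(wi for wi, p, y in zip(w, preds, ys) if p != y)
            if best is None or err < best[0]:
                best = (err, thr, pol)
    return best

def adaboost(xs, ys, rounds=3):
    """Standard discrete AdaBoost; labels must be +1 or -1."""
    n = len(xs)
    w = [1.0 / n] * n
    model = []
    for _ in range(rounds):
        err, thr, pol = train_stump(xs, ys, w)
        err = min(max(err, 1e-10), 1 - 1e-10)  # guard against log(0)
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, thr, pol))
        # Misclassified samples gain weight; correctly classified ones lose it.
        w = [wi * math.exp(-alpha * y * (pol if x >= thr else -pol))
             for wi, x, y in zip(w, xs, ys)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model

def predict(model, x):
    """Sign of the alpha-weighted vote of all stumps."""
    score = sum(a * (pol if x >= thr else -pol) for a, thr, pol in model)
    return 1 if score >= 0 else -1

# Hypothetical toy data that no single stump can separate.
xs = list(range(10))
ys = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]
model = adaboost(xs, ys, rounds=3)
train_errors = sum(predict(model, x) != y for x, y in zip(xs, ys))
```

The reweighting step is what concentrates later stumps on hard samples, and it is also where repeated up-weighting of a few difficult samples can cause trouble, which is presumably the kind of behavior the smoothing and weight-correction measures are meant to limit.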

6.
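Rapid, non-destructive, and accurate detection of chlorophyll content in rice canopy leaves from spectral information is of great practical significance for growth assessment, precision fertilization, and scientific management of rice. Taking japonica rice from Northeast China as the research object, canopy hyperspectral data were acquired at key growth stages in plot experiments. The spectra were first preprocessed with standard normal variate (SNV) correction. Building on the random frog (RF) algorithm and combining correlation coefficient analysis (CC) with the successive projections algorithm (SPA), an improved random frog algorithm that fuses the two sets of preselected bands (fpb-RF) was proposed to select the characteristic bands for chlorophyll content, and it was compared with standard RF, CC, and SPA. With the extracted characteristic bands as input, and drawing on the respective strengths of linear and nonlinear models, a hybrid chlorophyll prediction model (GPR-P) was proposed in which Gaussian process regression (GPR) compensates partial least squares regression (PLSR): PLSR gives a preliminary prediction of chlorophyll content that captures its linear trend; GPR, which has good nonlinear approximation ability, then predicts the deviation of the PLSR model; and the two are summed to give the final prediction. To verify the advantages of the proposed approach, PLSR, least squares support vector machine (LSSVM), and BP neural network prediction models were built with the characteristic bands from each method as input. The results show that, for the same prediction model, using the characteristic bands extracted by the improved fpb-RF algorithm reduces model complexity and improves prediction performance; the test-set coefficient of determination (R²_P) and training-set coefficient of determination (R²_C) of every model exceeded 0.7047. In addition, across the band-selection algorithms, the R²_C and R²_P of the GPR-P model both exceeded 0.7553. The GPR-P model built on the fpb-RF bands had the highest prediction accuracy, with R²_C and R²_P of 0.7815 and 0.7796 and RMSEC and RMSEP of 0.9041 and 0.9283 mg·L⁻¹, respectively, providing a useful reference for detecting and assessing chlorophyll content in Northeast japonica rice.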
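The two-stage idea behind GPR-P (a linear model for the trend, then a Gaussian process fitted to that model's residuals, with the two predictions summed) can be sketched as follows. This is only an illustration under stated assumptions: one-dimensional ordinary least squares stands in for PLSR, the GP uses a fixed RBF kernel with hand-picked `length` and `noise` values, and the data are synthetic rather than spectral.

```python
import math

def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b (a stand-in for PLSR)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    return a, my - a * mx

def rbf(x1, x2, length=1.0):
    """Squared-exponential kernel with an assumed length scale."""
    return math.exp(-0.5 * ((x1 - x2) / length) ** 2)

def solve(A, b):
    """Gaussian elimination with partial pivoting for A x = b."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def fit_hybrid(xs, ys, noise=1e-3):
    """Stage 1: linear trend. Stage 2: GP posterior mean of the residuals."""
    a, b = fit_line(xs, ys)
    resid = [y - (a * x + b) for x, y in zip(xs, ys)]
    K = [[rbf(xi, xj) + (noise if i == j else 0.0) for j, xj in enumerate(xs)]
         for i, xi in enumerate(xs)]
    weights = solve(K, resid)  # (K + noise * I)^-1 r
    def predict(x):
        correction = sum(w * rbf(x, xi) for w, xi in zip(weights, xs))
        return a * x + b + correction
    return (a, b), predict

# Synthetic data: a linear trend plus a nonlinear component.
xs = [float(i) for i in range(8)]
ys = [2 * x + 1 + math.sin(x) for x in xs]
(a, b), predict = fit_hybrid(xs, ys)

def rmse(model):
    return math.sqrt(sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs))

linear_rmse = rmse(lambda x: a * x + b)
hybrid_rmse = rmse(predict)
```

On the training points the hybrid error is far below the linear-only error by construction; a fair comparison, like the one reported in the abstract, would use a held-out test set.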

7.
Knowing whether an enzyme can interact with a small molecule is essential for understanding the molecular and cellular functions of organisms. In this paper, we introduce a classifier to predict small molecule–enzyme interactions, i.e., whether a given small molecule and enzyme can interact with each other. Small molecules are represented by their chemical functional groups, and enzymes are represented by their biochemical and physicochemical properties, resulting in a total of 160 features. These features are input into an AdaBoost classifier, which is known to have good generalization ability, to predict interaction. The overall prediction accuracy, tested by tenfold cross-validation and on independent sets, is 81.76% and 83.35%, respectively, suggesting that this strategy is effective. In this research, we focus on interactions between small molecules and enzymes involved in metabolism, with the aim of improving the understanding of metabolic pathways. An online predictor developed in this research is available.

8.
In classical machine learning, a set of weak classifiers can be adaptively combined to improve the overall performance, a technique called adaptive boosting (or AdaBoost). However, constructing a combined classifier for a large data set is typically resource-consuming. Here we propose a quantum extension of AdaBoost, demonstrating a quantum algorithm that can output the optimal strong classifier with a quadratic speedup in the number of queries of the weak classifiers. Our results also include a generalization of standard AdaBoost to cases where the output of each classifier may be probabilistic. We prove that the query complexity for non-deterministic classifiers is the same as that for deterministic classifiers, which may be of independent interest to the classical machine-learning community. Additionally, once the optimal classifier is determined by our quantum algorithm, no further quantum resources are required. This fact may lead to applications on near-term quantum devices.

9.
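Detecting and controlling heavy-metal content in soil is of great significance for agriculture and ecological restoration in China. Soil spectra were acquired using laser-induced breakdown spectroscopy (LIBS) combined with external cavity confinement, and machine learning was used to analyze the content of the heavy metals Ni and Ba in soil. Delay times of 0.5–5 μs were used, with Ni II 221.648 nm and Ba II 495.709 nm selected as the target characteristic lines, and the effect of delay time on signal-to-noise ratio (SNR), spectral intensity, and enhancement factor was calculated under both LIBS conditions. The results show that cavity-confined LIBS (CC-LIBS) increases the spectral intensity and the SNR of the target elements. As the acquisition delay increases, the number of plasma particles falls, and the spectral intensity and SNR gradually decrease and level off. At a delay of 1 μs, the SNR of the Ni and Ba characteristic lines under CC-LIBS is optimal, and this setting was taken as the optimal LIBS experimental condition. Spectra of nine soil samples containing Ni and Ba were collected under this condition. Because each spectrum contains 12,248 data points, principal component analysis (PCA) was used to reduce the dimensionality of the CC-LIBS spectra; with more than 95% of the original soil information retained, nine principal components were selected as input variables for the quantitative models to speed up computation. Lasso, AdaBoost, and Random Forest models were then built on the PCA-reduced spectra for modeling and prediction, achieving quantitative analysis of Ni and Ba in soil. Random Forest outperformed Lasso and AdaBoost on both the training and test sets. With Random Forest, the test-set R² for Ni was 0.937 with an RMSEP of 3.037, and the test-set R² for Ba was 0.886 with an RMSEP of 90.515. Cavity-confined LIBS combined with machine learning thus offers technical guidance for the high-precision detection of heavy metals in soil.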
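The "retain at least 95% of the information" criterion used to choose the number of principal components can be sketched as a cumulative explained-variance check. This is a minimal illustration; the function name `n_components_for` and the eigenvalues are hypothetical and are not the values from the study.

```python
def n_components_for(variances, threshold=0.95):
    """Smallest number of leading components whose cumulative
    explained-variance ratio reaches the threshold."""
    ordered = sorted(variances, reverse=True)
    total = sum(ordered)
    running = 0.0
    for k, v in enumerate(ordered, start=1):
        running += v
        if running / total >= threshold:
            return k
    return len(ordered)

# Hypothetical covariance-matrix eigenvalues, for illustration only.
eigenvalues = [60, 25, 12, 2, 1]
k = n_components_for(eigenvalues)
```

With these toy values the first two components explain 85% of the variance and the first three explain 97%, so three components are kept.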

10.
Pengli Lu. Chinese Physics B, 2022, 31(11): 118901
Predicting essential proteins is crucial for understanding cellular organization and viability. We propose a biased random walk with restart algorithm for essential protein prediction, called BRWR. First, the conventional random walk often assigns equal probabilities for a particle to move to each adjacent node, neglecting the influence of structural similarity on the transition probability. To address this problem, we define a novel transition probability matrix that integrates gene expression similarity and subcellular location similarity. Particles then walk with biased transition probabilities, which further exploits the biological properties embedded in the network structure. Second, we use gene ontology (GO) term scores and subcellular scores to calculate the initial probability vector of the random walk with restart. Finally, when the biased random walk with restart reaches a steady state, the protein importance score is obtained. To demonstrate the superiority of BRWR, we conduct experiments on the YHQ, BioGRID, Krogan, and Gavin PPI networks. The results show that BRWR outperforms other state-of-the-art methods in recognizing essential proteins. In particular, compared with the contrast methods, BRWR improves ACC by 1.4%-5.7%, 1.3%-11.9%, 2.4%-8.8%, and 0.8%-14.2% on the four networks, respectively. These results indicate that BRWR is effective.
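The steady-state iteration of a random walk with restart can be sketched as follows. This is a generic illustration, not the BRWR implementation: the restart probability `alpha`, the way edge weights are formed (adjacency multiplied by a similarity value, then row-normalized), and the toy network are all assumptions, since the abstract does not give the exact construction.

```python
def biased_transition(adj, sim):
    """Row-normalize the edge weights adj[i][j] * sim[i][j] (an assumed form)."""
    rows = []
    for a_row, s_row in zip(adj, sim):
        w = [a * s for a, s in zip(a_row, s_row)]
        total = sum(w)
        rows.append([x / total if total else 0.0 for x in w])
    return rows

def rwr(transition, restart_vec, alpha=0.3, tol=1e-10, max_iter=1000):
    """Iterate p <- (1 - alpha) * W^T p + alpha * p0 until the L1 change < tol.

    transition[i][j] is the probability of moving from node i to node j;
    restart_vec is the initial (restart) distribution p0.
    """
    n = len(restart_vec)
    p = restart_vec[:]
    for _ in range(max_iter):
        nxt = [
            (1 - alpha) * sum(p[i] * transition[i][j] for i in range(n))
            + alpha * restart_vec[j]
            for j in range(n)
        ]
        if sum(abs(a - b) for a, b in zip(nxt, p)) < tol:
            return nxt
        p = nxt
    return p

# Hypothetical 4-protein network with made-up similarity weights.
adj = [[0, 1, 1, 0],
       [1, 0, 1, 0],
       [1, 1, 0, 1],
       [0, 0, 1, 0]]
sim = [[0.0, 0.9, 0.4, 0.0],
       [0.9, 0.0, 0.6, 0.0],
       [0.4, 0.6, 0.0, 0.8],
       [0.0, 0.0, 0.8, 0.0]]
W = biased_transition(adj, sim)
p0 = [0.25, 0.25, 0.25, 0.25]
scores = rwr(W, p0)
```

Because each update shrinks the L1 distance between successive iterates by a factor of at most `1 - alpha`, the iteration converges for any `alpha` in (0, 1]; the resulting `scores` play the role of the protein importance scores.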
