Similar literature
1.
2.
3.
4.
5.
Reliable determination of protein-protein interaction sites is of critical importance for structure-based design of small molecules that modulate protein function through macromolecular interfaces. We present an alignment-free computational method for prediction of protein-protein interface residues. The method ("iPred") is based on a knowledge-based scoring function adapted from the fields of protein folding and small-molecule docking. Based on a training set of 394 hetero-dimeric proteins, iPred achieves sustained accuracy on an external unbound test set. Prediction robustness was assessed on more than 1500 diverse complexes containing homo- and hetero-dimers. The technique does not rely on sequence conservation, so rapid interface identification is possible even for proteins whose homologs are unknown or lack conserved residue patterns in the interface region. Functional "hot-spot" residues are enriched among the predicted interface residues, which makes the method well suited to macromolecular binding site identification and to drug design studies that aim to modulate protein-protein interactions influencing protein function. For a comparative structural model of the peptidase HtrA from Helicobacter pylori, we performed mutation studies on predicted hot-spot residues, which were confirmed as functionally relevant for HtrA activity or oligomerization.

6.
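This paper applies a support vector machine method that combines a genetic algorithm with the conjugate gradient method (GA-CG-SVM) to build a classification model for predicting drug-induced phospholipidosis. The descriptors were first optimized, and 19 descriptors were selected for model construction. The resulting model achieved a prediction accuracy of 81.6% on the training set and 87.5% on the test set, indicating that the SVM classification model not only correctly predicts drug-induced phospholipidosis for the training set but also, for other compounds, ...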

7.
He J, Fang G, Deng Q, Wang S. Analytica Chimica Acta, 2011, 704(1-2): 57-62
Classification and regression trees (CART) have the advantage of handling large data sets and yielding readily interpretable models. The conventional method of building a regression tree is recursive partitioning, which results in a good but not optimal tree. Ant colony system (ACS), a meta-heuristic algorithm derived from the observation of real ants, can be used to overcome this problem. The purpose of this study was to explore the use of CART and its combination with ACS for modeling the melting points of a large variety of chemical compounds. Genetic algorithm (GA) operators (e.g., crossover and mutation operators) were combined with the ACS algorithm to select the best solution model. In addition, at each terminal node of the resulting tree, variable selection was done by the ACS-GA algorithm to build an appropriate partial least squares (PLS) model. To test the ability of the resulting tree, a set of approximately 4173 structures and their melting points was used (3000 compounds as the training set and 1173 as the validation set). Further, an external test set of 277 drugs was used to validate the prediction ability of the tree. Comparison of the results obtained from both trees showed that the tree constructed by the ACS-GA algorithm performs better than the one produced by the recursive partitioning procedure.

8.
The dG prediction accuracy of the Lead Finder docking software on the CSAR test set was characterized by R(2)=0.62 and rmsd=1.93 kcal/mol, and the method used to prepare the full-atom structures of the test set did not significantly affect the accuracy of predictions. The primary factors determining the correlation between the predicted and experimental values were the van der Waals interactions and solvation effects. Those two factors alone accounted for R(2)=0.50. The other factors that affected the accuracy of predictions, listed in order of decreasing importance, were the change in the ligand's internal energy upon binding to the protein, the electrostatic interactions, and the hydrogen bonds. It appears that those latter factors contributed to the independence of the prediction results from the method of full-atom structure preparation. We then turned our attention to other factors that could potentially improve the scoring function and raise the accuracy of the dG prediction. Ligand-centric factors, including Mw, cLogP, PSA, etc., and protein-centric factors, such as the functional class of the protein, did not improve the prediction accuracy. Following that, we explored whether weak molecular interactions such as X-H...Ar, X-H...Hal, CO...Hal, C-H...X, stacking and π-cationic interactions (where X is N or O), which are generally of interest to medicinal chemists despite their lack of proper molecular mechanical parametrization, could improve dG prediction. Our analysis revealed that, of these new interactions, only CO...Hal is statistically significant for dG predictions using the Lead Finder scoring function. Accounting for the CO...Hal interaction reduced the rmsd from 2.19 to 0.69 kcal/mol for the corresponding structures. The other weak interaction factors were not statistically significant and therefore irrelevant to the accuracy of dG prediction.
On the basis of the findings from our participation in the CSAR scoring challenge, we conclude that a significant increase in prediction accuracy necessitates breakthrough scoring approaches. We anticipate that explicit accounting for water molecules, protein flexibility, and a more thermodynamically accurate method of dG calculation than single-point energy calculation may lead to such breakthroughs.
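The two accuracy figures quoted in this abstract, R(2) and rmsd, can be illustrated with a minimal sketch. It assumes R(2) means the squared Pearson correlation between predicted and experimental dG values, which the abstract does not state; the function name and the sample numbers are hypothetical.

```python
import math

def r_squared_and_rmsd(predicted, experimental):
    """Return (squared Pearson correlation, root-mean-square deviation).

    Assumption: R(2) is taken as the squared Pearson correlation;
    the abstract does not say which definition was used.
    """
    n = len(predicted)
    if n != len(experimental) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_p = sum(predicted) / n
    mean_e = sum(experimental) / n
    cov = sum((p - mean_p) * (e - mean_e) for p, e in zip(predicted, experimental))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_e = sum((e - mean_e) ** 2 for e in experimental)
    r2 = cov * cov / (var_p * var_e)
    # rmsd measures the typical size of the prediction error, in the same units as dG
    rmsd = math.sqrt(sum((p - e) ** 2 for p, e in zip(predicted, experimental)) / n)
    return r2, rmsd

# Hypothetical values in kcal/mol, for illustration only
r2, rmsd = r_squared_and_rmsd([1.0, 2.0, 3.0], [2.0, 3.0, 4.0])
```

A constant offset between prediction and experiment, as in the usage line, leaves the correlation perfect while the rmsd stays nonzero, which is why the two numbers are reported together.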

9.
10.
11.
The predictive accuracy of a model is the primary concern of computational chemists in quantitative structure-activity relationship (QSAR) investigations. It is hypothesized that a model based on analogous chemicals will exhibit better predictive performance than one derived from diverse compounds. This paper develops a novel scheme called "clustering first, and then modeling" to build local QSAR models for the subsets that result from clustering the training set according to structural similarity. For validation and prediction, the validation set and test set were first classified into the corresponding subsets of the training set, and then the prediction was performed by the relevant local model for each subset. This approach was validated on two independent data sets by local modeling and prediction of the baseline toxicity for the fathead minnow. In this process, hierarchical clustering was employed for cluster analysis, k-nearest neighbor for classification, and partial least squares for model generation. The statistical results indicated that the predictive performance of the local models based on the subsets was much superior to that of the global model based on the whole training set, which is consistent with the hypothesis. The approach proposed here is promising for extension to QSAR modeling of various physicochemical properties, biological activities, and toxicities.
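The "clustering first, and then modeling" scheme can be sketched roughly as follows. This is a simplified illustration, not the authors' implementation: cluster labels are assumed to be precomputed (the hierarchical clustering step is omitted), each compound has a single numeric descriptor, and ordinary least squares on one variable stands in for partial least squares. All function names and data are hypothetical.

```python
def fit_line(points):
    """Ordinary least squares for y = a*x + b on (x, y) pairs (stand-in for PLS)."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    a = sxy / sxx
    return a, my - a * mx

def fit_local_models(training, labels):
    """Fit one local model per cluster; labels[i] is the cluster of training[i]."""
    groups = {}
    for point, label in zip(training, labels):
        groups.setdefault(label, []).append(point)
    return {label: fit_line(pts) for label, pts in groups.items()}

def assign_cluster(x, training, labels, k=3):
    """Classify a new compound into a cluster by k-nearest-neighbor vote."""
    nearest = sorted(range(len(training)), key=lambda i: abs(training[i][0] - x))[:k]
    votes = {}
    for i in nearest:
        votes[labels[i]] = votes.get(labels[i], 0) + 1
    return max(votes, key=votes.get)

def predict(x, training, labels, models, k=3):
    """Route the compound to its cluster, then apply that cluster's local model."""
    a, b = models[assign_cluster(x, training, labels, k)]
    return a * x + b

# Hypothetical (descriptor, activity) pairs with precomputed cluster labels
training = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0), (10.0, 30.0), (11.0, 28.0), (12.0, 26.0)]
labels = [0, 0, 0, 1, 1, 1]
models = fit_local_models(training, labels)
```

The point of the design is visible in the toy data: the two clusters follow opposite trends, so a single global line would fit neither, while each local model fits its own subset.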

12.
During the last few decades, accurate determination of protein structural class using a fast and suitable computational method has been a challenging problem in protein science. In this context, a meaningful representation of a protein sample plays a key role in achieving higher prediction accuracy. In this paper, based on the concept of Chou's pseudo amino acid composition (Chou, K.C., 2001. Proteins 43, 246-255), a new feature representation method is introduced which is composed of the amino acid composition information, the amphiphilic correlation factors and the spectral characteristics of the protein. Thus the sample of a protein is represented by a set of discrete components which incorporate both the sequence order and the length effect. On the basis of such a statistical framework, a simple radial basis function network based classifier is introduced to predict protein structural class. A set of exhaustive simulation studies demonstrates a high success rate of classification using the self-consistency and jackknife tests on the benchmark datasets.
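Of the three feature groups named in this abstract, the plain amino acid composition is the simplest to illustrate. The sketch below computes only that 20-component frequency vector; the amphiphilic correlation factors and spectral features are not reproduced here, and the function name and handling of non-standard residues are assumptions.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes

def amino_acid_composition(sequence):
    """Return the 20 residue frequencies of a protein sequence.

    Assumption: characters outside the 20 standard residues are ignored.
    """
    counts = {aa: 0 for aa in AMINO_ACIDS}
    for residue in sequence.upper():
        if residue in counts:
            counts[residue] += 1
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard residues")
    return [counts[aa] / total for aa in AMINO_ACIDS]

# Hypothetical four-residue sequence, for illustration only
features = amino_acid_composition("AACD")
```

Because the vector holds frequencies rather than counts, sequences of different lengths map to comparable features, but all sequence-order information is lost, which is the gap the additional components in the abstract are meant to fill.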

13.
14.
15.
The Atomic Solvation Parameter (ASP) model is one of the simplest models of solvation, in which the solvation free energy of a molecule is proportional to the solvent accessible surface area (SAS) of its atoms. However, until now this model had not been incorporated into the Self-Consistent Mean Field Theory (SCMFT) method for modelling sidechain conformations in proteins. The reason is that SAS is a many-body quantity and, thus, it is not obvious how to define it within the Mean Field (MF) framework, where multiple copies of each sidechain exist simultaneously. Here, we present a method for incorporating an SAS-based potential, such as the ASP model, into SCMFT. The theory on which the method is based is exact within the MF framework, that is, it does not depend on a pairwise or any other approximation of SAS. Therefore, SAS can be calculated to arbitrary accuracy. The method is computationally very efficient: only 7.6% slower on average than the method without solvation. We applied the method to the prediction of sidechain conformation, using high-quality solution structures of 11 proteins as a test set. Solvation was found to substantially improve the prediction accuracy of well-defined surface sidechains. We also investigated whether the methodology can be applied to prediction of folding free energies of protein mutants, using a set of barnase mutants. For apolar mutants, the modest correlation observed between calculated and observed folding free energies without solvation improved substantially when solvation was included, allowing the prediction of trends in the folding free energies of this type of mutant. For polar mutants, correlation was not significant even with solvation. Several other factors also responsible for the correlation were identified and analysed. From this analysis, future directions for applying and improving the present methodology are discussed.
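The ASP model described at the start of this abstract amounts to a weighted sum of per-atom surface areas. A minimal sketch, assuming one parameter per atom type; the parameter values, atom types, and units below are hypothetical placeholders, not values from the paper.

```python
def asp_solvation_energy(atoms, asp):
    """Sum of (atomic solvation parameter x solvent accessible area) over atoms.

    atoms: list of (atom_type, sas_area) pairs
    asp:   dict mapping atom_type to its solvation parameter (energy per unit area)
    """
    return sum(asp[atom_type] * area for atom_type, area in atoms)

# Hypothetical parameters and areas, for illustration only
asp = {"C": 0.5, "O": -2.0}
atoms = [("C", 10.0), ("O", 5.0)]
energy = asp_solvation_energy(atoms, asp)
```

The sum itself is trivial; the difficulty the abstract addresses is that each area depends on the positions of all neighbouring atoms, so it is not defined in an obvious way when several sidechain copies coexist.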

16.
17.
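Nested prediction models for discriminating the molecular shape of metabolites and the sites of metabolic reactions were built on the basis of the support vector machine (SVM) and KStar methods, respectively. The molecular shape discrimination model was built from 272 molecules, for which 1280 molecular descriptors were calculated, including molecular topology, two-dimensional autocorrelation, and geometric structure descriptors; the accuracy of classification models built with four machine learning methods (support vector machine, decision tree, Bayesian network, and k-nearest neighbor) was examined. The results showed that the support vector machine outperformed the other methods, and this model can be used to predict whether a molecule can undergo O-dealkylation catalyzed by cytochrome P450 enzymes. The metabolic site discrimination model was built from 538 O-dealkylation metabolic sites, for which 26 quantum chemical features characterizing atomic energy, valence state, charge, and so on were calculated; the accuracies of models built with the decision tree, Bayesian network, KStar, and artificial neural network methods were compared. The results showed that the accuracy, sensitivity, and specificity of the KStar model were all above 90%; for molecules selected by the molecular shape discrimination model, this model can reliably identify which C-O bond is cleaved. Fifteen traditional Chinese medicine molecules with well-characterized metabolic reactions were used as a validation set to verify the models. The results indicate that the nested prediction model based on SVM and KStar has a certain degree of accuracy and is useful for predicting the O-dealkylation metabolites of traditional Chinese medicine molecules.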
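The three figures reported for the KStar model (accuracy, sensitivity, specificity) follow from the counts of a binary confusion matrix. A minimal sketch, assuming labels are encoded as 1 for a reactive site and 0 otherwise; the encoding, function name, and sample labels are hypothetical.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, sensitivity and specificity for binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(y_true),   # fraction of all sites classified correctly
        "sensitivity": tp / (tp + fn),         # fraction of true positives recovered
        "specificity": tn / (tn + fp),         # fraction of true negatives recovered
    }

# Hypothetical labels, for illustration only
metrics = classification_metrics([1, 1, 1, 0, 0], [1, 1, 0, 0, 1])
```

Reporting all three matters when classes are unbalanced, since a model can reach high accuracy while missing most of the minority class.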

18.
A simple, rapid, and reliable method for the determination of residual sulphonamide antibacterials (SAs) (sulfadiazine, sulfamerazine, sulfadimidine, sulfamethoxypyridazine, sulfisozole, sulfamonomethoxine, sulfamethoxazole, sulfisoxazole, sulfadimethoxine, and sulfaquinoxaline) in animal liver and kidney was developed using a combination of clean-up on a Bond Elut PSA cartridge and HPLC with UV detection. The SAs were extracted with ethyl acetate, evaporated to dryness, and then dissolved in 5 ml of 50% (v/v) ethyl acetate-n-hexane. For clean-up of the crude sample, the resuspended extract was applied to a Bond Elut PSA (primary/secondary amine) cartridge, and the SAs were then eluted from the cartridge using 5 ml of 20% (v/v) acetonitrile-0.05 M ammonium formate before being analysed by HPLC. Recoveries of the SAs at levels of 0.5 and 0.1 µg/g were 70.8-98.2%, the relative standard deviations were less than 7.0%, and the detection limits were 0.03 µg/g. The present method for analysing SAs in animal kidney and liver by HPLC with a clean-up procedure was shown to be directly applicable to LC-MS-MS analysis without any modification.
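The recovery and relative standard deviation figures quoted above are standard validation statistics. A minimal sketch of how they are usually computed, assuming recovery is the measured amount as a percentage of the spiked amount and RSD uses the sample standard deviation; the replicate values below are hypothetical, not data from the study.

```python
import statistics

def recovery_percent(measured, spiked):
    """Measured concentration as a percentage of the spiked concentration."""
    return 100.0 * measured / spiked

def relative_standard_deviation(values):
    """Sample standard deviation as a percentage of the mean."""
    return 100.0 * statistics.stdev(values) / statistics.mean(values)

# Hypothetical replicate recoveries (%) for one analyte at one spiking level
replicates = [90.0, 100.0, 110.0]
rsd = relative_standard_deviation(replicates)
rec = recovery_percent(0.45, 0.5)  # 0.45 µg/g found after spiking 0.5 µg/g
```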

19.
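To predict the activity of human immunodeficiency virus (HIV) protease inhibitors, 462 molecular descriptors characterizing the composition and topology of the molecules were calculated. The training and test sets were designed using the Kennard-Stone method and a random method, variables were selected by Monte Carlo simulated annealing, and models of HIV-1 protease inhibitors were built using neural network, logistic regression, k-nearest neighbor, and support vector machine methods. The results show that the support vector machine outperformed the other machine learning methods; the best model built with the SVM method reached a final prediction accuracy of 98.24%.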
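The Kennard-Stone method mentioned for training/test set design can be sketched as below. This is the textbook form of the algorithm with Euclidean distance on descriptor vectors, which is an assumption since the abstract does not give the distance measure; the function name and the sample points are hypothetical.

```python
import math

def kennard_stone(points, n_select):
    """Return indices of n_select points chosen by the Kennard-Stone rule.

    Start from the two most distant points, then repeatedly add the point
    whose distance to its nearest already-selected point is largest.
    """
    n = len(points)
    if not 2 <= n_select <= n:
        raise ValueError("n_select must be between 2 and the number of points")
    dist = [[math.dist(p, q) for q in points] for p in points]
    # Seed with the most distant pair
    first, second = max(
        ((i, j) for i in range(n) for j in range(i + 1, n)),
        key=lambda pair: dist[pair[0]][pair[1]],
    )
    selected = [first, second]
    remaining = [i for i in range(n) if i not in selected]
    while len(selected) < n_select:
        # Max-min criterion: pick the candidate farthest from the selected set
        nxt = max(remaining, key=lambda i: min(dist[i][s] for s in selected))
        selected.append(nxt)
        remaining.remove(nxt)
    return selected

# Hypothetical one-dimensional descriptors, for illustration only
training_idx = kennard_stone([(0.0,), (1.0,), (5.0,), (10.0,)], 3)
```

The selected indices would form the training set and the rest the test set, so the training compounds span the descriptor space evenly, in contrast to the random split the abstract uses for comparison.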

20.