Similar literature
1.
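Using "relative entropy" as the objective function, we propose an efficient and fast optimization algorithm for protein folding prediction. An off-lattice model is used, and the prediction is concerned only with the trace of the protein backbone. The only inputs are the distances between consecutive Cα atoms along the backbone and an extended form of the contact potentials of the 20 amino acids. The algorithm was tested on several real proteins. Starting from largely unfolded initial structures, the predicted conformations have a root-mean-square deviation (RMSD) of 5-7 Å from their native structures. In principle, the method is an improvement on energy optimization.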
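For orientation, the quantity named in this abstract, relative entropy (Kullback-Leibler divergence), can be sketched for two discrete distributions as follows. This only illustrates the measure itself: the abstract does not say which distributions the authors compare, so the `target` and `model` vectors below are hypothetical.

```python
import math

def relative_entropy(p, q):
    """Kullback-Leibler divergence D(p || q), in nats, for two discrete distributions."""
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0.0:
            continue  # the term 0 * log(0 / q) is taken as 0
        if qi == 0.0:
            return math.inf  # p puts mass where q has none
        total += pi * math.log(pi / qi)
    return total

# Hypothetical example: a target distribution versus a model distribution.
target = [0.5, 0.3, 0.2]
model = [0.4, 0.4, 0.2]
divergence = relative_entropy(target, model)
```

The divergence is zero when the two distributions coincide and positive otherwise, which is what makes it usable as a function to minimize.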

2.
Predicting protein structures from their amino acid sequences is a problem of global optimization. Global optima (native structures) are often sought using stochastic sampling methods such as Monte Carlo or molecular dynamics, but these methods are slow. In contrast, there are fast deterministic methods that find near-optimal solutions of well-known global optimization problems such as the traveling salesman problem (TSP). But fast TSP strategies have yet to be applied to protein folding, because of fundamental differences in the two types of problems. Here, we show how protein folding can be framed in terms of the TSP, to which we apply a variation of the Durbin-Willshaw elastic net optimization strategy. We illustrate using a simple model of proteins with database-derived statistical potentials and predicted secondary structure restraints. This optimization strategy can be applied to many different models and potential functions, and can readily incorporate experimental restraint information. It is also fast; with the simple model used here, the method finds structures that are within 5-6 Å all-Cα-atom RMSD of the known native structures for 40-mers in about 8 s on a PC; 100-mers take about 20 s. The computer time τ scales as τ ≈ n, where n is the number of amino acids. This method may prove to be useful for structure refinement and prediction.

3.
As several structural proteomic projects are producing an increasing number of protein structures with unknown function, methods that can reliably predict protein functions from protein structures are in urgent need. In this paper, we present a method to explore the clustering patterns of amino acids on the 3-dimensional space for protein function prediction. First, amino acid residues on a protein structure are clustered into spatial groups using hierarchical agglomerative clustering, based on the distance between them. Second, the protein structure is represented using a graph, where each node denotes a cluster of amino acids. The nodes are labeled with an evolutionary profile derived from the multiple alignment of homologous sequences. Then, a shortest-path graph kernel is used to calculate similarities between the graphs. Finally, a support vector machine using this graph kernel is used to train classifiers for protein function prediction. We applied the proposed method to two separate problems, namely, prediction of enzymes and prediction of DNA-binding proteins. In both cases, the results showed that the proposed method outperformed other state-of-the-art methods.
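The shortest-path graph kernel step can be sketched roughly as below. This is a simplified version under stated assumptions, not the authors' implementation: node labels are discrete strings (the paper uses evolutionary profiles), edges are unweighted, and two paths count as matching only when their endpoint labels and lengths are identical. The graphs `g_a` and `g_b` are hypothetical.

```python
from collections import Counter
from itertools import combinations

def shortest_paths(n, edges):
    """All-pairs shortest path lengths (Floyd-Warshall) for an unweighted, undirected graph."""
    inf = float("inf")
    dist = [[0 if i == j else inf for j in range(n)] for i in range(n)]
    for u, v in edges:
        dist[u][v] = dist[v][u] = 1
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist

def path_features(labels, edges):
    """Count (label, label, length) triples over all connected node pairs."""
    dist = shortest_paths(len(labels), edges)
    feats = Counter()
    for i, j in combinations(range(len(labels)), 2):
        if dist[i][j] != float("inf"):
            a, b = sorted((labels[i], labels[j]))
            feats[(a, b, dist[i][j])] += 1
    return feats

def shortest_path_kernel(g1, g2):
    """Delta-kernel variant: dot product of the two feature count vectors."""
    f1, f2 = path_features(*g1), path_features(*g2)
    return sum(count * f2[key] for key, count in f1.items())

# Hypothetical three-node graphs given as (node labels, edge list).
g_a = (["H", "P", "H"], [(0, 1), (1, 2)])
g_b = (["H", "P", "P"], [(0, 1), (1, 2)])
k_ab = shortest_path_kernel(g_a, g_b)
```

A kernel matrix built from such pairwise values could then be passed to a support vector machine that accepts precomputed kernels.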

4.
SPICKER: a clustering approach to identify near-native protein folds   Cited by: 2 (self-citations: 0; citations by others: 2)
We have developed SPICKER, a simple and efficient strategy to identify near-native folds by clustering protein structures generated during computer simulations. In general, the most populated clusters tend to be closer to the native conformation than the lowest energy structures. To assess the generality of the approach, we applied SPICKER to 1489 representative benchmark proteins...
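The idea of picking the most populated cluster can be illustrated with a simple neighbor-count scheme. This is a sketch under assumptions, not SPICKER's actual procedure: the pairwise RMSD matrix is hypothetical, and the 2.0 Å cutoff is chosen only for the example.

```python
def most_populated_cluster(dist, cutoff):
    """Return (center_index, member_indices) of the largest neighbor cluster.

    dist is a symmetric matrix (list of lists) of pairwise distances,
    for example RMSD values between decoys; cutoff is the neighbor threshold.
    """
    best_center, best_members = None, []
    for i, row in enumerate(dist):
        members = [j for j, d in enumerate(row) if d <= cutoff]
        if len(members) > len(best_members):
            best_center, best_members = i, members
    return best_center, best_members

# Hypothetical pairwise RMSD matrix (in Å) for five decoys.
rmsd_matrix = [
    [0.0, 1.0, 1.5, 6.0, 7.0],
    [1.0, 0.0, 1.2, 6.5, 7.2],
    [1.5, 1.2, 0.0, 5.8, 6.9],
    [6.0, 6.5, 5.8, 0.0, 2.5],
    [7.0, 7.2, 6.9, 2.5, 0.0],
]
center, members = most_populated_cluster(rmsd_matrix, cutoff=2.0)
```

Here the first three decoys form the largest group, so one of them is returned as the cluster center regardless of their energies.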

5.
The energy‐based refinement of protein structures generated by fold prediction algorithms to atomic‐level accuracy remains a major challenge in structural biology. Energy‐based refinement is mainly dependent on two components: (1) sufficiently accurate force fields, and (2) efficient conformational space search algorithms. Focusing on the latter, we developed a high‐resolution refinement algorithm called GRID. It takes a three‐dimensional protein structure as input and, using an all‐atom force field, attempts to improve the energy of the structure by systematically perturbing backbone dihedrals and side‐chain rotamer conformations. We compare GRID to Backrub, a stochastic algorithm that has been shown to predict a significant fraction of the conformational changes that occur with point mutations. We applied GRID and Backrub to 10 high‐resolution (≤ 2.8 Å) crystal structures from the Protein Data Bank and measured the energy improvements obtained and the computation times required to achieve them. GRID resulted in energy improvements that were significantly better than those attained by Backrub while expending about the same amount of computational resources. GRID resulted in relaxed structures that had slightly higher backbone RMSDs compared to Backrub relative to the starting crystal structures. The average RMSD was 0.25 ± 0.02 Å for GRID versus 0.14 ± 0.04 Å for Backrub. These relatively minor deviations indicate that both algorithms generate structures that retain their original topologies, as expected given the nature of the algorithms. © 2012 Wiley Periodicals, Inc.
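The RMSD values quoted here compare a refined structure with its starting structure. A minimal sketch of that measure follows, assuming both coordinate sets list the same atoms in the same order and already share one reference frame (no superposition step is performed). The coordinates are hypothetical.

```python
import math

def rmsd(coords_a, coords_b):
    """RMSD between two equally sized lists of (x, y, z) points, without superposition."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared / len(coords_a))

# Hypothetical backbone atoms (in Å) before and after refinement.
start = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
refined = [(0.0, 0.3, 0.0), (1.5, 0.0, 0.4), (3.0, 0.0, 0.0)]
deviation = rmsd(start, refined)
```

When two structures come from different sources, an optimal superposition is normally applied before this calculation; that step is omitted here.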

6.
We present a docking method that uses a scoring function for protein-ligand docking that is designed to maximize the docking success rate for low-resolution protein structures. We find that the resulting scoring function parameters are very different depending on whether they were optimized for high- or low-resolution protein structures. We show that this docking method can be successfully applied to predict the ligand-binding site of low-resolution structures. For a set of 25 protein-ligand complexes, in 76% of the cases, more than 50% of ligand-contacting residues are correctly predicted (using receptor crystal structures where the binding site is unspecified). Using decoys of the receptor structures having a 4 Å RMSD from the native structure, for the same set of complexes, in 72% of the cases, we obtain at least one correctly predicted ligand-contacting residue. Furthermore, using an 81-protein-ligand set described by Jain, in 76 (93.8%) cases, the algorithm correctly predicts more than 50% of the ligand-contacting residues when native protein structures are used. Using 3 Å RMSD from native decoys, in all but two cases (97.5%), the algorithm predicts at least one ligand-binding residue correctly. Finally, compared to the previously published Dolores method, for 298 protein-ligand pairs, the number of cases in which at least half of the specific contacts are correctly predicted is more than four times greater.

7.
Summary: Evolutionary computing is a general optimization mechanism successfully implemented for a variety of numeric problems in a variety of fields, including structural biology. We here present an evolutionary approach to optimize helix stability in peptides and proteins, employing the AGADIR energy function for helix stability as the scoring function. With the ability to apply masks determining positions that are to remain constant or fixed to a certain class of amino acids, our algorithm is capable of developing stable helical scaffolds containing a wide variety of structural and functional amino acid patterns. The algorithm showed good convergence behaviour in all tested cases and can be parameterized in a wide variety of ways. We have applied our algorithm to the optimization of the stability of prion protein helix 1, a structural element of the prion protein which is thought to play a crucial role in the conformational transition from the cellular to the pathogenic form of the prion protein, and which therefore poses an interesting target for pharmacological as well as genetic engineering approaches to counter the as yet incurable prion diseases. NMR spectroscopic investigations of selected stabilizing and destabilizing mutations found by our algorithm demonstrated its ability to create stabilized variants of secondary structure elements.

8.
Computational protein design depends on an energy function and an algorithm to search the sequence/conformation space. We compare three stochastic search algorithms: a heuristic, Monte Carlo (MC), and a Replica Exchange Monte Carlo method (REMC). The heuristic performs a steepest‐descent minimization starting from thousands of random starting points. The methods are applied to nine test proteins from three structural families, with a fixed backbone structure, a molecular mechanics energy function, and with 1, 5, 10, 20, 30, or all amino acids allowed to mutate. Results are compared to an exact, “Cost Function Network” method that identifies the global minimum energy conformation (GMEC) in favorable cases. The designed sequences accurately reproduce experimental sequences in the hydrophobic core. The heuristic and REMC agree closely and reproduce the GMEC when it is known, with a few exceptions. Plain MC performs well for most cases, occasionally departing from the GMEC by 3–4 kcal/mol. With REMC, the diversity of the sequences sampled agrees with exact enumeration where the latter is possible: up to 2 kcal/mol above the GMEC. Beyond, room temperature replicas sample sequences up to 10 kcal/mol above the GMEC, providing thermal averages and a solution to the inverse protein folding problem. © 2016 Wiley Periodicals, Inc.
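The two acceptance rules behind plain MC and REMC can be sketched as follows. These are the textbook Metropolis and replica-exchange criteria, assumed here for illustration; the abstract does not give the authors' move set, temperatures, or exchange schedule, and the energies and temperatures in the example are hypothetical.

```python
import math
import random

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol*K)

def metropolis_accept(delta_e, temperature, rng=random):
    """Accept a move with energy change delta_e (kcal/mol) at the given temperature (K)."""
    if delta_e <= 0.0:
        return True  # downhill moves are always accepted
    return rng.random() < math.exp(-delta_e / (K_B * temperature))

def swap_probability(e_i, t_i, e_j, t_j):
    """Probability of exchanging the configurations held by replicas i and j."""
    beta_i = 1.0 / (K_B * t_i)
    beta_j = 1.0 / (K_B * t_j)
    exponent = (beta_i - beta_j) * (e_i - e_j)
    return 1.0 if exponent >= 0.0 else math.exp(exponent)

# The cold replica (300 K) holds the higher energy, so the swap is always accepted.
p_favorable = swap_probability(-100.0, 300.0, -105.0, 400.0)
# The cold replica already holds the lower energy, so the swap is accepted only sometimes.
p_unfavorable = swap_probability(-105.0, 300.0, -100.0, 400.0)
```

Exchanges let low-energy states found at high temperature migrate to the room-temperature replica, which is why REMC tends to escape minima that trap plain MC.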

9.
10.
Development of protein 3-D structural comparison methods is essential for understanding protein functions. Some amino acids share structural similarities while others vary considerably. These structures determine the chemical and physical properties of amino acids. Grouping amino acids with similar structures potentially improves the ability to identify structurally conserved regions and increases the global structural similarity between proteins. We systematically studied the effects of amino acid grouping on the numbers of Specific/specific, Common/common, and statistically different keys to achieve a better understanding of protein structure relations. Common keys represent substructures found in all types of proteins and Specific keys represent substructures exclusively belonging to a certain type of proteins in a data set. Our results show that applying amino acid grouping to the Triangular Spatial Relationship (TSR)-based method, while computing structural similarity among proteins, improves the accuracy of protein clustering in certain cases. In addition, applying amino acid grouping facilitates the process of identification or discovery of conserved structural motifs. The results from the principal component analysis (PCA) demonstrate that applying amino acid grouping captures slightly more structural variation than when amino acid grouping is not used, indicating that amino acid grouping reduces structure diversity as predicted. The TSR-based method uniquely identifies and discovers binding sites for drugs or interacting proteins. The binding sites of nsp16 of SARS-CoV-2, SARS-CoV and MERS-CoV that we have defined will aid future antiviral drug design for improving therapeutic outcome. This approach for incorporating the amino acid grouping feature into our structural comparison method is promising and provides a deeper insight into understanding of structural relations of proteins.

11.
We have developed a generic evolutionary method with an empirical scoring function for protein-ligand docking, which is a problem of paramount importance in structure-based drug design. This approach, referred to as GEMDOCK (Generic Evolutionary Method for molecular DOCKing), combines both continuous and discrete search mechanisms. We tested our approach on seven protein-ligand complexes, and the docked lowest energy structures have root-mean-square deviations ranging from 0.32 to 0.99 Å with respect to the corresponding crystal ligand structures. In addition, we evaluated GEMDOCK on cross-docking experiments, in which complexes sharing an identical protein were used to dock all crystallized ligands of these complexes. GEMDOCK yielded 98% docked structures with RMSD below 2.0 Å when the ligands were docked into foreign protein structures. We have reported the validation and analysis of our approach on various search spaces and scoring functions. Experimental results show that our approach is robust, and the empirical scoring function is simple and fast to recognize compounds. We found that if GEMDOCK used the RMSD scoring function, then the prediction accuracy was 100% and the docked structures had RMSD below 0.1 Å for each test system. These results suggest that GEMDOCK is a useful tool, and may systematically improve the forms and parameters of a scoring function, which is one of the major bottlenecks for molecular recognition.

12.
Molecular docking predicts the best pose of a ligand in the target protein binding site by sampling and scoring numerous conformations and orientations of the ligand. Failures in pose prediction are often due to either insufficient sampling or scoring function errors. To improve the accuracy of pose prediction by tackling the sampling problem, we have developed a method of pose prediction using shape similarity. It first places the ligand conformation with the highest 3D shape similarity to known crystal structure ligands into the protein binding site and then refines the pose by repacking the side-chains and performing energy minimization with a Monte Carlo algorithm. We have assessed our method utilizing CSARdock 2012 and 2014 benchmark exercise datasets consisting of co-crystal structures from eight proteins. Our results revealed that ligand 3D shape similarity could substitute for conformational and orientational sampling if at least one suitable co-crystal structure is available. Our method identified poses within 2 Å RMSD as the top-ranking pose for 85.7% of the test cases. The median RMSD for our pose prediction method was found to be 0.81 Å and was better than methods performing extensive conformational and orientational sampling within target protein binding sites. Furthermore, our method was better than similar methods utilizing ligand 3D shape similarity for pose prediction.

13.
Despite recent advances in fold recognition algorithms that identify template structures with distant homology to the target sequence, the quality of the target-template alignment can be a major problem for distantly related proteins in comparative modeling. Here we report for the first time on the use of ensembles of pairwise alignments obtained by stochastic backtracking as a means to improve three-dimensional comparative protein models. In every one of the 35 cases, the ensemble produced by the program probA resulted in alignments that were closer to the structural alignment than those obtained from the optimal alignment. In addition, we examined the lowest energy structure among these ensembles from four different structural assessment methods and compared these with the optimal and structural alignment model. The structural assessment methods consisted of the DFIRE, DOPE, and ProsaII statistical potential energies and the potential energy from the CHARMM protein force field coupled to a Generalized Born implicit solvent model. The results demonstrate that the generation of alignment ensembles through stochastic backtracking using probA combined with one of the statistical potentials for assessing three-dimensional structures can be used to improve comparative models.

14.
Neutralizing antibodies often recognize conformational, discontinuous epitopes. Linear peptides mimicking such conformational epitopes can be selected from phage display peptide libraries by screening with the respective antibodies. However, it is difficult to localize these "mimotopes" within the three-dimensional (3D) structures of the target proteins. Knowledge of conformational epitopes of neutralizing antibodies would help to design antigens able to elicit protective immune responses. Therefore, we provide here software that localizes linear peptide sequences within 3D structures of proteins. The 3D-Epitope-Explorer (3DEX) software maps conformational epitopes in 3D protein structures based on an algorithm that takes into account the physicochemical neighborhood of Cα or Cβ atoms of individual amino acids. A given amino acid of a peptide sequence is localized within the protein, and the software searches within predefined distances for the amino acids neighboring that amino acid in the peptide. Surface exposure of the amino acids can also be taken into consideration. The procedure is then repeated for the remaining amino acids of the peptide. The introduction of a joker function makes it possible to map peptide mimotopes that do not necessarily have 100% sequence homology to the protein. Using this software we were able to localize mimotopes selected from phage displayed peptide libraries with polyclonal antibodies from HIV-positive patient plasma within the 3D structure of gp120, the exterior glycoprotein of HIV-1. We also analyzed two recently published peptide sequences corresponding to known conformational epitopes to further confirm the integrity of 3DEX.

15.
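Accurately predicting, with machine learning methods, the change in protein stability caused by a single amino acid mutation is of great value for research on protein structure and function, and offers guidance for the design of new proteins and for protein engineering. A study of the topological features of protein networks shows that these features yield relatively high accuracy for the effect of mutations on protein stability. A random forest algorithm based on protein network topological features performs well for protein single-point mutation...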

16.
De novo design of artificial proteins is an essential approach to elucidate the principles of protein architecture, to understand specific functions of natural proteins, and to yield novel molecules for medical and industrial aims. We have designed artificial sequences of 153 amino acids to fit the main-chain framework of the sperm whale myoglobin structure, based on knowledge-based energy functions that evaluate the compatibility between protein tertiary structures and amino acid sequences. The synthesized artificial globins bind a single heme per protein molecule as designed, and show well-defined electrochemical and spectroscopic features characteristic of proteins with a low-spin heme. Redox and ligand binding reactions of the artificial heme proteins were investigated and these heme-related functions were found to vary with their structural uniqueness. Relationships between the structural and functional properties are discussed.

17.
This paper reports the local-minimum problem that arises when the current greedy algorithm is used to train the empirical potential function of protein folding on 8623 non-native structures of 31 globular proteins, together with a solution to the problem based on the simulated annealing algorithm. This simulated annealing algorithm is indispensable for developing and testing highly refined empirical potential functions.
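How simulated annealing avoids the local minima that trap a greedy search can be sketched on a toy problem. Everything below is illustrative: the one-dimensional `rugged` objective, the geometric cooling schedule, and all parameter values are assumptions, since the abstract does not describe the actual parameter space of the potential function or the schedule used.

```python
import math
import random

def simulated_annealing(energy, start, step=0.5, t_start=5.0, t_end=1e-3,
                        cooling=0.95, moves_per_temp=50, seed=0):
    """Minimize a 1-D function with Metropolis moves under geometric cooling."""
    rng = random.Random(seed)
    current, current_e = start, energy(start)
    best, best_e = current, current_e
    temperature = t_start
    while temperature > t_end:
        for _ in range(moves_per_temp):
            candidate = current + rng.uniform(-step, step)
            candidate_e = energy(candidate)
            delta = candidate_e - current_e
            # Always accept downhill moves; accept uphill moves with Boltzmann probability.
            if delta <= 0.0 or rng.random() < math.exp(-delta / temperature):
                current, current_e = candidate, candidate_e
                if current_e < best_e:
                    best, best_e = current, current_e
        temperature *= cooling
    return best, best_e

def rugged(x):
    """Toy objective with several local minima (hypothetical, for illustration only)."""
    return x * x + 3.0 * math.sin(5.0 * x)

best_x, best_energy = simulated_annealing(rugged, start=4.0)
```

A greedy search corresponds to the zero-temperature limit, where uphill moves are never accepted and the walk stops in the first basin it enters.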

18.
Prediction of protein loop conformations without any prior knowledge (ab initio prediction) is an unsolved problem. Its solution will significantly impact protein homology and template‐based modeling as well as ab initio protein‐structure prediction. Here, we developed a coarse‐grained, optimized scoring function for initial sampling and ranking of loop decoys. The resulting decoys are then further optimized in backbone and side‐chain conformations and ranked by all‐atom energy scoring functions. The final integrated technique, called loop prediction by energy‐assisted protocol, achieved a median value of 2.1 Å root mean square deviation (RMSD) for 325 12‐residue test loops and 2.0 Å RMSD for 45 12‐residue loops from critical assessment of structure‐prediction techniques (CASP) 10 target proteins with native core structures (backbone and side chains). If all side‐chain conformations in protein cores are predicted in the absence of the target loop, loop‐prediction accuracy is reduced only slightly (0.2 Å difference in RMSD for 12‐residue loops in the CASP target proteins). The accuracy obtained is an improvement of about 1 Å RMSD or more over other methods we tested. The executable file for a Linux system is freely available for academic users at http://sparks-lab.org. © 2013 Wiley Periodicals, Inc.

19.
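Classification Modeling and Recognition of Protein Fold Types   Cited by: 2 (self-citations: 0; citations by others: 2)
Liu Yue, Li Xiaoqin, Xu Haisong, Qiao Hui. Acta Physico-Chimica Sinica (《物理化学学报》), 2009, 25(12): 2558-2564
How the amino acid sequence of a protein determines its spatial structure is one of the central questions in current life science research. The fold type reflects the topological pattern of a protein's core structure, and fold recognition is an important part of research on protein sequence-structure relationships. We studied 36 fold types that cannot be modeled independently and that account for 41.8% of all proteins in the α, β and α/β classes of the Astral 1.65 sequence database. Samples with sequence identity below 25% were selected as the training set and, for each fold type, hierarchically clustered with the root-mean-square deviation (RMSD) as the criterion to generate several fold subclasses. For each subclass, a profile hidden Markov model (profile-HMM) was built from structural alignments produced by the multiple structural alignment algorithm (MUSTANG). With the 9505 samples in Astral 1.65 that have sequence identity below 95% as the test set, the average recognition sensitivity for the 36 fold types was 90%, the specificity was 99%, and the Matthews correlation coefficient (MCC) was 0.95. The results show that, for fold types with many members for which no unified model can be built, RMSD-based systematic classification modeling achieves recognition with high accuracy. This extends the methods and ideas available for protein fold recognition and lays a foundation for further research.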
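The three reported figures (sensitivity, specificity, and MCC) are assumed here to follow the standard confusion-matrix definitions, which can be sketched as below. The counts are hypothetical and are not taken from the paper; the function also assumes at least one positive and one negative sample.

```python
import math

def classification_metrics(tp, fp, tn, fn):
    """Return (sensitivity, specificity, mcc) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # fraction of true members that are recognized
    specificity = tn / (tn + fp)  # fraction of non-members that are rejected
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return sensitivity, specificity, mcc

# Hypothetical counts for one fold type; not taken from the paper.
sens, spec, mcc = classification_metrics(tp=90, fp=10, tn=990, fn=10)
```

Unlike sensitivity and specificity taken separately, the MCC stays informative when positives are much rarer than negatives, as they are when one fold type is tested against all others.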

20.