Similar literature
 20 similar articles found (search time: 15 ms)
1.
Graphical representation of molecular conformations is an important tool used by chemists to gain molecular insight. In spite of today's enhanced computer graphics there are still situations, such as in multiple conformation displays, in which standard visualization techniques are limited. Parallel-coordinate (‖-coords) representation, which was originally developed for visualizing multivariate datasets in fields other than chemistry, offers an alternative basis for graphical representation of molecular structures. In parallel-coordinates, the axes are drawn parallel rather than perpendicular to each other, allowing many axes to be placed and seen. This mapping procedure has unique geometric properties and useful relationships to the original space. In this article, we apply the parallel-coordinate representation for presenting peptide and protein structural conformations. In particular, we demonstrate the usefulness of parallel-coordinates in the context of conformational analysis where this representation, combined with multiple filters, allows nontrivial clustering of data points, leading to new observations. The ‖-coords representation is also demonstrated as a tool for two-dimensional (2D) representation of protein secondary structure and for identification of disulfide-bonded pairs in protein structures. Regardless of the application, an advantage of the ‖-coords approach is that it retains its inherent simplicity and ease of use, and requires little or no software development. © 1997 John Wiley & Sons, Inc. J Comput Chem 18 : 1893–1902, 1997
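
The mapping described in this abstract can be illustrated with a minimal sketch. It is not the authors' software; the function name `to_parallel_coords`, the axis spacing, and the example dihedral angles are assumptions made here for illustration. Each N-dimensional point becomes a polyline whose i-th vertex lies on the i-th vertical axis.

```python
def to_parallel_coords(point, mins, maxs):
    """Map one N-dimensional point to the (x, y) vertices of its polyline.

    Axis i is drawn as a vertical line at x = i (an assumed layout); the
    value on that axis is rescaled to [0, 1] using the per-axis range.
    """
    vertices = []
    for i, (value, lo, hi) in enumerate(zip(point, mins, maxs)):
        span = hi - lo
        # A degenerate axis (zero range) is drawn at mid-height.
        y = 0.5 if span == 0 else (value - lo) / span
        vertices.append((i, y))
    return vertices


# Hypothetical dihedral angles (degrees) for one conformation.
angles = [-60.0, -45.0, 180.0]
polyline = to_parallel_coords(angles, mins=[-180.0] * 3, maxs=[180.0] * 3)
```

Drawing one such polyline per conformation on shared axes gives the multi-conformation display; filtering then amounts to keeping only the polylines that cross chosen intervals on chosen axes.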

2.
The agglomerative clustering methods and the tests usually applied to evaluate the significance of clusters are critically evaluated. Many clustering techniques can provide erroneous information about the existence of clusters. The single linkage technique is suggested to identify natural, well separated clusters. The existing statistical tests on the significance of clusters are not satisfactory. A new statistical test, based on the distribution of the distances between the objects and their first nearest neighbor, is presented. The performance of the test is compared with that of the Sneath test and of the variance-ratio test on some artificial and real data sets.

3.
Two traditional clustering algorithms are applied to configurations from a long molecular dynamics trajectory and compared using two sets of test data. First, a subset of atoms was chosen to present conformations which naturally fall into a number of clusters. Second, a subset of atoms was selected to span a relatively continuous region of conformational space rather than form discrete conformational classes. Of the two algorithms used, the single linkage method is inappropriate for this kind of data. The divisive hierarchical method, based on minimizing the difference between cluster centroids and extrema, is successful but also prone to imposing clustering hierarchy where none can be justified. © 1994 by John Wiley & Sons, Inc.

4.
We present an efficient density‐based adaptive‐resolution clustering method APLoD for analyzing large‐scale molecular dynamics (MD) trajectories. APLoD performs the k‐nearest‐neighbors search to estimate the density of MD conformations in a local fashion, which can group MD conformations in the same high‐density region into a cluster. APLoD greatly improves the popular density peaks algorithm by reducing the running time and the memory usage by 2–3 orders of magnitude for systems ranging from alanine dipeptide to a 370‐residue Maltose‐binding protein. In addition, we demonstrate that APLoD can produce clusters with various sizes that are adaptive to the underlying density (i.e., larger clusters at low‐density regions, while smaller clusters at high‐density regions), which is a clear advantage over other popular clustering algorithms including k‐centers and k‐medoids. We anticipate that APLoD can be widely applied to split ultra‐large MD datasets containing millions of conformations for subsequent construction of Markov State Models. © 2016 Wiley Periodicals, Inc.
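
The idea of estimating density locally from a k-nearest-neighbors search can be sketched as follows. This is a generic illustration and not APLoD's actual estimator: the inverse k-th-neighbor distance, the brute-force search, the function name `knn_density`, and the toy points are all assumptions made here.

```python
import math


def knn_density(points, k):
    """Estimate a local density for each point as 1 / (distance to its k-th nearest neighbor)."""
    densities = []
    for i, p in enumerate(points):
        # Brute-force neighbor search; a real MD data set would need a spatial index.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        kth = dists[k - 1]
        densities.append(float("inf") if kth == 0 else 1.0 / kth)
    return densities


# Toy 2D data: three tightly packed points and one isolated point.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0)]
densities = knn_density(points, k=2)
```

The isolated point receives a much lower density than the packed ones, which is the signal a density-based method uses to separate high-density regions from sparse ones.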

5.
Clustering methods have been widely used to group together similar conformational states from molecular simulations of biomolecules in solution. For applications such as the interaction of a protein with a surface, the orientation of the protein relative to the surface is also an important clustering parameter because of its potential effect on adsorbed‐state bioactivity. This study presents cluster analysis methods that are specifically designed for systems where both molecular orientation and conformation are important, and the methods are demonstrated using test cases of adsorbed proteins for validation. Additionally, because cluster analysis can be a very subjective process, an objective procedure for identifying both the optimal number of clusters and the best clustering algorithm to be applied to analyze a given dataset is presented. The method is demonstrated for several agglomerative hierarchical clustering algorithms used in conjunction with three cluster validation techniques. © 2016 Wiley Periodicals, Inc.

6.
The applicability of potential functions in unsupervised pattern recognition is demonstrated on the basis of a new clustering technique called CLUPOT. CLUPOT is a centrotype sorting technique, which means that for each detected cluster of objects a representative object can be selected. CLUPOT uses a reliability curve which permits the detection of significant clusters. Applications to four data sets (Kowalski's archeological artefact data, Ruspini's fuzzy set data, Fisher's Iris data and a part of Esbensen's meteorite data) show that CLUPOT yields significant clusterings.

7.
This paper describes the first application of fuzzy c-means clustering for the selection of representatives from assemblies of conformations or alignments. In the case of alignments, their quality is taken into account using a weighted c-means scheme, developed in this work. The performance of fuzzy cluster validity measures, such as compactness, partition function, and entropy, is studied on several examples, but the visual 3D representation of data points is shown to be most beneficial in determining the optimum number of clusters. Fuzzy clustering is expected to perform better than crisp clustering methods in cases where there are a significant number of "outliers", such as in molecular dynamics simulations and molecular alignments.
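
For readers unfamiliar with fuzzy c-means, the standard membership update can be sketched as below. This shows only the textbook, unweighted formula; it does not reproduce the weighted scheme developed in the paper, and the function name `fcm_memberships`, the fuzzifier value m = 2, and the toy points are assumptions made here.

```python
import math


def fcm_memberships(points, centers, m=2.0):
    """Return u[j][i], the membership of point j in cluster i (each row sums to 1)."""
    exponent = 2.0 / (m - 1.0)
    memberships = []
    for p in points:
        dists = [math.dist(p, c) for c in centers]
        if 0.0 in dists:
            # A point lying exactly on a center belongs fully to that center.
            hits = [1.0 if d == 0.0 else 0.0 for d in dists]
            row = [h / sum(hits) for h in hits]
        else:
            # Standard update: u_i = 1 / sum_k (d_i / d_k)^(2 / (m - 1)).
            row = [1.0 / sum((di / dk) ** exponent for dk in dists) for di in dists]
        memberships.append(row)
    return memberships


centers = [(0.0, 0.0), (4.0, 0.0)]
points = [(1.0, 0.0), (2.0, 0.0), (10.0, 0.0)]
u = fcm_memberships(points, centers)
```

A point midway between two centers gets membership 0.5 in each, and a distant outlier receives graded memberships rather than a hard assignment, which is the property the abstract contrasts with crisp clustering.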

8.
This work describes the application of a Bayesian method for clustering protein conformations sampled during a molecular dynamics simulation of the HIV-1 integrase catalytic core. A clustering analysis is carried out under the assumption of normal distribution without fixing the number of clusters in advance. Some performance measures, such as posterior probability and class cross entropy, are used to determine the most probable set of clusters. The Bayesian clustering method results in meaningful groups identifying transitions between conformational ensembles. The dihedral angles involved in such transitions are also examined in detail. The conformations in high dimensional space are projected into 3D space employing a multidimensional scaling technique to provide a visual inspection.

9.
We present a new strategy for analyzing imaging time‐of‐flight SIMS data sets affected by detector saturation. Rather than attempt to correct the measured data to remove saturation, we incorporate the detector behavior into the statistical basis of the analysis. This is performed within the framework of maximum a posteriori reconstruction. The proposed approach has several advantages over previous techniques. No approximations are involved other than the assumed model of the detector. The method performs well even when applied to highly saturated and/or single‐scan data sets. It is statistically rigorous, correctly treating the underlying statistical distribution of the data. It is also compatible with Bayesian methods for incorporating prior knowledge about sample properties. An efficient iterative scheme for solving the proposed equations is presented for the case of the bilinear model commonly used in analyses of SIMS data. The correctness of the approach and its efficacy are demonstrated on synthetic data sets. The method is found to perform better than a widely‐used data‐correction method used in combination with alternating‐least‐squares Multivariate Curve Resolution analysis. Copyright © 2015 John Wiley & Sons, Ltd.

10.
We present a systematic study on the reliability of different theoretical methods to represent the molecular electrostatic potential (MEP), and MEP-derived properties of prototypical compounds containing phosphorus, sulfur and chlorine. Calculations at the Hartree-Fock and Møller-Plesset up to fourth-order level of theory, as well as local, non-local and hybrid density functional computations were performed for a representative set of neutral molecules. The study was carried out using different basis sets ranging from the medium-sized 6-31G(d) to the large 6-31G(2d,2p) basis set, but in some test calculations more extended basis sets were also considered. The analysis of the results was performed discussing separately the effect of the basis set and of the level of theory used to determine the molecular wavefunction on the reliability of the MEP and MEP-derived properties. Received: 4 March 1997 / Accepted: 27 June 1997

11.
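A protein classification method based on molecular recognition was developed. Using data-mining strategies and statistical clustering, and drawing on binding-mode feature data for coenzyme A (CoA) binding proteins, the classification accuracy of several classification methods on this system was compared and analyzed, a methodological study of the classification of this important class of in vivo proteins was carried out, and the two-step clustering method was selected as optimal. This work designed and established a classification parameter that concisely and effectively evaluates the significance and importance of each binding feature, and used it as the basis for selecting the decisive feature variables from all features. The three classes of CoA-binding proteins obtained all show pronounced hydrogen-bonding and hydrophobic binding features; CoA can form hydrogen bonds with several amino acid residues that are key to biological activity. The commonalities among these interactions and the differences between the classes reveal subtle differences in binding mode when a ligand interacts with different receptors, and provide useful reference and guidance for the design of selective modulators that target CoA-binding proteins.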

12.
Summary: Display methods, such as principal component analysis, and clustering methods were applied to a sample of cholecystokinin (sulfated CCK8) conformations obtained from a Monte Carlo simulation. It is shown that six families of conformations can entirely describe the sample. Each family represents a typical conformer. These theoretical models are in agreement with recent experimental results which stress the predominance of folded conformers in aqueous medium.

13.
In this paper, the performance of new clustering methods such as Neural Gas (NG) and Growing Neural Gas (GNG) is compared with the K-means method for real and simulated data sets. Moreover, a new algorithm called growing K-means, GK, is introduced as the alternative to Neural Gas and Growing Neural Gas. It has small input requirements and is conceptually very simple. The GK leads to nearly optimal values of the cost function, and, contrary to K-means, it is independent of the initial data set partition. The incremental property of GK additionally helps to estimate the number of "natural" clusters in data, i.e., the well-separated groups of objects in the data space.

14.
The (TiO2)n clusters and their anions for n = 1-4 have been studied with coupled cluster theory [CCSD(T)] and density functional theory (DFT). For n > 1, numerous conformations are located for both the neutral and anionic clusters, and their relative energies are calculated at both the DFT and CCSD(T) levels. The CCSD(T) energies are extrapolated to the complete basis set limit for the monomer and dimer and calculated up to the triple-zeta level for the trimer and tetramer. The adiabatic and vertical electron detachment energies of the anionic clusters to the ground and first excited states of the neutral clusters are calculated at both levels and compared with the experimental results. The comparison allows for the definitive assignment of the ground-state structures of the anionic clusters. Anions of the dimer and tetramer are found to have very closely lying conformations within 2 kcal/mol at the CCSD(T) level, whereas that of the trimer does not. In addition, accurate clustering energies and heats of formation are calculated for the neutral clusters and compared with the available experimental data. Estimates of the titanium-oxygen bond energies show that they are stronger than the group VIB transition metal-oxygen bonds except for tungsten. The atomization energies of these clusters display much stronger basis set dependence than the clustering energies. This allows the calculation of more accurate heats of formation for larger clusters on the basis of calculated clustering energies.

15.
Four different two-dimensional fingerprint types (MACCS, Unity, BCI, and Daylight) and nine methods of selecting optimal cluster levels from the output of a hierarchical clustering algorithm were evaluated for their ability to select clusters that represent chemical series present in some typical examples of chemical compound data sets. The methods were evaluated using Ward's clustering algorithm on subsets of the publicly available National Cancer Institute HIV data set, as well as with compounds from our corporate data set. We make a number of observations and recommendations about the choice of fingerprint type and cluster level selection methods for use in this type of clustering.

16.
A new approach for developing basis sets to be used with effective core potentials is systematically studied. The behavior of the LCAO coefficients versus the ln(α) of the respective primitives can provide simple guidelines to establish the range over which the basis set should be developed or modified, especially when using effective core potentials. Double-zeta basis sets were modeled for the SBK pseudopotential from all-electron basis sets for a series of compounds containing elements of the second period of the periodic table. Application of the modeled basis sets at the Hartree–Fock and MP2 levels of theory shows that the new method provides molecular properties as accurate as those calculated by all-electron calculations. © 1997 John Wiley & Sons, Inc. J Comput Chem 18 : 1918–1929, 1997

17.
Two algorithms are introduced that show exceptional promise in finding molecular conformations using distance geometry on nuclear magnetic resonance data. The first algorithm is a gradient version of the majorization algorithm from multidimensional scaling. The main contribution is a large decrease in CPU time. The second algorithm is an iterative algorithm between possible conformations obtained from the first algorithm and permissible data points near the configuration. These ideas are similar to alternating least squares or alternating projections on convex sets. The iterations significantly improve the conformation from the first algorithm when applied to the small peptide E. coli STh enterotoxin. © 1993 John Wiley & Sons, Inc.

18.
Pharmacophore modeling of large, drug-like molecules, such as the dopamine reuptake inhibitor GBR 12909, is complicated by their flexibility. A comprehensive hierarchical clustering study of two GBR 12909 analogs was performed to identify representative conformers for input to three-dimensional quantitative structure–activity relationship studies of closely-related analogs. Two data sets of more than 700 conformers each produced by random search conformational analysis of a piperazine and a piperidine GBR 12909 analog were studied. Several clustering studies were carried out based on different feature sets that include the important pharmacophore elements. The distance maps, the plot of the effective number of clusters versus actual number of clusters, and the novel derived clustering statistic, percentage change in the effective number of clusters, were shown to be useful in determining the appropriate clustering level. Six clusters were chosen for each analog, each representing a different region of the torsional angle space that determines the relative orientation of the pharmacophore elements. Conformers of each cluster that are representative of these regions were identified and compared for each analog. This study illustrates the utility of using hierarchical clustering for the classification of conformers of highly flexible molecules in terms of the three-dimensional spatial orientation of key pharmacophore elements.

19.
20.
A complete vibrational analysis of the Fourier transform (FT) infrared (IR) and FT‐Raman spectra of both molecules was carried out using quantum chemical calculations. The structures of phenothiazine (PTZ) and N‐methylphenothiazine (N‐MePTZ) were studied by semiempirical and ab initio methods. Different basis sets and two new procedures for scaling the frequencies of the ring modes were used. Vibrational data of the methyl group in N‐MePTZ were interpreted in terms of the different molecular conformations in the solid state. The 1H‐ and 13C–nuclear magnetic resonance (NMR) data were interpreted in terms of the electron densities on the atoms and the stacking solute–solute association in dimethyl sulfoxide solution. Chemical shifts were related to the Merz‐Kollman atomic charges. © 2002 Wiley Periodicals, Inc. Int J Quantum Chem, 2002
