Similar literature
 Found 20 similar articles; search took 62 ms
1.
A spectral clustering method is presented and applied to two-dimensional molecular structures, where it has been found particularly useful in the analysis of screening data. The method provides a means to quantify (1) the degree of intermolecular similarity within a cluster and (2) the contribution that the features of a molecule make to a cluster. In an application of the spectral clustering method to an example data set of 125 COX-2 inhibitors, these two criteria were used to place the molecules into clusters of chemically related two-dimensional structures.

2.
Three-dimensional (3D)-database searches are now being widely applied to determine potential new active molecules. Many structural data sets obtained as a result of these searches are still large in size. In this paper we apply molecular similarity calculations as a rapid method to screen two such data sets. In the first investigation, synthetic candidates, produced as a result of a tendamistat β-turn mimic search, were tested for their ability to imitate the β-turn backbone. In the second study, structures extracted through a histamine pharmacophore query search were examined on the basis of their electronic similarity to histamine. Molecular similarity is shown to provide a rapid means of gaining insight into the composition of molecular data sets, with possible implications for future full 3D-database searches.

3.
Producing good low‐dimensional representations of high‐dimensional data is a common and important task in many data mining applications. Two methods that have been particularly useful in this regard are multidimensional scaling and nonlinear mapping. These methods attempt to visualize a set of objects described by means of a dissimilarity or distance matrix on a low‐dimensional display plane in a way that preserves the proximities of the objects to whatever extent is possible. Unfortunately, most known algorithms are of quadratic order, and their use has been limited to relatively small data sets. We recently demonstrated that nonlinear maps derived from a small random sample of a large data set exhibit the same structure and characteristics as those of the entire collection, and that this structure can be easily extracted by a neural network, making possible the scaling of data sets orders of magnitude larger than those accessible with conventional methodologies. Here, we present a variant of this algorithm based on local learning. The method employs a fuzzy clustering methodology to partition the data space into a set of Voronoi polyhedra, and uses a separate neural network to perform the nonlinear mapping within each cell. We find that this local approach offers a number of advantages, and produces maps that are virtually indistinguishable from those derived with conventional algorithms. These advantages are discussed using examples from the fields of combinatorial chemistry and optical character recognition. © 2001 John Wiley & Sons, Inc. J Comput Chem 22: 373–386, 2001
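
As a rough illustration of what "preserving the proximities" means, the sketch below scores a low-dimensional embedding against a target distance matrix. The function name `normalized_stress` and the particular normalized squared-error formula are assumptions chosen for illustration; the abstract does not say which error function the authors minimize.

```python
import math
from itertools import combinations


def normalized_stress(target_distances, embedding):
    """Return sqrt(sum (d_ij - delta_ij)^2 / sum delta_ij^2) over pairs i < j.

    target_distances: square matrix of original dissimilarities delta_ij.
    embedding: list of low-dimensional coordinates, one tuple per object.
    """
    numerator = 0.0
    denominator = 0.0
    for i, j in combinations(range(len(embedding)), 2):
        d = math.dist(embedding[i], embedding[j])  # distance on the map
        delta = target_distances[i][j]             # distance in the data
        numerator += (d - delta) ** 2
        denominator += delta ** 2
    return math.sqrt(numerator / denominator)


# Three objects that lie on a line at positions 0, 1 and 2.
target = [[0.0, 1.0, 2.0],
          [1.0, 0.0, 1.0],
          [2.0, 1.0, 0.0]]
exact_map = [(0.0,), (1.0,), (2.0,)]     # reproduces every distance
squashed_map = [(0.0,), (0.5,), (1.0,)]  # halves every distance

stress_exact = normalized_stress(target, exact_map)
stress_squashed = normalized_stress(target, squashed_map)
```

The loop visits every pair of objects, which is the quadratic cost the abstract refers to; a value of zero means the map reproduces all target distances.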

5.
γ‐Secretase inhibitors have been explored for the prevention and treatment of Alzheimer's disease (AD). Methods for predicting and screening γ‐secretase inhibitors are highly desirable for facilitating the design of novel therapeutic agents against AD, especially given the incomplete knowledge of the mechanism and three‐dimensional structure of γ‐secretase. We explored two machine learning methods, support vector machine (SVM) and random forest (RF), to develop models for predicting γ‐secretase inhibitors of diverse structures. Quantitative analysis of the receiver operating characteristic (ROC) curve was performed to further examine and optimize the models. In particular, the Youden index (YI) was introduced into the ROC curve of RF to obtain an optimal probability threshold for prediction. The developed models were validated on an external test set, with prediction accuracies for SVM and RF of 96.48% and 98.83% for γ‐secretase inhibitors and 98.18% and 99.27% for noninhibitors, respectively. Different feature selection methods were used to extract the physicochemical features most relevant to γ‐secretase inhibition. To the best of our knowledge, the RF model developed in this work is the first model with a broad applicability domain; based on it, virtual screening of γ‐secretase inhibitors against the ZINC database was performed, resulting in 368 potential hit candidates. © 2009 Wiley Periodicals, Inc. J Comput Chem, 2010
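
The threshold selection mentioned in this abstract can be sketched as follows. The Youden index is J = sensitivity + specificity − 1, and the chosen threshold is the one that maximizes J. The function name `youden_threshold` and the toy scores are hypothetical; this is a minimal sketch and not the authors' implementation.

```python
def youden_threshold(scores, labels):
    """Return (threshold, J) maximizing J = sensitivity + specificity - 1.

    scores: predicted probabilities of the positive class.
    labels: 1 for positives (e.g. inhibitors), 0 for negatives.
    A sample is predicted positive when its score is >= threshold.
    """
    positives = sum(1 for y in labels if y == 1)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both classes must be present")

    best_threshold, best_j = None, -1.0
    for t in sorted(set(scores)):  # every observed score is a candidate
        true_pos = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        true_neg = sum(1 for s, y in zip(scores, labels) if s < t and y == 0)
        j = true_pos / positives + true_neg / negatives - 1.0
        if j > best_j:  # strict comparison keeps the lowest tied threshold
            best_threshold, best_j = t, j
    return best_threshold, best_j


# Toy data (made up for illustration only).
scores = [0.10, 0.35, 0.40, 0.80, 0.65, 0.90]
labels = [0, 0, 1, 1, 0, 1]
threshold, j_value = youden_threshold(scores, labels)
```

With these toy values two thresholds tie for the best J; the strict comparison keeps the lower one, which favors sensitivity over specificity.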

7.
A fuzzy c-means clustering algorithm is presented which is much faster than the traditional algorithm for data sets in which the number of features is significantly larger than the number of feature vectors. The algorithm is constructed by utilizing the covariance structure of feature vectors and cluster centers. By using results from a previous clustering, modified versions of the new algorithm achieve additional reductions in floating point operations. © 1995 by John Wiley & Sons, Inc.
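
For orientation, the traditional fuzzy c-means iteration that this abstract uses as its baseline can be sketched as below. This is the standard textbook algorithm, not the accelerated covariance-based variant the paper presents; the function name, the fuzzifier default `m=2.0`, and the toy points are assumptions for illustration.

```python
def fuzzy_c_means(points, centers, m=2.0, iterations=50):
    """Run standard fuzzy c-means from the given initial centers.

    Returns (centers, memberships). memberships[i][k] is the degree to
    which point i belongs to cluster k, computed in the last membership
    update (that is, before the final center update).
    """
    dim = len(points[0])
    exponent = 2.0 / (m - 1.0)
    memberships = []
    for _ in range(iterations):
        # Membership update: u_ik = 1 / sum_j (d_ik / d_ij)^(2 / (m - 1)).
        memberships = []
        for p in points:
            dists = [sum((a - b) ** 2 for a, b in zip(p, c)) ** 0.5
                     for c in centers]
            if 0.0 in dists:
                # The point sits exactly on a center: full membership there.
                hit = dists.index(0.0)
                row = [1.0 if k == hit else 0.0 for k in range(len(centers))]
            else:
                row = [1.0 / sum((dk / dj) ** exponent for dj in dists)
                       for dk in dists]
            memberships.append(row)
        # Center update: mean of the points weighted by u_ik^m.
        new_centers = []
        for k in range(len(centers)):
            weights = [row[k] ** m for row in memberships]
            total = sum(weights)
            new_centers.append([
                sum(w * p[d] for w, p in zip(weights, points)) / total
                for d in range(dim)
            ])
        centers = new_centers
    return centers, memberships


# Two well-separated toy groups (made up for illustration only).
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centers, memberships = fuzzy_c_means(points, [[0.0, 0.0], [10.0, 10.0]])
```

Each iteration computes a distance between every point and every center across all features, so the per-iteration cost grows with the number of features; that is the regime the abstract says its algorithm speeds up.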

10.
In quantum chemistry one needs expansions of orbitals and operators, defined with respect to one origin, about another origin. Because there is no straightforward method of obtaining such expansions, it is helpful to interpret them as translations of fields. The connection between translations and rotations of fields and the transformations of functions is considered. Of special physical interest are expansions in spherical harmonics, which have the form of an addition theorem. General properties of such expansions and possible methods to derive them are discussed.

15.
Spectra-structure relationships were investigated for estimating the anomeric configuration, residues and type of linkages of linear and branched trisaccharides using 13C-NMR chemical shifts. For this study, 119 pyranosyl trisaccharides were used that are trimers of the α or β anomers of D-glucose, D-galactose, D-mannose, L-fucose or L-rhamnose residues bonded through α or β glycosidic linkages of types 1→2, 1→3, 1→4, or 1→6, as well as methoxylated and/or N-acetylated amino trisaccharides. Machine learning experiments were performed for: (1) classification of the anomeric configuration of the first unit, second unit and reducing end; (2) classification of the type of first and second linkages; (3) classification of the three residues: reducing end, middle and first residue; and (4) classification of the chain type. Our previous model for predicting the structure of disaccharides was incorporated into this new model, improving its predictive power. The best results were achieved using Random Forests with 204 di- and trisaccharides for the training set; it could correctly classify 83%, 90%, 88%, 85%, 85%, 75%, 79%, 68% and 94% of the test set (69 compounds) for the nine tasks, respectively, on the basis of unassigned chemical shifts.

17.
We present a detailed comparison of computational efficiency and precision for several free energy difference (ΔF) methods. The analysis includes both equilibrium and nonequilibrium approaches, and distinguishes between unidirectional and bidirectional methodologies. We are primarily interested in comparing two recently proposed approaches, adaptive integration and single-ensemble path sampling, to more established methodologies. As test cases, we study relative solvation free energies of large changes to the size or charge of a Lennard-Jones particle in explicit water. The results show that, for the systems used in this study, both adaptive integration and path sampling offer unique advantages over the more traditional approaches. Specifically, adaptive integration is found to provide very precise long-simulation ΔF estimates as compared to other methods used in this report, while also offering rapid estimation of ΔF. The results demonstrate that the adaptive integration approach is the best overall method for the systems studied here. The single-ensemble path sampling approach is found to be superior to ordinary Jarzynski averaging for the unidirectional, "fast-growth" nonequilibrium case. Closer examination of the path sampling approach on a two-dimensional system suggests it may be the overall method of choice when conformational sampling barriers are high. However, it appears that the free energy landscapes for the systems used in this study have rather modest configurational sampling barriers.
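
The "ordinary Jarzynski averaging" used as a baseline in this abstract can be sketched as follows: ΔF = −kT ln⟨exp(−W/kT)⟩, averaged over the work values W from repeated unidirectional switching runs. The function name, the reduced units (kT = 1), and the sample work values are assumptions for illustration; the adaptive integration and path sampling methods the paper compares are not reproduced here.

```python
import math


def jarzynski_free_energy(work_values, kT=1.0):
    """Estimate dF = -kT * ln( mean( exp(-W / kT) ) ) from work values.

    Uses a log-sum-exp shift so that large |W / kT| does not overflow
    or underflow the exponential.
    """
    scaled = [-w / kT for w in work_values]
    shift = max(scaled)
    mean_exp = sum(math.exp(s - shift) for s in scaled) / len(scaled)
    return -kT * (shift + math.log(mean_exp))


# Hypothetical work values from three fast-growth runs, in units of kT.
work = [1.0, 2.0, 3.0]
delta_f = jarzynski_free_energy(work)
mean_work = sum(work) / len(work)

# Sanity case: if every run needs the same work, dF equals that work.
delta_f_constant = jarzynski_free_energy([2.0, 2.0, 2.0])
```

The exponential average is dominated by the rare low-work runs, so the estimate never exceeds the mean work and converges slowly when the work distribution is broad.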
