Similar literature
Found 20 similar articles (search time: 125 ms)
1.
Similarity-based methods for virtual screening are widely used. However, conventional searching using 2D chemical fingerprints or 2D graphs may retrieve only compounds which are structurally very similar to the original target molecule. Of particular current interest then is scaffold hopping, that is, the ability to identify molecules that belong to different chemical series but which could form the same interactions with a receptor. Reduced graphs provide summary representations of chemical structures and, therefore, offer the potential to retrieve compounds that are similar in terms of their gross features rather than at the atom-bond level. Using only a fingerprint representation of such graphs, we have previously shown that actives retrieved were more diverse than those found using Daylight fingerprints. Maximum common substructures give an intuitively reasonable view of the similarity between two molecules. However, their calculation using graph-matching techniques is too time-consuming for use in practical similarity searching in larger data sets. In this work, we exploit the low cardinality of the reduced graph in graph-based similarity searching. We reinterpret the reduced graph as a fully connected graph using the bond-distance information of the original graph. We describe searches, using both the maximum common induced subgraph and maximum common edge subgraph formulations, on the fully connected reduced graphs and compare the results with those obtained using both conventional chemical and reduced graph fingerprints. We show that graph matching using fully connected reduced graphs is an effective retrieval method and that the actives retrieved are likely to be topologically different from those retrieved using conventional 2D methods.

2.
We present an efficient method to cluster large chemical databases in a stepwise manner. Databases are first clustered with an extended exclusion sphere algorithm based on Tanimoto coefficients calculated from Daylight fingerprints. Substructures are then extracted from clusters by iterative application of a maximum common substructure algorithm. Clusters with common substructures are merged through a second application of an exclusion sphere algorithm. In a separate step, singletons are compared to cluster substructures and added to a cluster if similarity is sufficiently high. The method identifies tight clusters with conserved substructures and generates singletons only if structures are truly distinct from all other library members. The method has successfully been applied to identify the most frequently occurring scaffolds in databases, for the selection of analogues of screening hits and in the prioritization of chemical libraries offered by commercial vendors.
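
The first clustering step described above can be sketched roughly as follows. This is a minimal illustration of plain sphere exclusion with Tanimoto coefficients, not the "extended" variant of the abstract, whose details are not given here. Fingerprints are assumed to be sets of on-bit positions, and the names `tanimoto`, `sphere_exclusion`, the molecule identifiers, and the 0.5 threshold are all hypothetical.

```python
from typing import Dict, FrozenSet, List


def tanimoto(a: FrozenSet[int], b: FrozenSet[int]) -> float:
    """Tanimoto coefficient of two fingerprints given as sets of on-bit positions."""
    union = len(a | b)
    # Two empty fingerprints are treated as identical (an assumption).
    return len(a & b) / union if union else 1.0


def sphere_exclusion(fps: Dict[str, FrozenSet[int]], threshold: float) -> List[List[str]]:
    """Plain sphere exclusion: take the first unassigned compound as a centre,
    assign every unassigned compound within the similarity threshold to its
    cluster, remove them, and repeat until nothing is left."""
    remaining = list(fps)  # keeps the insertion order of the input dict
    clusters: List[List[str]] = []
    while remaining:
        centre = remaining[0]
        members = [m for m in remaining if tanimoto(fps[centre], fps[m]) >= threshold]
        clusters.append(members)  # the centre is always a member (similarity 1.0)
        remaining = [m for m in remaining if m not in members]
    return clusters


# Hypothetical toy fingerprints; real Daylight fingerprints are much longer bit strings.
fingerprints = {
    "mol_a": frozenset({1, 2, 3, 4}),
    "mol_b": frozenset({1, 2, 3, 5}),
    "mol_c": frozenset({10, 11, 12}),
    "mol_d": frozenset({10, 11, 13}),
    "mol_e": frozenset({20, 21}),
}
clusters = sphere_exclusion(fingerprints, threshold=0.5)
print(clusters)
```

In this sketch the result depends on the order in which compounds are visited; `mol_e` ends up as a singleton because it shares no bits with any other compound.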

3.
A number of modeling and simulation algorithms using internal coordinates rely on hierarchical representations of molecular systems. Given the potentially complex topologies of molecular systems, though, automatically generating such hierarchical decompositions may be difficult. In this article, we present a fast general algorithm for the complete construction of a hierarchical representation of a molecular system. This two-step algorithm treats the input molecular system as a graph in which vertices represent atoms or pseudo-atoms, and edges represent covalent bonds. The first step contracts all cycles in the input graph. The second step builds an assembly tree from the reduced graph. We analyze the complexity of this algorithm and show that the first step is linear in the number of edges in the input graph, whereas the second one is linear in the number of edges in the graph without cycles, but dependent on the branching factor of the molecular graph. We demonstrate the performance of our algorithm on a set of specifically tailored difficult cases as well as on a large subset of molecular graphs extracted from the protein data bank. In particular, we experimentally show that both steps behave linearly in the number of edges in the input graph (the branching factor is fixed for the second step). Finally, we demonstrate an application of our hierarchy construction algorithm to adaptive torsion-angle molecular mechanics.

4.
Stereochemistry deals primarily with distinctions based on rigid geometry, e.g., bond angles and lengths. But some chemical species have molecular graphs (such as knots, catenanes, and the nonplanar graphs K5 and K3,3) that reside in space in a topologically nontrivial way. For such molecules there is hope of using topological methods to gain chemical information. Viewing a molecular graph as a topological object in space makes it unrealistically flexible; but if one proves that a certain graph is “topologically chiral” or that two graphs are “topological diastereomers,” then one has ruled out interconversion under any physical conditions for which the molecular graph still makes sense. In this paper, we consider several kinds of topological questions one might ask about graphs in space, the methodology and results available, and specific topological properties of various molecules.

5.
A variety of computational models have been introduced recently that are based on the properties of DNA. In particular, branched junction molecules and graph-like DNA structures have been proposed as computational devices, although such models have yet to be confirmed experimentally. DNA branched junction molecules have been used previously to form graph-like three-dimensional DNA structures, such as a cube and a truncated octahedron, but these DNA constructs represent regular graphs, where the connectivities of all of the vertexes are the same. Here, we demonstrate the construction of an irregular DNA graph structure by a single step of self-assembly. A graph made of five vertexes and eight edges was chosen for this experiment. DNA branched junction molecules represent the vertexes, and duplex molecules represent the edges; in contrast to previous work, specific edge molecules are included as components. We demonstrate that the product is a closed cyclic single-stranded molecule that corresponds to a double cover of the graph and that the DNA double helix axes represent the designed graph. The correct assembly of the target molecule has been demonstrated unambiguously by restriction analysis.

6.
The generating function of the sequence counting the number of graph vertices at a given distance from the root is called the spherical growth function of the rooted graph. The vertices farthest from the root form an induced subgraph called the distance-residual graph. These mathematical notions are applied to benzenoid graphs which are used in graph theory to represent benzenoid hydrocarbons. An algorithm for calculating the growth in catacondensed benzenoids is presented, followed by some examples.
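
The two definitions above can be illustrated with a short breadth-first search. This is only a sketch of the definitions themselves, not the algorithm for catacondensed benzenoids that the abstract refers to. The function names and the vertex numbering of the naphthalene carbon skeleton are assumptions made for illustration.

```python
from collections import deque


def adjacency(edges):
    """Build an undirected adjacency list from a list of edges."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, []).append(u)
    return adj


def distances_from(adj, root):
    """Breadth-first search distances from the root to every reachable vertex."""
    dist = {root: 0}
    queue = deque([root])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist


def growth_coefficients(adj, root):
    """Coefficient k of the spherical growth function: number of vertices at distance k."""
    dist = distances_from(adj, root)
    coeffs = [0] * (max(dist.values()) + 1)
    for d in dist.values():
        coeffs[d] += 1
    return coeffs


def distance_residual(adj, root):
    """Subgraph induced by the vertices farthest from the root."""
    dist = distances_from(adj, root)
    far = max(dist.values())
    keep = {v for v, d in dist.items() if d == far}
    return {v: sorted(w for w in adj[v] if w in keep) for v in sorted(keep)}


# Naphthalene carbon skeleton: two hexagons sharing the edge 4-5 (hypothetical numbering).
naphthalene = adjacency([
    (0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0),
    (4, 6), (6, 7), (7, 8), (8, 9), (9, 5),
])
print(growth_coefficients(naphthalene, 0))  # coefficients of 1, x, x^2, ...
print(distance_residual(naphthalene, 0))
```

With vertex 0 as the root, the coefficient list corresponds to the polynomial 1 + 2x + 3x^2 + 3x^3 + x^4, and the distance-residual graph is the single vertex 7.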

7.
A computer code and nonnumerical algorithm are developed to construct the edge group of a graph and to enumerate the edge colorings of graphs of chemical interest. The edge colorings of graphs have many applications in nuclear magnetic resonance (NMR), multiple quantum NMR, enumeration of structural isomers of unsaturated organic compounds, and in the construction of configurational integral expansion series in statistical mechanics. The code developed is applied to many NMR graphs, complete graphs containing up to 10 vertices, and the Petersen graph.

8.
Recently a method (RASCAL) for determining graph similarity using a maximum common edge subgraph algorithm has been proposed which has proven to be very efficient when used to calculate the relative similarity of chemical structures represented as graphs. This paper describes heuristics which simplify a RASCAL similarity calculation by taking advantage of certain properties specific to chemical graph representations of molecular structure. These heuristics are shown experimentally to increase the efficiency of the algorithm, especially at more distant values of chemical graph similarity.

9.
Multilayered cyclic fence graphs (MLCFG, E(m,n), F(m,n), D(m,n), G(m,n), X(m,n)) are defined, all of which are composed of m 2n-membered cycles with periodic bridging. They are also cubic and bipartite. The Hamilton wheel graph, H(n,[j(k)]), and the parallelogram-shaped polyhex graph are also defined. All the members of MLCFGs are found to be isomorphic to the so-called "torus benzenoid graphs", while some members of MLCFGs are found to be related to the Hamilton wheel graphs. Through the construction of the Hamilton wheel graph and the matrix representation by Kirby, a number of isomorphic relations among MLCFGs, Hamilton wheel graphs, and polyhex graphs were obtained. These relations among the MLCFG members were also found with the help of the characteristic quantities of MLCFGs.

10.
Several efficient correspondence graph-based algorithms for determining the maximum common substructure (MCS) of a pair of molecules have been published in the literature. The extension of the problem to three or more molecules is however nontrivial; heuristics used to increase the efficiency in the two-molecule case are either inapplicable to the many-molecule case or do not provide significant speedups. Our specific algorithmic contribution is two-fold. First, we show how the correspondence graph approach for the two-molecule case can be generalized to obtain an algorithm that is guaranteed to find the optimum connected MCS of multiple molecules, and that runs fast on most families of molecules using a new divide-and-conquer strategy that has hitherto not been reported in this context. Second, we provide a characterization of those compound families for which the algorithm might run slowly, along with a heuristic for speeding up computations on these families. We also extend the above algorithm to a heuristic algorithm to find the disconnected MCS of multiple molecules and to an algorithm for clustering molecules into groups, with each group sharing a substantial MCS. Our methods are flexible in that they provide exquisite control over the various matching criteria used to define a common substructure.

11.
Chemical stereographs are presented as vehicles for representing qualitative three-dimensional features of molecules that put stereochemical and conformational distinctions in a common graph-theoretic formalism. They extend the concept of a chemical graph by adding tetrads, each qualitatively characterizing the three-dimensional arrangement of four atoms with respect to its clinicity and handedness components. The characterization is sufficiently precise to distinguish synperiplanar, synclinal (gauche), anticlinal and antiperiplanar relationships between vicinal atoms of various conformers. Collectively, the tetrads constitute the embedding graph which presents new possibilities in displaying the stereochemical and conformational features of a molecule. A chemical graph and one of its possible embedding graphs constitute a chemical stereograph. Potential applications of chemical stereographs in the areas of structural representations, molecular symmetry analysis, and stereo-specific substructure searching are discussed. Part of this work was presented by M.J. at the 1989 PaciChem Mathematical Chemistry Minisymposium. The work was funded by The Upjohn Company.

12.
For acyclic systems the center of a graph has been known to be either a single vertex or two adjacent vertices, that is, an edge. It has not been quite clear how to extend the concept of graph center to polycyclic systems. Several approaches to the graph center of molecular graphs of polycyclic systems have been proposed in the literature. In most cases the alternative approaches, while apparently equally plausible, gave the same results for many molecules, but occasionally they differ in their characterization of the molecular center. In order to reduce the number of vertices that would qualify as forming the center of the graph, a hierarchy of rules has been considered in the search for graph centers. We reconsidered the problem of “the center of a graph” by using a novel concept of graph theory, the vertex “weights,” defined by counting the number of pairs of vertices at the same distance from the vertex considered. This approach often gives the same results for graph centers of acyclic graphs as the standard definition of graph center based on vertex eccentricities. However, in some cases when two nonequivalent vertices have been found as graph center, the novel approach can discriminate between the two. The same approach applies to cyclic graphs without additional rules to locate the vertex or vertices forming the center of polycyclic graphs, vertices referred to as central vertices of a graph. In addition, the novel vertex “weights,” in the case of acyclic, cyclic, and polycyclic graphs can be interpreted as vertex centralities, a measure for how close or distant vertices are from the center or central vertices of the graph. Besides illustrating the centralities of a number of smaller polycyclic graphs, we also report on several acyclic graphs showing the same centrality values of their vertices. © 2013 Wiley Periodicals, Inc.
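
For reference, the standard eccentricity-based definition of the graph center mentioned above can be sketched as follows. This is the textbook definition only; the vertex "weights" proposed in the abstract are not implemented. The function names and the carbon-skeleton examples are hypothetical, and the graph is assumed to be connected.

```python
from collections import deque


def eccentricity(adj, v):
    """Largest breadth-first search distance from v to any other vertex."""
    dist = {v: 0}
    queue = deque([v])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return max(dist.values())


def graph_center(adj):
    """Vertices of minimum eccentricity (assumes a connected graph)."""
    ecc = {v: eccentricity(adj, v) for v in adj}
    radius = min(ecc.values())
    return sorted(v for v, e in ecc.items() if e == radius)


# Carbon skeletons of n-pentane and n-butane as path graphs (hypothetical numbering).
pentane = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
butane = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
print(graph_center(pentane))  # a single central vertex
print(graph_center(butane))   # two adjacent central vertices, i.e. an edge
```

The two examples reproduce the acyclic cases named in the abstract: the center is either one vertex or a pair of adjacent vertices.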

13.
The concept of reaction route (RR) graphs introduced recently by us for kinetic mechanisms that produce minimal graphs is extended to the problem of non-minimal kinetic mechanisms for the case of a single overall reaction (OR). An RR graph is said to be minimal if all of the stoichiometric numbers in all direct RRs of the mechanism are equal to +/-1 and non-minimal if at least one stoichiometric number in a direct RR is non-unity, e.g., equal to +/-2. For a given mechanism, four unique topological characteristics of RR graphs are defined and enumerated, namely, direct full routes (FRs), empty routes (ERs), intermediate nodes (INs), and terminal nodes (TNs). These are further utilized to construct the RR graphs. One algorithm involves viewing each IN as a central node in an RR sub-graph. As a result, the construction and enumeration of RR graphs are reduced to the problem of balancing the peripheral nodes in the RR sub-graphs according to the list of FRs, ERs, INs, and TNs. An alternate method involves using an independent set of RRs to draw the RR graph while satisfying the INs and TNs. Three examples are presented to illustrate the application of non-minimal RR graph theory.

14.
A fullerene graph is a planar cubic graph all of whose faces are pentagons or hexagons. The structure of cyclic edge-cuts of fullerene graphs of sizes at most 6 is known. In this paper we study cyclic 7-edge connectivity of fullerene graphs, distinguishing between degenerate and non-degenerate cyclic edge-cuts according to the arrangement of the 12 pentagons. We prove that if there exists a non-degenerate cyclic 7-edge-cut in a fullerene graph, then the graph is a nanotube unless it is one of the two exceptions presented. We determine that there are 57 configurations of degenerate cyclic 7-edge-cuts, and we list all of them.

15.
A new approach is presented for identifying all possible cycles in graphs. Input data are the total numbers of vertices and edges, as well as the vertex adjacencies using arbitrary vertex numbering. A homeomorphically reduced graph (HRG) is constructed by ignoring vertices of degree less than three. The algorithm is based on successive generation of possible edge-combinations in the HRG. If a combination yields a cycle, it is either printed or stored and then finally printed in a list of all possible cycles arranged in the order of increasing ring size. A unique numbering of the cycle is used. The computer program is listed and exemplified. Computing times are given.

16.
As several structural proteomic projects are producing an increasing number of protein structures with unknown function, methods that can reliably predict protein functions from protein structures are urgently needed. In this paper, we present a method to explore the clustering patterns of amino acids in three-dimensional space for protein function prediction. First, amino acid residues on a protein structure are clustered into spatial groups using hierarchical agglomerative clustering, based on the distance between them. Second, the protein structure is represented using a graph, where each node denotes a cluster of amino acids. The nodes are labeled with an evolutionary profile derived from the multiple alignment of homologous sequences. Then, a shortest-path graph kernel is used to calculate similarities between the graphs. Finally, a support vector machine using this graph kernel is used to train classifiers for protein function prediction. We applied the proposed method to two separate problems, namely, prediction of enzymes and prediction of DNA-binding proteins. In both cases, the results showed that the proposed method outperformed other state-of-the-art methods.
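
The shortest-path graph kernel step above can be sketched in a simplified form. The sketch assumes unweighted graphs with discrete node labels and exact matching of labels and path lengths, whereas the abstract labels nodes with evolutionary profiles, which would call for a continuous label kernel. All names and the toy graphs are hypothetical.

```python
from collections import Counter
from itertools import combinations


def all_pairs_shortest_paths(n, edges):
    """Floyd-Warshall on an unweighted, undirected graph with vertices 0..n-1."""
    inf = float("inf")
    d = [[0 if i == j else inf for j in range(n)] for i in range(n)]
    for u, v in edges:
        d[u][v] = d[v][u] = 1
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    return d


def path_features(labels, edges):
    """Count (label, label, shortest-path length) triples over connected vertex pairs."""
    d = all_pairs_shortest_paths(len(labels), edges)
    feats = Counter()
    for i, j in combinations(range(len(labels)), 2):
        if d[i][j] != float("inf"):
            a, b = sorted((labels[i], labels[j]))  # order-independent endpoint labels
            feats[(a, b, d[i][j])] += 1
    return feats


def shortest_path_kernel(g1, g2):
    """Dot product of the two feature-count vectors (delta kernel on labels and lengths)."""
    f1, f2 = path_features(*g1), path_features(*g2)
    return sum(count * f2[key] for key, count in f1.items())


# Toy graphs given as (node labels, edge list); the labels stand in for cluster types.
g1 = (["A", "B", "A"], [(0, 1), (1, 2)])
g2 = (["A", "B"], [(0, 1)])
print(shortest_path_kernel(g1, g2), shortest_path_kernel(g1, g1))
```

A kernel matrix built this way could then be passed to any support vector machine implementation that accepts precomputed kernels.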

17.
18.
Reduced graphs provide summary representations of chemical structures. Here, a variety of different types of reduced graphs are compared in similarity searches. The reduced graphs are found to give comparable performance to Daylight fingerprints in terms of the number of active compounds retrieved. However, no one type of reduced graph is found to be consistently superior across a variety of different data sets. Consequently, a representative set of reduced graphs was chosen and used together with Daylight fingerprints in data fusion experiments. The results show improved performance in 10 out of 11 data sets compared to using Daylight fingerprints alone. Finally, the potential of using reduced graphs to build SAR models is demonstrated using recursive partitioning. An SAR model consistent with a published model is found following just two splits in the decision tree.

19.
Similarity searching using reduced graphs   (total citations: 3; self-citations: 0; citations by others: 3)
Reduced graphs provide summary representations of chemical structures. In this work, the effectiveness of reduced graphs for similarity searching is investigated. Different types of reduced graphs are introduced that aim to summarize features of structures that have the potential to form interactions with receptors while retaining the topology between the features. Similarity searches have been carried out across a variety of different activity classes. The effectiveness of the reduced graphs at retrieving compounds with the same activity as known target compounds is compared with searching using Daylight fingerprints. The reduced graphs are shown to be effective for similarity searching and to retrieve more diverse active compounds than those found using Daylight fingerprints; they thus represent a complementary similarity searching tool.

20.
Whereas the potential symmetry of a molecule may be a feature of importance in synthesis design, it is often difficult to detect visually in the structural formula. In the present article, we describe an efficient algorithm for the perception of this molecular property. We have addressed this problem in terms of graph theory and defined it as the Maximum Symmetrical Split of a molecular graph. A solution is obtained by deleting from such a graph a minimum number of edges and vertices so that the resulting subgraph consists of exactly two isomorphic connected components that correspond to a pair of synthetically equivalent synthons. To reduce the search space of the problem, we have based our algorithm on constraint satisfaction problem (CSP) techniques. In this study, we have found that the maximum symmetrical split is an original kind of constraint satisfaction problem. The algorithm has been implemented in the RESYN_Assistant system, and its performance has been tested on a set of varied molecules which were the targets of previously published synthetic studies. The results show that potential symmetry is perceived quickly and efficiently by the program. The graphical display of this perception information may help a chemist to design reflexive or highly convergent syntheses.
