Similar documents
 20 similar documents found (search time: 750 ms)
1.
Cellular functions result from intricate networks of molecular interactions, which involve not only proteins and nucleic acids but also small chemical compounds. Here we present an efficient algorithm for comparing two chemical structures of compounds, where the chemical structure is treated as a graph consisting of atoms as nodes and covalent bonds as edges. On the basis of the concept of functional groups, 68 atom types (node types) are defined for carbon, nitrogen, oxygen, and other atomic species with different environments, which has enabled detection of biochemically meaningful features. Maximal common subgraphs of two graphs can be found by searching for maximal cliques in the association graph, and we have introduced heuristics to accelerate the clique finding and to detect optimal local matches (simply connected common subgraphs). Our procedure was applied to the comparison and clustering of 9383 compounds, mostly metabolic compounds, in the KEGG/LIGAND database. The largest clusters of similar compounds were related to carbohydrates, and the clusters corresponded well to the categorization of pathways as represented by the KEGG pathway map numbers. When each pathway map was examined in more detail, finer clusters could be identified corresponding to subpathways or pathway modules containing continuous sets of reaction steps. Furthermore, it was found that the pathway modules identified by similar compound structures sometimes overlap with the pathway modules identified by genomic contexts, namely, by operon structures of enzyme genes.
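
The clique formulation mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it uses plain Bron-Kerbosch enumeration with pivoting rather than their heuristics, treats atom types as simple string labels instead of the 68 types described, and does not enforce connectivity of the match. All function names are hypothetical.

```python
from itertools import combinations


def association_graph(atoms1, bonds1, atoms2, bonds2):
    """Build the association graph of two labeled molecular graphs.

    atoms*: dict mapping node id -> atom type label.
    bonds*: set of frozenset({u, v}) covalent bonds.
    A node (a, b) pairs atoms of equal type; two nodes are adjacent when
    the pairing is consistent (both pairs bonded, or both pairs unbonded).
    """
    nodes = [(a, b) for a in atoms1 for b in atoms2 if atoms1[a] == atoms2[b]]
    adj = {n: set() for n in nodes}
    for (a1, b1), (a2, b2) in combinations(nodes, 2):
        if a1 == a2 or b1 == b2:
            continue  # one atom cannot be matched twice
        if (frozenset((a1, a2)) in bonds1) == (frozenset((b1, b2)) in bonds2):
            adj[(a1, b1)].add((a2, b2))
            adj[(a2, b2)].add((a1, b1))
    return adj


def maximal_cliques(adj):
    """Enumerate maximal cliques with Bron-Kerbosch and pivoting."""
    def expand(r, p, x):
        if not p and not x:
            yield set(r)
            return
        pivot = max(p | x, key=lambda v: len(adj[v] & p))
        for v in list(p - adj[pivot]):
            yield from expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}
    yield from expand(set(), set(adj), set())


def largest_common_subgraph(atoms1, bonds1, atoms2, bonds2):
    """Return the largest atom mapping as a set of (atom1, atom2) pairs."""
    adj = association_graph(atoms1, bonds1, atoms2, bonds2)
    return max(maximal_cliques(adj), key=len, default=set())
```

Each clique in the association graph corresponds to a common induced subgraph, so the largest clique gives the largest atom-to-atom mapping; restricting the result to connected matches would need an extra step that this sketch omits.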

2.
We present a graph-theoretic approach to adaptively compute many-body approximations in an efficient manner to perform (a) accurate post-Hartree–Fock (HF) ab initio molecular dynamics (AIMD) at density functional theory (DFT) cost for medium- to large-sized molecular clusters, (b) hybrid DFT electronic structure calculations for condensed-phase simulations at the cost of pure density functionals, (c) reduced-cost on-the-fly basis extrapolation for gas-phase AIMD and condensed phase studies, and (d) accurate post-HF-level potential energy surfaces at DFT cost for quantum nuclear effects. The salient features of our approach are ONIOM-like in that (a) the full system (cluster or condensed phase) calculation is performed at a lower level of theory (pure DFT for condensed phase or hybrid DFT for molecular systems), and (b) this approximation is improved through a correction term that captures all many-body interactions up to any given order within a higher level of theory (hybrid DFT for condensed phase; CCSD or MP2 for cluster), combined through graph-theoretic methods. Specifically, a region of chemical interest is coarse-grained into a set of nodes and these nodes are then connected to form edges based on a given definition of local envelope (or threshold) of interactions. The nodes and edges together define a graph, which forms the basis for developing the many-body expansion. The methods are demonstrated through (a) ab initio dynamics studies on protonated water clusters and polypeptide fragments, (b) potential energy surface calculations on one-dimensional water chains such as those found in ion channels, and (c) conformational stabilization and lattice energy studies on homogeneous and heterogeneous surfaces of water with organic adsorbates using two-dimensional periodic boundary conditions.

3.
4.
Soil is predicted to contain thousands of unique bacterial species per gram. Soil DNA libraries represent large reservoirs of biosynthetic diversity from which diverse secondary metabolite gene clusters can be recovered and studied. The screening of an archived soil DNA library using primers designed to target oxytryptophan dimerization genes allowed us to identify and functionally characterize the first indolotryptoline biosynthetic gene cluster. The recovery and heterologous expression of an environmental DNA-derived gene cluster encoding the biosynthesis of the antitumor substance BE-54017 is reported here. Transposon mutagenesis identified two monooxygenases, AbeX1 and AbeX2, as being responsible for the transformation of an indolocarbazole precursor into the indolotryptoline core of BE-54017.

5.
6.
DNA arrays have become the immediate choice in the analysis of large-scale expression measurements. Understanding the expression pattern of genes provides functional information on newly identified genes by computational approaches. Gene expression pattern is an indicator of the state of the cell, and abnormal cellular states can be inferred by comparing expression profiles. Since co-regulated genes, and genes involved in a particular pathway, tend to show similar expression patterns, clustering expression patterns has become the natural method of choice to differentiate groups. However, most methods based on cluster analysis suffer from two usual problems: (i) dead units, and (ii) determining the correct number of clusters (k) needed to classify the data. Selecting k has been an open problem of pattern recognition and statistics for decades. Since clustering reveals similar patterns present in the data, fixing this number strongly influences the quality of the result. While there is no theoretical solution to this problem, the number of clusters can be decided by a heuristic clustering algorithm called rival penalized competitive learning (RPCL). We present a novel implementation of RPCL that transforms the problem of finding the correct number of clusters into the tractable problem of clustering based on the degree of similarity. This is biologically significant since our implementation clusters functionally co-regulated genes and genes that present similar patterns of expression. This new approach reveals potential genes that are co-involved in a biological process. This implementation of the RPCL algorithm is useful in differentiating groups involved in concerted functional regulation and helps to progressively home in on patterns that are closely similar.
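
For readers unfamiliar with RPCL, the following is a minimal sketch of the textbook update rule, not the novel implementation this abstract describes. For each sample the winning unit moves toward the sample while the runner-up (the rival) is pushed slightly away, so surplus units drift out of the data. The function name, parameter names, and default learning rates are assumptions chosen for illustration.

```python
import random


def rpcl(points, k_max, epochs=50, lr_win=0.05, lr_rival=0.002, seed=0):
    """Rival penalized competitive learning on a list of numeric tuples.

    Starts with k_max (>= 2) units; returns (centers, wins), where wins[j]
    counts how often unit j won (plus its initial count of 1).
    """
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k_max)]
    wins = [1] * k_max
    for _ in range(epochs):
        order = list(points)
        rng.shuffle(order)
        for x in order:
            total = sum(wins)
            # Conscience-weighted squared distance: frequent winners are
            # handicapped, which reduces the dead-unit problem.
            scores = [
                wins[j] / total * sum((xi - ci) ** 2 for xi, ci in zip(x, centers[j]))
                for j in range(k_max)
            ]
            ranked = sorted(range(k_max), key=scores.__getitem__)
            winner, rival = ranked[0], ranked[1]
            # Winner learns: move toward the sample.
            centers[winner] = [c + lr_win * (xi - c) for xi, c in zip(x, centers[winner])]
            # Rival is penalized: move away from the sample.
            centers[rival] = [c - lr_rival * (xi - c) for xi, c in zip(x, centers[rival])]
            wins[winner] += 1
    return centers, wins
```

In this scheme, units whose win counts stay low after training are usually treated as surplus, which is how the method suggests a cluster count without fixing k in advance.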

7.
A number of modeling and simulation algorithms using internal coordinates rely on hierarchical representations of molecular systems. Given the potentially complex topologies of molecular systems, though, automatically generating such hierarchical decompositions may be difficult. In this article, we present a fast general algorithm for the complete construction of a hierarchical representation of a molecular system. This two-step algorithm treats the input molecular system as a graph in which vertices represent atoms or pseudo-atoms, and edges represent covalent bonds. The first step contracts all cycles in the input graph. The second step builds an assembly tree from the reduced graph. We analyze the complexity of this algorithm and show that the first step is linear in the number of edges in the input graph, whereas the second one is linear in the number of edges in the graph without cycles, but dependent on the branching factor of the molecular graph. We demonstrate the performance of our algorithm on a set of specifically tailored difficult cases as well as on a large subset of molecular graphs extracted from the protein data bank. In particular, we experimentally show that both steps behave linearly in the number of edges in the input graph (the branching factor is fixed for the second step). Finally, we demonstrate an application of our hierarchy construction algorithm to adaptive torsion-angle molecular mechanics.
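
The first step, contracting all cycles, can be sketched as follows. This is one plausible reading rather than the authors' algorithm: it assumes that "contracting cycles" means merging every group of atoms joined by non-bridge bonds into a single super-node, and it finds bridges with a standard linear-time depth-first search. The second step (building the assembly tree) is not shown, and the function names are hypothetical.

```python
def find_bridges(adj):
    """Return the bridges of an undirected simple graph as frozensets.

    adj: dict mapping node -> set of neighbor nodes.
    Uses an iterative depth-first search with low-link values.
    """
    index, low, bridges = {}, {}, set()
    counter = 0
    for root in adj:
        if root in index:
            continue
        index[root] = low[root] = counter
        counter += 1
        stack = [(root, None, iter(adj[root]))]
        while stack:
            node, parent, neighbors = stack[-1]
            descended = False
            for nxt in neighbors:
                if nxt == parent:
                    continue  # simple graph: skip the tree edge to the parent
                if nxt in index:
                    low[node] = min(low[node], index[nxt])
                else:
                    index[nxt] = low[nxt] = counter
                    counter += 1
                    stack.append((nxt, node, iter(adj[nxt])))
                    descended = True
                    break
            if not descended:
                stack.pop()
                if parent is not None:
                    low[parent] = min(low[parent], low[node])
                    if low[node] > index[parent]:
                        bridges.add(frozenset((node, parent)))
    return bridges


def contract_cycles(adj):
    """Merge atoms linked by non-bridge bonds into super-nodes.

    Returns (comp, reduced): comp maps atom -> super-node id, and reduced
    is the acyclic adjacency between super-nodes (one edge per bridge).
    """
    bridges = find_bridges(adj)
    comp, n_comp = {}, 0
    for start in adj:
        if start in comp:
            continue
        comp[start] = n_comp
        todo = [start]
        while todo:
            u = todo.pop()
            for v in adj[u]:
                if v not in comp and frozenset((u, v)) not in bridges:
                    comp[v] = n_comp
                    todo.append(v)
        n_comp += 1
    reduced = {c: set() for c in range(n_comp)}
    for bridge in bridges:
        u, v = tuple(bridge)
        reduced[comp[u]].add(comp[v])
        reduced[comp[v]].add(comp[u])
    return comp, reduced
```

Both passes visit each bond a constant number of times, which is consistent with the linear behavior in the number of edges that the abstract reports for the first step.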

8.
Clustering analysis of data from DNA microarray hybridization studies is an essential task for identifying biologically relevant groups of genes. The attribute cluster algorithm (ACA) has provided an attractive way to group and select meaningful genes. However, ACA needs much prior knowledge about the genes to set the number of clusters. In practical applications, if the number of clusters is misspecified, the performance of the ACA will deteriorate rapidly. We propose the Cooperative Competition Cluster Algorithm (CCCA) in this paper. In the algorithm, we assume that both cooperation and competition exist simultaneously between clusters in the process of clustering. By using this principle of Cooperative Competition, the number of clusters can be found in the process of clustering. Experimental results on synthetic and gene expression data are presented. The results show that CCCA can choose the number of clusters automatically and achieves excellent performance relative to competing methods.

9.
A new radial space-filling method for visualizing cluster hierarchies is presented. The method, referred to as a radial clustergram, arranges the clusters into a series of layers, each representing a different level of the tree. It uses adjacency of nodes instead of links to represent parent-child relationships and allocates sufficient screen real estate to each node to allow effective visualization of cluster properties through color-coding. Radial clustergrams combine the most appealing features of other cluster visualization techniques but avoid their pitfalls. Compared to classical dendrograms and hyperbolic trees, they make much more efficient use of space; compared to treemaps, they are more effective in conveying hierarchical structure and displaying properties of nodes higher in the tree. A fisheye lens is used to focus on areas of interest, without losing sight of the global context. The utility of the method is demonstrated using examples from the fields of molecular diversity and conformational analysis.

10.
11.
Gene dependency networks often undergo changes in response to different conditions. Understanding how these networks change across two conditions is an important task in genomics research. Most previous differential network analysis approaches assume that the difference between two condition-specific networks is driven by individual edges. Thus, they may fail to detect key players, which might represent important genes whose mutations drive the change in the network. In this work, we develop a node-based differential network analysis (N-DNA) model to directly estimate the differential network that is driven by certain hub nodes. We model each condition-specific gene network as a precision matrix and the differential network as the difference between two precision matrices. Then we formulate a convex optimization problem to infer the differential network by combining a D-trace loss function and a row-column overlap norm penalty function. Simulation studies demonstrate that N-DNA provides more accurate estimates of the differential network than previous competing approaches. We apply N-DNA to ovarian cancer and breast cancer gene expression data. The model rediscovers known cancer-related genes and contains interesting predictions.
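
The abstract does not write out the objective. As a hedged illustration, the D-trace loss is commonly written in the form below, where the covariance matrices $\Sigma_X$ and $\Sigma_Y$ of the two conditions, the weight $\lambda$, and the generic penalty $\Omega$ (standing in for the row-column overlap norm, whose exact definition is not given here) are notation assumed for this sketch.

```latex
\begin{align}
\Delta^{*} &= \Sigma_Y^{-1} - \Sigma_X^{-1}
  && \text{(difference of the two precision matrices)} \\
L_D(\Delta) &= \tfrac{1}{4}\bigl(\langle \Sigma_X \Delta,\, \Delta \Sigma_Y \rangle
  + \langle \Sigma_Y \Delta,\, \Delta \Sigma_X \rangle\bigr)
  - \langle \Delta,\, \Sigma_X - \Sigma_Y \rangle \\
\nabla L_D(\Delta) &= \tfrac{1}{2}\bigl(\Sigma_X \Delta \Sigma_Y + \Sigma_Y \Delta \Sigma_X\bigr)
  - (\Sigma_X - \Sigma_Y) \\
\hat{\Delta} &= \arg\min_{\Delta}\; L_D(\Delta) + \lambda\, \Omega(\Delta)
\end{align}
```

Substituting $\Delta^{*}$ gives $\Sigma_X \Delta^{*} \Sigma_Y = \Sigma_Y \Delta^{*} \Sigma_X = \Sigma_X - \Sigma_Y$, so the gradient vanishes there. The unpenalized loss is therefore minimized by the true differential network, without either precision matrix being estimated separately; in practice sample covariances would replace $\Sigma_X$ and $\Sigma_Y$.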

12.
From proposed mechanisms for framework reorganizations of the carboranes C$_2$B$_{n-2}$H$_n$, $n = 5$–$12$, we present reaction graphs in which points or vertices represent individual carborane isomers, while edges or arcs correspond to the various intramolecular rearrangement processes that carry the pair of carbon heteroatoms to different positions within the same polyhedral form. Because they contain both loops and multiple edges, these graphs are actually pseudographs. Loops and multiple edges have chemical significance in several cases. Enantiomeric pairs occur among carborane isomers and among the transition state structures on pathways linking the isomers. For a carborane polyhedral structure with $n$ vertices, each graph has $n(n-1)/2$ graph edges. The degree of each graph vertex and the sum of degrees of all graph vertices are independent of the details of the isomerization mechanism. The degree of each vertex is equal to twice the number of rotationally equivalent forms of the corresponding isomer. The total of all vertex degrees is just twice the number of edges, or $n(n-1)$. The degree of each graph vertex is related to the symmetry point group of the structure of the corresponding isomer. Enantiomeric isomer pairs are usually connected in the graph by a single edge and never by more than two edges.

13.
14.
The properties of isolated AlCl3 clusters and the bulk system are investigated by means of static and dynamic electronic structure methods. We find important structural motifs with the edge connectivity dominant in a dimer and the corner connectivity dominant in a trimer. Furthermore, the trimer cluster exhibits an interesting ring structure with large cooperative effects relative to the dimer. Comparing the found structural motifs in isolated molecule calculations with the structure of the liquid allows us to determine the dominance of edge connectivity in the liquid. The size of the clusters present in the liquid indicates indeed that the dimer is the most abundant species, but there are also trimers, tetramers, and pentamers present. From the local dipole analysis both for the isolated clusters as well as for the liquid, further proof for the edge connectivity is given. However, all results point to the fact that there is also some small percentage of corner connectivity present that might be attributed to the most stable corner-connected cluster, namely the trimer. Importantly, we find that energetic considerations of isolated (static) clusters only do not represent the findings in liquid phase. Instead, a quantum cluster equilibrium approach or simulations are needed.

15.
The development of new antibacterial drugs has become one of the most important tasks of the century in order to overcome the growing threat of drug resistance in pathogenic bacteria. Many antibiotics originate from natural products produced by various microorganisms. Over the last decades, bioinformatic approaches have facilitated the discovery and characterization of these small compounds using genome mining methodologies. A key part of this process is the identification of the most promising biosynthetic gene clusters (BGCs), which encode novel natural products. In 2017, the Antibiotic Resistant Target Seeker (ARTS) was developed in order to enable an automated target-directed genome mining approach. ARTS identifies possible resistant target genes within antibiotic gene clusters, in order to detect promising BGCs encoding antibiotics with novel modes of action. Although ARTS can predict promising targets based on multiple criteria, it provides little information about the cluster structures of possible resistant genes. Here, we present SYN-view. Based on a phylogenetic approach, SYN-view allows for easy comparison of gene clusters of interest and distinguishing genes with regular housekeeping functions from genes functioning as antibiotic resistant targets. Our aim is to implement our proposed method into the ARTS web-server, further improving the target-directed genome mining strategy of the ARTS pipeline.

16.
Jahn–Teller and Berry pseudorotations in transition metal and main group clusters such as Hf5, Ta5, W5 and Bi5 are interesting because of the competition between relativistic effects and pseudorotations. Topological representations of various isomerization pathways arising from the Berry pseudorotation of pentamers constitute the edges of the Desargues–Levi graph. We have computed the combinatorics for multinomial colorings of the vertices, edges and 10-faces of the Desargues–Levi isomerization graph for all irreducible representations and the nuclear spin statistics of spin-7/2 181Ta5 as well as the TBP composite cluster particles. Topological insights into Jahn–Teller and Berry pseudorotations and relativistic effects are provided.

17.
A variety of computational models have been introduced recently that are based on the properties of DNA. In particular, branched junction molecules and graphlike DNA structures have been proposed as computational devices, although such models have yet to be confirmed experimentally. DNA branched junction molecules have been used previously to form graph-like three-dimensional DNA structures, such as a cube and a truncated octahedron, but these DNA constructs represent regular graphs, where the connectivities of all of the vertexes are the same. Here, we demonstrate the construction of an irregular DNA graph structure by a single step of self-assembly. A graph made of five vertexes and eight edges was chosen for this experiment. DNA branched junction molecules represent the vertexes, and duplex molecules represent the edges; in contrast to previous work, specific edge molecules are included as components. We demonstrate that the product is a closed cyclic single-stranded molecule that corresponds to a double cover of the graph and that the DNA double helix axes represent the designed graph. The correct assembly of the target molecule has been demonstrated unambiguously by restriction analysis.

18.
The measurement of the degree of symmetry proved to be a useful tool in the prediction of quantitative structural–physical correlations. These measurements have been based, in the most general form, on the folding/unfolding algorithm, for which we provide here a new and simpler proof. We generalize this proof to the case of objects composed of more than one (full) orbit. An important practical issue we consider is the division of the graph into symmetry orbits and the mapping of the symmetry group elements onto the points of the graph. The logical constraints imposed by the edges of the graph are reviewed and used for the successful resolution of the coupling between different orbits.

19.
This study was undertaken to test the hypothesis that structurally similar PAHs induce similar gene expression profiles. THP-1 cells were exposed to a series of 12 selected PAHs at 50 µM for 24 hours, and gene expression profiles were analyzed using both unsupervised and supervised methods. Clustering analysis of gene expression profiles revealed that the 12 tested chemicals were grouped into five clusters. Within each cluster, the gene expression profiles are more similar to each other than to the ones outside the cluster. 1-Methylanthracene and 1-methylfluorene were found to have the most similar profiles; dibenzothiophene and dibenzofuran were found to share common profiles with fluorene. As expression pattern comparisons were expanded, similarity in genomic fingerprint dropped off dramatically. Prediction analysis of microarrays (PAM) based on the clustering pattern generated 49 predictor genes that can be used for sample discrimination. Moreover, significance analysis of microarrays (SAM) identified 598 genes modulated by the tested chemicals, with a variety of biological processes (such as cell cycle, metabolism, and protein binding) and KEGG pathways significantly (p < 0.05) affected. It is feasible to distinguish structurally different PAHs based on their genomic fingerprints, which are mechanism based.

