Similar articles
 Found 20 similar articles; search time 31 ms
1.
The masses of ions observed in the mass spectrum of a pure compound are correlated with the masses of the molecular substructures of the compound. Three methods are described for generating molecular substructures. Each method is evaluated to establish how effectively it generates the molecular substructures and correlates the masses of the molecular substructures with the masses of the observed fragment ions. Rules for mass-spectral fragmentation processes are incorporated into the mass spectral analysis software and illustrated for retro-aldol and lactone-ester reactions occurring in the thermospray mass spectra of oligomycin antibiotics.

2.
A system for structure elucidation based on proton NMR spectra has been developed. The system, named Spec2D (system for spectra from 2D-NMR), incorporates 1H NMR and H-H correlation spectroscopy (COSY) spectral information obtained from 2D-NMR experiments. 2D-NMR is important for structure elucidation because it provides information about the relationships among differently situated protons in the structures of unknown compounds. The system uses the concepts of molecular graphs. An improved representation of substructures and several novel algorithms for structure generation have been devised to solve the combinatorial problem and to reduce the processing time. Spec2D consists of a knowledge base, an analysis module, and a candidate structure generator module. Spec2D proposes candidate structures from only the 1H NMR and H-H COSY spectral information of an unknown compound, without any 13C NMR spectral or structural information such as molecular formulas. Spec2D can propose the "new" structure of an unknown compound if the corresponding substructures are included in the knowledge base.

3.
A novel system of substructure codes has been developed to characterize the spherical environment of single atoms and complete ring systems. The codes are generated automatically from topologically represented chemical structures and serve to describe structural entities corresponding to spectral parameters uniquely. Their hierarchical order permits desired substructures and the corresponding chemical shifts to be sought in inverted files generated from a larger data base, thereby facilitating the estimation of unknown spectra.

4.
The problem of finding all nonisomorphic subgraphs of a given graph (all distinct substructures of a given molecular structure) is discussed. A computer program is introduced that first generates all connected subgraphs and then uses a combination of well-discriminating graph invariants to eliminate duplicates. The program is broadly applicable, in particular to molecular graphs, which may or may not contain unsaturation or heteroatoms. The number of distinct substructures (Ns), proposed earlier as a measure of a compound's complexity that takes its symmetry into account, is thus obtained automatically. As expected from the nature of the problem, the computational effort increases exponentially with problem size, so in most cases complexity measures other than Ns are to be preferred.
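
The two-stage procedure described in this abstract (generate all connected subgraphs, then discard duplicates by comparing graph invariants) can be illustrated with a minimal Python sketch. This is not the program from the paper: the function names are hypothetical, a molecule is reduced to a plain list of bonds, and the invariant used here (edge count plus sorted degree sequence) is an assumption chosen for brevity. It is much weaker than the "well-discriminating" invariants the abstract refers to and can merge non-isomorphic subgraphs in larger molecules.

```python
from itertools import combinations


def is_connected(edges):
    """Return True if the given edges form a single connected component."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    start = next(iter(adjacency))
    seen, stack = {start}, [start]
    while stack:
        node = stack.pop()
        for neighbour in adjacency[node] - seen:
            seen.add(neighbour)
            stack.append(neighbour)
    return len(seen) == len(adjacency)


def degree_invariant(edges):
    """Toy invariant (an assumption): edge count and sorted degree sequence."""
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    return (len(edges), tuple(sorted(degree.values())))


def count_substructures(edges):
    """Return (number of connected edge subsets, number of distinct invariants)."""
    connected = 0
    invariants = set()
    # Every non-empty subset of bonds is tested, so the work grows as 2**len(edges).
    for size in range(1, len(edges) + 1):
        for subset in combinations(edges, size):
            if is_connected(subset):
                connected += 1
                invariants.add(degree_invariant(subset))
    return connected, len(invariants)


# Carbon skeletons only: n-butane is a path, isobutane is a star.
n_butane = [(0, 1), (1, 2), (2, 3)]
isobutane = [(0, 1), (0, 2), (0, 3)]
print(count_substructures(n_butane))
print(count_substructures(isobutane))
```

The brute-force loop over all bond subsets also makes the exponential growth noted in the abstract concrete: each additional bond doubles the number of subsets examined.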

5.
A substructure isomorphism matrix (n × p) contains binary elements describing which of the given p query structures (substructures) are part of the given n target structures (molecular structures). Such a matrix can be used to investigate the diversity of the target structures and allows the characterization and comparison of structural libraries. A square substructure isomorphism matrix (n × n) is obtained if the same structures are used as molecular structures and as substructures; this matrix contains full information about the topological hierarchy of the n structures. A hierarchical arrangement of chemical structures is useful for the evaluation of results obtained from searches in structure databases.
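
A small Python sketch may help make the matrix concrete. It is an illustration under stated assumptions, not the method of the paper: structures are hypothetical `(atoms, bonds)` tuples holding heavy atoms only, bond orders are ignored, and the substructure test is a brute-force search over atom mappings that is only practical for very small graphs.

```python
from itertools import permutations


def is_substructure(query, target):
    """Brute-force test: can the query be embedded in the target?

    Each structure is (atoms, bonds): a tuple of element symbols and a list
    of index pairs. Illustrative only; the cost grows factorially.
    """
    q_atoms, q_bonds = query
    t_atoms, t_bonds = target
    t_bond_set = {frozenset(bond) for bond in t_bonds}
    # Try every injective assignment of query atoms to target atoms.
    for mapping in permutations(range(len(t_atoms)), len(q_atoms)):
        if any(q_atoms[i] != t_atoms[mapping[i]] for i in range(len(q_atoms))):
            continue
        if all(frozenset((mapping[u], mapping[v])) in t_bond_set
               for u, v in q_bonds):
            return True
    return False


def isomorphism_matrix(targets, queries):
    """Rows are the n targets, columns are the p queries; entries are 0 or 1."""
    return [[int(is_substructure(q, t)) for q in queries] for t in targets]


ethane = (("C", "C"), [(0, 1)])
methanol = (("C", "O"), [(0, 1)])
ethanol = (("C", "C", "O"), [(0, 1), (1, 2)])

structures = [ethane, methanol, ethanol]
# Square case: the same structures serve as targets and as queries.
for row in isomorphism_matrix(structures, structures):
    print(row)
```

In the square case, the row for ethanol contains a 1 in every column, which reflects the hierarchy mentioned in the abstract: both ethane and methanol skeletons are contained in ethanol, while neither contains the other.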

6.
Substructure mining algorithms are important drug discovery tools since they can find substructures that affect physicochemical and biological properties. Current methods, however, only consider a part of all chemical information that is present within a data set of compounds. Therefore, the overall aim of our study was to enable more exhaustive data mining by designing methods that detect all substructures of any size, shape, and level of chemical detail. A means of chemical representation was developed that uses atomic hierarchies, thus enabling substructure mining to consider general and/or highly specific features. As a proof-of-concept, the efficient, multipurpose graph mining system Gaston learned substructures of any size and shape from a mutagenicity data set that was represented in this manner. From these substructures, we extracted a set of only six nonredundant, discriminative substructures that represent relevant biochemical knowledge. Our results demonstrate the individual and synergistic importance of elaborate chemical representation and mining for nonlinear substructures. We conclude that the combination of elaborate chemical representation and Gaston provides an excellent method for 2D substructure mining as this recipe systematically explores all substructures in different levels of chemical detail.

7.
Mass spectral classifiers for 16 substructures that are present in the basic structures of pesticides have been investigated to assist pesticide residue analysis as well as the screening of pesticide lead compounds. Mass spectral data are first transformed into 396 features. Genetic Algorithm-Partial Least Squares (GA-PLS) as a feature selection method and Support Vector Machine (SVM) as a validation method are then applied together to obtain an optimized feature set for each substructure. Finally, a statistical method, the AdaBoost algorithm combined with Classification and Regression Trees (AdaBoost-CART), is trained to predict the presence or absence of the 16 substructures using the optimized mass spectral feature sets. It is demonstrated that the optimized feature sets, rather than the original 396 features, can be used to predict the presence or absence of the 16 pesticide substructures with recognition success rates of mostly 85-100%.

8.
We have developed a method, given a database of molecules and associated activities, to identify molecular substructures that are associated with many different biological activities. These may be therapeutic areas (e.g. antihypertensive) and/or mechanism-based activities (e.g. renin inhibitor). This information helps us avoid chemical classes that are likely to have unanticipated side effects and also can suggest combinatorial libraries that might have activity on a variety of receptor targets. The method was applied to the USPDI and MDDR databases. There are clearly substructures in each database that occur in many compounds and span a variety of therapeutic categories. Some of these are expected, but some are not.

9.
10.
11.
Drug–drug interactions (DDIs) can trigger unexpected pharmacological effects on the body, and the causal mechanisms are often unknown. Graph neural networks (GNNs) have been developed to better understand DDIs. However, identifying key substructures that contribute most to the DDI prediction is a challenge for GNNs. In this study, we presented a substructure-aware graph neural network, a message passing neural network equipped with a novel substructure attention mechanism and a substructure–substructure interaction module (SSIM) for DDI prediction (SA-DDI). Specifically, the substructure attention was designed to capture size- and shape-adaptive substructures based on the chemical intuition that the sizes and shapes are often irregular for functional groups in molecules. DDIs are fundamentally caused by chemical substructure interactions. Thus, the SSIM was used to model the substructure–substructure interactions by highlighting important substructures while de-emphasizing the minor ones for DDI prediction. We evaluated our approach in two real-world datasets and compared the proposed method with the state-of-the-art DDI prediction models. The SA-DDI surpassed other approaches on the two datasets. Moreover, the visual interpretation results showed that the SA-DDI was sensitive to the structure information of drugs and was able to detect the key substructures for DDIs. These advantages demonstrated that the proposed method improved the generalization and interpretation capability of DDI prediction modeling.

SA-DDI is designed to learn size-adaptive molecular substructures for drug–drug interaction prediction and can provide explanations that are consistent with those of pharmacologists.

12.
Warmr: a data mining tool for chemical data   Total citations: 5, self-citations: 0, citations by others: 5

13.
We present an efficient method to cluster large chemical databases in a stepwise manner. Databases are first clustered with an extended exclusion sphere algorithm based on Tanimoto coefficients calculated from Daylight fingerprints. Substructures are then extracted from clusters by iterative application of a maximum common substructure algorithm. Clusters with common substructures are merged through a second application of an exclusion sphere algorithm. In a separate step, singletons are compared to cluster substructures and added to a cluster if similarity is sufficiently high. The method identifies tight clusters with conserved substructures and generates singletons only if structures are truly distinct from all other library members. The method has successfully been applied to identify the most frequently occurring scaffolds in databases, for the selection of analogues of screening hits and in the prioritization of chemical libraries offered by commercial vendors.
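
The first stage of this workflow can be sketched in a few lines of Python. The sketch assumes the basic form of exclusion sphere clustering, not the extended variant used by the authors, and it replaces Daylight fingerprints with hypothetical sets of "on" bit positions. The function names and the 0.5 threshold are illustrative choices, and the later maximum common substructure and merging steps are not shown.

```python
def tanimoto(a, b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bits."""
    union = len(a | b)
    return len(a & b) / union if union else 1.0


def sphere_exclusion(fingerprints, threshold):
    """Basic exclusion sphere clustering (a simplification, see the lead-in).

    The first unassigned compound becomes a cluster centre; every unassigned
    compound within the similarity threshold joins it and is excluded from
    later rounds. Compounds that attract no neighbours remain singletons.
    """
    unassigned = list(range(len(fingerprints)))
    clusters = []
    while unassigned:
        centre = unassigned[0]
        members = [i for i in unassigned
                   if tanimoto(fingerprints[centre], fingerprints[i]) >= threshold]
        clusters.append(members)
        unassigned = [i for i in unassigned if i not in members]
    return clusters


# Toy fingerprints: each set lists the bit positions that are switched on.
fingerprints = [
    {1, 2, 3, 4},
    {1, 2, 3, 5},
    {7, 8, 9},
    {1, 2, 3, 4, 5},
    {7, 8, 9, 10},
    {20, 21},
]
print(sphere_exclusion(fingerprints, threshold=0.5))
```

The outcome of this simple variant depends on the order in which centres are picked, which is one reason a production method would need refinements of the kind the abstract describes.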

14.
This paper investigates a computational procedure for the determination of the atom types on the vertices of a molecular skeleton to optimize interaction with the receptor site whilst maintaining a synthetically reasonable structure. The connectivity of the skeleton is analysed and appropriate atom types are compiled for each vertex. Receptor ionization and conformational states are generated by varying the positions of hydrogen atoms and electron lone pairs in the carboxyl, rotatable hydroxyl and amino groups. The structure is divided into small non-overlapping substructures. Atom types are assigned exhaustively onto each of the substructures using a depth-first search method; chemical rules are applied to reject unacceptable atom combinations early on. An empirical interaction score is calculated and the representatives of each partial structure are stored in ascending order according to their scores. The branch-and-bound procedure is then used to find the structures with the lowest scores. The method is illustrated using five protein–ligand complexes.

15.
In this paper we propose a new method, based on measurements of structural similarity, for the clustering of chemical databases. The proposed method allows dynamic adjustment of the size and number of cells or clusters into which the database is classified. Classification is carried out using measurements of structural similarity obtained from the matching of molecular graphs. The classification process is open to the use of different similarity indexes and different measurements of matching. The process consists of projecting the similarity measures obtained among the elements of the database into a new similarity space. The dimension and characteristics of the projection space can be readjusted dynamically to suit the problem under study; this flexibility, together with the simplicity and computational efficiency of the method, makes it appropriate for use with medium and large databases. The clustering method increases the performance of screening processes in chemical databases, facilitating the retrieval of chemical compounds that share all or some of their substructures with a given pattern. A database of 498 natural compounds with wide molecular diversity, extracted from the SPECS and BIOSPECS B.V. free database, was used for this work.

16.
17.
The limits of a recently proposed computer method for finding all distinct substructures of a chemical structure are systematically explored within comprehensive graph samples which serve as supersets of the graphs corresponding to saturated hydrocarbons, both acyclic (up to n = 20) and (poly)cyclic (up to n = 10). Several pairs of smallest graphs and compounds are identified that cannot be distinguished using selected combinations of invariants such as combinations of Balaban's index J and graph matrix eigenvalues. As the most important result, it can now be stated that the computer program NIMSG, using J and distance eigenvalues, is safe within the domain of mono- through tetracyclic saturated hydrocarbon substructures up to n = 10 (oligocyclic decanes) and of all acyclic alkane substructures up to n = 19 (nonadecanes), i.e., it will not miss any of these substructures. For the regions surrounding this safe domain, upper limits are found for the numbers of substructures that may be lost in the worst case, and these are low. This taken together means that the computer program can be reasonably employed in chemistry whenever one is interested in finding the saturated hydrocarbon substructures. As to unsaturated and heteroatom containing substructures, there are reasons to conjecture that the method's resolving power for them is similar.

18.
Algorithms are described for correlating a proposed molecular structure with a mass spectrum. All molecular substructures of a proposed structure that have the same masses as the fragment ions are determined. The most likely fragment ion structures are those molecular substructures formed with the fewest bond cleavages in the proposed structure. The algorithms, which incorporate methods for handling rearrangement and adduct ions, utilize either nominal or exact data originating from any ionization method. The algorithms are demonstrated using the mass spectra of a substituted azetidinyl ketone and the macrolide antibiotic avermectin A1a.
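
The core idea (enumerate connected substructures, match their masses to observed ions, prefer the match that breaks the fewest bonds) can be sketched as follows. This is a simplified illustration, not the published algorithms: the function names are hypothetical, each heavy atom carries the nominal mass of its attached hydrogens, and rearrangements, adducts and hydrogen transfers, which the abstract says the real algorithms handle, are ignored.

```python
from itertools import combinations


def is_connected(atoms, bonds):
    """Return True if the chosen atoms form one connected piece."""
    atoms = set(atoms)
    start = next(iter(atoms))
    seen, stack = {start}, [start]
    while stack:
        node = stack.pop()
        for u, v in bonds:
            if node not in (u, v):
                continue
            other = v if u == node else u
            if other in atoms and other not in seen:
                seen.add(other)
                stack.append(other)
    return seen == atoms


def assign_fragments(group_masses, bonds, observed, tolerance=0.5):
    """Map each observed ion mass to (bonds cleaved, atom indices) or None."""
    candidates = []
    for size in range(1, len(group_masses) + 1):
        for subset in combinations(range(len(group_masses)), size):
            if not is_connected(subset, bonds):
                continue
            mass = sum(group_masses[i] for i in subset)
            # A bond is cleaved when exactly one of its ends is in the fragment.
            cuts = sum((u in subset) != (v in subset) for u, v in bonds)
            candidates.append((mass, cuts, subset))
    assignments = {}
    for ion in observed:
        matches = [(cuts, subset) for mass, cuts, subset in candidates
                   if abs(mass - ion) <= tolerance]
        # min() selects the candidate formed with the fewest bond cleavages.
        assignments[ion] = min(matches) if matches else None
    return assignments


# Ethanol as CH3-CH2-OH: nominal group masses 15, 14 and 17.
ethanol_masses = [15, 14, 17]
ethanol_bonds = [(0, 1), (1, 2)]
for ion, hit in assign_fragments(ethanol_masses, ethanol_bonds,
                                 [15, 29, 31, 46, 40]).items():
    print(ion, hit)
```

An ion with no matching substructure, such as the made-up value 40 here, is reported as `None`; in practice such peaks are where rearrangement and adduct handling would come in.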

19.
The multisubunit ligand 2 combines two complexation substructures known to undergo, with specific metal ions, distinct self-assembly processes to form a double-helical and a grid-type structure, respectively. The binding information contained in this molecular strand may be expected to generate, in a strictly predetermined and univocal fashion, two different, well-defined output inorganic architectures depending on the set of metal ions, that is, on the coordination algorithm used. Indeed, as predicted, the self-assembly of 2 with eight Cu(II) and four Cu(I) yields the intertwined structure D1. It results from a crossover of the two assembly subprograms and has been fully characterized by crystal structure determination. On the other hand, when the instructions of strand 2 are read out with a set of eight Cu(I) and four M(II) (M = Fe, Co, Ni, Cu) ions, the architectures C1-C4, resulting from a linear combination of the two subprograms, are obtained, as indicated by the available physico-chemical and spectral data. Redox interconversion of D1 and C4 has been achieved. These results indicate that the same molecular information may yield different output structures depending on how it is processed, that is, depending on the interactional (coordination) algorithm used to read it. They have wide implications for the design and implementation of programmed chemical systems, pointing towards multiprocessing capacity, in a one code/several outputs scheme, of potential significance for molecular computation processes and possibly even with respect to information processing in biology.

20.