Similar literature
20 similar documents found.
1.
The maximum common subgraph (MCS) problem has become increasingly important in those aspects of chemoinformatics that involve the matching of 2D or 3D chemical structures. This paper provides a classification and a review of the many MCS algorithms, both exact and approximate, that have been described in the literature, and makes recommendations regarding their applicability to typical chemoinformatics tasks.

2.
This paper reports an evaluation of both graph-based and fingerprint-based measures of structural similarity, when used for virtual screening of sets of 2D molecules drawn from the MDDR and ID Alert databases. The graph-based measures employ a new maximum common edge subgraph isomorphism algorithm, called RASCAL, with several similarity coefficients described previously for quantifying the similarity between pairs of graphs. The effectiveness of these graph-based searches is compared with that resulting from similarity searches using BCI, Daylight and Unity 2D fingerprints. Our results suggest that graph-based approaches provide an effective complement to existing fingerprint-based approaches to virtual screening.
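
As an illustration of how a similarity coefficient can be derived from a maximum common subgraph, the sketch below computes two widely used forms from graph sizes alone. It is an assumption-laden example, not the RASCAL implementation: the abstract does not say which coefficients were evaluated, the function names are hypothetical, and graph size is taken here to be the number of vertices plus the number of edges.

```python
def graph_size(n_vertices: int, n_edges: int) -> int:
    """Size of a graph, assumed here to be vertices plus edges."""
    return n_vertices + n_edges


def tanimoto_graph_similarity(size_a: int, size_b: int, size_common: int) -> float:
    """Tanimoto-style coefficient: common size over the size of the union."""
    if size_common > min(size_a, size_b):
        raise ValueError("common subgraph cannot be larger than either graph")
    denominator = size_a + size_b - size_common
    if denominator <= 0:
        raise ValueError("at least one graph must be non-empty")
    return size_common / denominator


def squared_ratio_similarity(size_a: int, size_b: int, size_common: int) -> float:
    """Squared common size divided by the product of the two graph sizes."""
    if size_a <= 0 or size_b <= 0:
        raise ValueError("both graphs must be non-empty")
    if size_common > min(size_a, size_b):
        raise ValueError("common subgraph cannot be larger than either graph")
    return size_common ** 2 / (size_a * size_b)
```

Both coefficients equal 1 when the two graphs are identical and fall toward 0 as the common subgraph shrinks; the size of the common subgraph itself would have to come from an MCS algorithm, which is not shown.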

3.
The shortest common supersequence (SCS) problem is a classical NP-hard problem, which is normally solved by heuristic algorithms. One important heuristic that is inspired by the process of chemical reactions in nature is the chemical reaction optimization (CRO) and its algorithm known as CRO_SCS. In this paper we propose a novel CRO algorithm, dubbed IMCRO, to solve the SCS problem efficiently. Two new operators are introduced in two of the four reactions of the CRO: a new circular shift operator is added to the decomposition reaction, and a new two-step crossover operator is included in the inter-molecular ineffective collision reaction. Experimental results show that IMCRO achieves better performance on random and real sequences than well-known heuristic algorithms such as the ant colony optimization, deposition and reduction, enhanced beam search, and CRO_SCS. Additionally, it outperforms its baseline CRO_SCS for DNA instances, averaging an SCS length reduction of 1.02, with a maximum length reduction of up to 2.1.
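
To make the problem statement concrete, the sketch below solves the SCS problem exactly for the special case of two sequences, which is solvable by dynamic programming; the NP-hardness mentioned in the abstract concerns an arbitrary number of sequences. This is not IMCRO or CRO_SCS, and the function names are illustrative.

```python
def shortest_common_supersequence(a: str, b: str) -> str:
    """Return one shortest string that contains both a and b as subsequences."""
    m, n = len(a), len(b)
    # dp[i][j] holds the SCS length of the prefixes a[:i] and b[:j].
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dp[i][0] = i
    for j in range(n + 1):
        dp[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if a[i - 1] == b[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = 1 + min(dp[i - 1][j], dp[i][j - 1])

    # Walk back through the table to rebuild one optimal supersequence.
    out = []
    i, j = m, n
    while i > 0 and j > 0:
        if a[i - 1] == b[j - 1]:
            out.append(a[i - 1])
            i -= 1
            j -= 1
        elif dp[i - 1][j] <= dp[i][j - 1]:
            out.append(a[i - 1])
            i -= 1
        else:
            out.append(b[j - 1])
            j -= 1
    out.extend(reversed(a[:i]))  # leftover prefix of a, in reverse order
    out.extend(reversed(b[:j]))  # leftover prefix of b, in reverse order
    return "".join(reversed(out))


def is_subsequence(s: str, t: str) -> bool:
    """True if s can be obtained from t by deleting characters."""
    remaining = iter(t)
    return all(ch in remaining for ch in s)
```

A heuristic for many sequences can be scored with the same `is_subsequence` check: a candidate is valid only if every input sequence is a subsequence of it, and shorter valid candidates are better.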

4.
This paper reports a method for the identification of those molecules in a database of rigid 3D structures with molecular electrostatic potential (MEP) grids that are most similar to that of a user-defined target molecule. The most important features of an MEP grid are encoded in field-graphs, and a target molecule is matched against a database molecule by a comparison of the corresponding field-graphs. The matching is effected using a maximal common subgraph isomorphism algorithm, which provides an alignment of the target molecule's field-graph with those of each of the database molecules in turn. These alignments are used in the second stage of the search algorithm to calculate the intermolecular MEP similarities. Several different ways of generating field-graphs are evaluated, in terms of the effectiveness of the resulting similarity measures and of the associated computational costs. The most appropriate procedure has been implemented in an operational system that searches a corporate database, containing ca. 173,000 3D structures.

5.
A QSAR model for predicting the blood-brain barrier permeability (BBBP) of a large and heterogeneous set of compounds (136 compounds) has been developed using approximate similarity (AS) matrices as predictors and PLS as the multivariate regression technique. AS values fuse information on both the isomorphic similarity and the nonisomorphic dissimilarity with the purpose of achieving an accurate predictive space. In addition to applying AS values to heterogeneous data sets, a new concept of graph isomorphism based on the extended maximum common subgraph (EMCS) is defined for building AS spaces that consider the atoms and bonds which are bridges between the isomorphic and nonisomorphic substructures. This new isomorphism detection aims to take into account the position and nature of the nucleus substituents, thus allowing the development of accurate models for large and diverse sets of compounds. After an outlier study, the training and test stages were carried out and the results obtained using several AS approaches were compared. Several validation processes were carried out using several test sets, and high predictive ability was obtained in all cases (Q² = 0.81 and standard error in prediction, SEP = 0.29).

6.
Summary: We have developed a program, HookSpace, which provides a simplistic approach to assessing the diversity of molecular databases. The spatial relationship between pairs of intramolecular functional groups can be analysed in a variety of ways to provide both qualitative and quantitative measures of diversity. Results are described and contrasted for two commercially available databases and a combinatorial library of benzodiazepam derivatives. HookSpace highlights the main differences in molecular content of these data sets.

7.
In this paper, we propose a new method for clustering chemical databases based on the representation of structural similarity measurements in multidimensional spaces. The proposed method permits the tuning of the clustering process through the selection of the dimension of the projection space, the normal vectors and the sensitivity of the projection process. The structural similarity of each element with respect to the database elements is projected onto the defined spaces, generating clusters that represent the characteristics and diversity of the database and whose size and characteristics can be easily adjusted.

8.
Summary: This paper reports a comparison of several methods for measuring the degree of similarity between pairs of 3-D chemical structures that are represented by inter-atomic distance matrices. The methods that have been tested use the distance information in very different ways and have very different computational requirements. Experiments with 10 small datasets, for which both structural and biological activity data are available, suggest that the most cost-effective technique is based on a mapping procedure that tries to match pairs of atoms, one from each of the molecules that are being compared, that have neighbouring atoms at approximately the same distances.

9.
An arsenic chemical speciation study was performed in 2000, using air filters on which total suspended particles (TSP) were collected, from the city of Huelva, a medium-sized city with huge industrial influence in SW Spain. Different procedures for extraction of the arsenic species were performed using water, NH2OH·HCl, and H3PO4 solutions, with either microwave or ultrasonic radiation. The best optimised extraction methods were use of 100 mmol L⁻¹ NH2OH·HCl and 10 mmol L⁻¹ H3PO4 and microwave radiation for 4 min. High-performance liquid chromatography coupled with hydride generation and atomic fluorescence spectrometry (HPLC–HG–AFS) was employed for determination of the arsenic species. The results from 12 TSP air filters collected on a monthly basis showed extraction was quantitative (94% with NH2OH·HCl and 86% with H3PO4). Only inorganic arsenic species (arsenite and arsenate) were detected. The mean arsenite concentration was 1.2±0.3 ng m⁻³ (minimum 0.3 ng m⁻³, maximum 1.8 ng m⁻³). The mean arsenate concentration was 10.4±1.8 ng m⁻³, with greater monthly variations than arsenite (minimum 2.1 ng m⁻³, maximum 30.6 ng m⁻³). The high level of arsenic species in the TSP samples can be related to a copper smelter located in the region.

10.
A variety of dipyrromethanes and dipyrromethenes have been prepared, and their 15N NMR chemical shifts have been measured by two-dimensional correlation to 1H NMR signals. The nitrogen atoms in five examples of dipyrromethanes consistently exhibit chemical shifts around -231 ppm, relative to nitromethane. Seven examples of hydrobromide salts of meso-unsubstituted dipyrromethenes consistently display 15N chemical shifts around -210 ppm, while their corresponding zinc(II) complexes exhibit chemical shifts around -170 ppm. The presence of electron-withdrawing substituents on one of the pyrrolic rings of dipyrromethenes affects the chemical shifts of both of the nitrogen nuclei in the molecule. Boron difluoride complexes of meso-unsubstituted dipyrromethenes display 15N chemical shifts around -190 ppm. Two examples of free-base dipyrromethenes bearing substituents at the meso-position exhibit 15N chemical shifts at approximately -156 ppm, and for the zinc complexes of these compounds at -162 ppm. One-bond nitrogen-hydrogen coupling constants, when measurable, were consistently in the range of -96 Hz. Since the measured 15N chemical shifts have such a high regularity correlated to structure, they can be used as diagnostic indications for identifying the structure of dipyrrolic compounds.

11.
Background: The translation or stability of the mRNAs from ferritin, m-aconitase, erythroid aminolevulinate synthase and the transferrin receptor is controlled by the binding of two iron regulatory proteins to a family of hairpin-forming RNA sequences called iron-responsive elements (IREs). The determination of high-resolution nuclear magnetic resonance (NMR) structures of IRE variants suggests an unusual hexaloop structure, leading to an intra-loop G-C base pair and a highly exposed loop guanine, and a special internal loop/bulge in the ferritin IRE involving a shift in base pairing not predicted with standard algorithms. Results: Cleavage of synthetic 55- and 30-mer RNA oligonucleotides corresponding to the ferritin IRE with complexes based on oxoruthenium(IV) shows enhanced reactivity at a hexaloop guanine and at a guanine adjacent to the internal loop/bulge with strong protection at a guanine in the internal loop/bulge. These results are consistent with the recent NMR structures. The synthetic 55-mer RNA binds the iron-regulatory protein from rabbit reticulocyte lysates. The DNA analogs of the 55- and 30-mers do not show the same reactivity pattern. Conclusions: The chemical reactivity of the guanines in the ferritin IRE towards oxoruthenium(IV) supports the published NMR structures and the known oxidation chemistry of the metal complexes. The results constitute progress towards developing stand-alone chemical nucleases that reveal significant structural properties and provide results that can ultimately be used to constrain molecular modeling.

12.
Five probes, including four that contained an isoprenoid chain, were synthesized. These probes were assembled onto gold-coated quartz crystal chips for analysis of their interactions with four yeast proteins by using quartz crystal microbalance technology. Results showed that 3-phosphoglycerate phosphokinase and triosephosphate isomerase had clear interactions with certain probes, while glutathione reductase and phosphoglucose isomerase gave much lower interaction signals. The results also suggested that 3-phosphoglycerate phosphokinase had two sites interacting with the probe bearing a geranyl moiety. Further molecular simulation experiments provided supportive information on these intermolecular interactions. Together, our data suggested that there are hydrophobic interactions, with relatively good selectivity, between the isoprenoid chain and proteins.

13.
We investigate the classification performance of circular fingerprints in combination with the Naive Bayes Classifier (MP2D), Inductive Logic Programming (ILP) and Support Vector Inductive Logic Programming (SVILP) on a standard molecular benchmark dataset comprising 11 activity classes and about 102,000 structures. The Naive Bayes Classifier treats features independently while ILP combines structural fragments, and then creates new features with higher predictive power. SVILP is a very recently presented method which adds a support vector machine after common ILP procedures. The performance of the methods is evaluated via a number of statistical measures, namely recall, specificity, precision, F-measure, Matthews Correlation Coefficient, area under the Receiver Operating Characteristic (ROC) curve and enrichment factor (EF). According to the F-measure, which takes both recall and precision into account, SVILP is the superior method for seven of the 11 classes. The results show that the Bayes Classifier gives the best recall performance for eight of the 11 targets, but has a much lower precision, specificity and F-measure. The SVILP model, on the other hand, has the highest recall for only three of the 11 classes, but generally far superior specificity and precision. To evaluate the statistical significance of the SVILP superiority, we employ McNemar's test, which shows that SVILP performs significantly (p < 5%) better than both other methods for six out of 11 activity classes, while being superior with less significance for three of the remaining classes. While previously the Bayes Classifier was shown to perform very well in molecular classification studies, these results suggest that SVILP is able to extract additional knowledge from the data, thus improving classification results further.
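
Several of the measures named above can be computed directly from a binary confusion matrix. The sketch below uses the standard textbook definitions; it is not the authors' evaluation code, the function names are illustrative, and the McNemar statistic is shown without a continuity correction, which is an assumption because the abstract does not say which variant was used.

```python
import math


def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Recall, specificity, precision, F-measure and MCC from confusion counts."""
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # F-measure: harmonic mean of precision and recall.
    f_measure = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    # Matthews Correlation Coefficient; defined as 0 when a margin is empty.
    margin_product = (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    mcc = (tp * tn - fp * fn) / math.sqrt(margin_product) if margin_product else 0.0
    return {
        "recall": recall,
        "specificity": specificity,
        "precision": precision,
        "f_measure": f_measure,
        "mcc": mcc,
    }


def mcnemar_statistic(only_a_correct: int, only_b_correct: int) -> float:
    """Chi-square statistic (1 degree of freedom) for two paired classifiers.

    only_a_correct: cases classifier A got right and B got wrong.
    only_b_correct: cases classifier B got right and A got wrong.
    """
    discordant = only_a_correct + only_b_correct
    if discordant == 0:
        return 0.0
    return (only_a_correct - only_b_correct) ** 2 / discordant
```

The McNemar statistic uses only the cases on which the two classifiers disagree; a value above about 3.84 corresponds to p < 0.05 for a chi-square distribution with one degree of freedom.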

14.
A molecular modeling review of the X-ray crystallographically determined structures of some proteins and polypeptides, from the Brookhaven Protein Data Bank, has enabled us to identify chemically reactive, weak, amidic linkages in some of these molecules. This discovery should add new dimensions to the discussion of the significance of the tertiary structures of proteins and polypeptides, and to the chemistry of these polymers.

15.
An efficient structure filtration method for working with chemical databases containing information on the structures and properties of organic molecules is proposed. The technique involves the use of electronegativity indices for generation of identification keys and for isomorphism tests of the molecular graphs corresponding to the structural formulas. The test set for the proposed method included a total of 95,000,000 molecules containing up to sixty carbon atoms. Tests revealed a high discriminating capability of the electronegativity indices and high efficiency of the method for solving both general problems (recognition of chemical structures, chemical database management systems) and specific tasks (generation of molecular graphs, etc.) in chemical informatics. Dedicated to Academician N. S. Zefirov on the occasion of his 70th birthday. Published in Russian in Izvestiya Akademii Nauk. Seriya Khimicheskaya, No. 9, pp. 2166–2176, September, 2005.

16.
17.
18.
A new approach for virtual characterization of the active site structure of enzymes with unknown three-dimensional (3D) structure has been proposed. It includes analysis of data on enzyme interaction with reversible competitive inhibitors, their 3D structures and moulding of the substrate-binding region. The superposition of ligands in biologically active conformations makes it possible to determine the shape and dimensions of the active site cavity accommodating these compounds. Monoamine oxidase A (MAO-A), a “typical” enzyme with unknown spatial organisation, was used to test this method. The correctness of this approach was validated by the analysis of HIV protease interaction with its inhibitors using 3D structures of their complexes. A mould of the substrate/inhibitor binding site can be used for the visualization of this binding site and for searching for new ligands in molecular databases.

19.
The chemical composition obviously affects the surface wettability of a three-dimensional (3D) graphene material apart from its surface energy and microstructure. In the hydrothermal preparation, the heteroatom doping changes the chemical composition and wettability of the 3D graphene material. To realize the controllable surface wettability of graphene materials, aminobenzene sulfonic acid (ABSA) was selected as a typical doping agent for the preparation of nitrogen and sulfur co-doped 3D graphene foa...

20.
A quantum chemical method for rapid optimization of protein structures is proposed. In this method, a protein structure is treated as an assembly of amino acid units, and the geometry optimization of each unit is performed taking the effect of its surrounding environment into account. The optimized geometry of a whole protein is obtained by repeated application of such a local optimization procedure over the entire protein. Here, we implemented this method in the MOPAC program and performed geometry optimization for proteins of three different sizes. The results demonstrate that the total energies of the proteins are minimized much more efficiently than with conventional optimization methods, including the MOZYME algorithm (a representative linear-scaling method) with the BFGS routine. The proposed method is superior to the conventional methods in both CPU time and memory requirements.
