Similar literature
 Found 20 similar articles; search time 31 ms
1.
Given a database D of three-dimensional (3D) molecular structures and a target molecule Q, the similarity search problem is to find the molecules O in D that match Q after allowing for an arbitrary number of whole-structure rotations and translations as well as a certain number of edit operations. The edit operations include relabeling an atom, deleting an atom, and inserting an atom. This search operation arises in many biochemical applications. In this paper we study the similarity search problem and a class of related queries. We present a computer vision based technique, called geometric hashing, for processing these queries. Experimental results on a database of 3D molecular structures obtained from the National Cancer Institute indicate the good performance of the presented technique.
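
The abstract above names geometric hashing without detailing it, so the following is only a minimal sketch of the general technique, not the authors' implementation. It assumes each molecule is a list of `(label, (x, y, z))` atoms, uses every ordered non-collinear atom triple as a reference frame, and quantizes the frame coordinates of the remaining atoms with a hypothetical bin size `BIN`. All function names are illustrative, and the edit operations (relabel, delete, insert) are not handled.

```python
import itertools
import math
from collections import Counter, defaultdict

BIN = 0.5  # quantization step (assumed), in the same length unit as the coordinates


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _unit(a):
    n = math.sqrt(_dot(a, a))
    return None if n < 1e-9 else (a[0] / n, a[1] / n, a[2] / n)


def frame(p0, p1, p2):
    """Right-handed orthonormal frame built from an ordered, non-collinear triple."""
    x = _unit(_sub(p1, p0))
    if x is None:
        return None
    z = _unit(_cross(x, _sub(p2, p0)))
    if z is None:
        return None
    return x, _cross(z, x), z


def keys(atoms, basis):
    """Hash keys (label plus quantized frame coordinates) of all non-basis atoms."""
    f = frame(*(atoms[i][1] for i in basis))
    if f is None:
        return []
    origin = atoms[basis[0]][1]
    out = []
    for n, (label, pos) in enumerate(atoms):
        if n in basis:
            continue
        d = _sub(pos, origin)
        out.append((label,) + tuple(round(_dot(d, axis) / BIN) for axis in f))
    return out


def build_table(database):
    """Preprocessing: store every (molecule, basis) pair under each of its keys."""
    table = defaultdict(set)
    for name, atoms in database.items():
        for basis in itertools.permutations(range(len(atoms)), 3):
            for key in keys(atoms, basis):
                table[key].add((name, basis))
    return table


def query(table, atoms, basis=(0, 1, 2)):
    """Recognition: vote with one query basis; return the best vote count per molecule."""
    votes = Counter()
    for key in keys(atoms, basis):
        for entry in table.get(key, ()):
            votes[entry] += 1
    best = Counter()
    for (name, _), count in votes.items():
        best[name] = max(best[name], count)
    return best


def rigid_move(atoms, angle, shift):
    """Helper for experiments: rotate about z by `angle` radians, then translate."""
    c, s = math.cos(angle), math.sin(angle)
    return [(lab, (c * x - s * y + shift[0], s * x + c * y + shift[1], z + shift[2]))
            for lab, (x, y, z) in atoms]
```

Because the keys are expressed in a frame attached to the molecule itself, a rotated and translated copy of a stored molecule produces the same keys and collects the maximum number of votes. A practical system would try several query bases and tolerate atoms that fall near bin boundaries.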

2.
Summary: In this paper a database of atomic residual charges has been constructed for all the molecular fragments defined previously in a combinatorial search of the Cambridge Structural Database. The charges generated for the atoms in each fragment are compared with charges calculated for whole molecules containing those fragments. The fragment atomic charges lie within 1 S.D. of the mean for 68%, and within 2 S.D. for 91%, of the atoms whose charges were computed for whole molecules. The actual charges on any atom are strongly influenced by the adjacent connected atoms. There is a large spread of atomic residual charge within the fragments database.

3.
The potential of the combined use of ESI–QqTOF-MS and ESI–QqTOF-MS/MS with mass-spectral library search for the identification of therapeutic and illicit drugs has been evaluated. Reserpine was used for standardizing experimental conditions and for characterization of the performance of the applied mass spectrometric system. Experiments revealed that because of the mass accuracy, the stability of calibration, and the reproducibility of fragmentation, the QqTOF mass spectrometer is an appropriate platform for establishment of a tandem-mass-spectral library. Three hundred and nineteen substances were used as reference samples to build the spectral library. For each reference compound, product-ion spectra were acquired at ten different collision-energy values between 5 eV and 50 eV. For identification of unknown compounds, a library search algorithm was developed. The closeness of matching between a measured product-ion spectrum and a spectrum stored in the library was characterized by a value called “match probability”, which took into account the number of matched fragment ions, the number of fragment ions observed in the two spectra, and the sum of the intensity differences calculated for matching fragments. A large value for the match probability indicated a close match between the measured and the reference spectrum. A unique feature of the library search algorithm, an implemented spectral purification option, enables characterization of multi-contributor fragment-ion spectra. With the aid of this software feature, substances comprising only 1.0% of the total amount of binary mixtures were unequivocally assigned, in addition to the isobaric main contributors. The spectral library was successfully applied to the characterization of 39 forensic casework samples.

4.
5.
Herein, we describe a method to flexibly align molecules (FLAME = FLexibly Align MolEcules). FLAME aligns two molecules by first finding maximum common pharmacophores between them using a genetic algorithm. The resulting alignments are then subjected to simultaneous optimizations of their internal energies and an alignment score. The utility of the method in pairwise alignment, multiple molecule flexible alignment, and database searching was examined. For pairwise alignment, two carboxypeptidase ligands (Protein Data Bank codes and ), two estrogen receptor ligands ( and ), and two thrombin ligands ( and ) were used as test sets. Alignments generated by FLAME starting from CONCORD structures compared very well to the X-ray structures (average root-mean-square deviation = 0.36 Å) even without further minimization in the presence of the protein. For multiple flexible alignments, five structurally diverse D3 receptor ligands were used as a test set. The FLAME alignment automatically identified three common pharmacophores: a base, a hydrogen-bond acceptor, and a hydrophobe/aromatic ring. The best alignment was then used to search the MDDR database. The search results were compared to the results using atom pair and Daylight fingerprint similarity. A similar database search comparison was also performed using estrogen receptor modulators. In both cases, hits identified by FLAME were structurally more diverse than those from the atom pair and Daylight fingerprint methods.

6.
Summary: If atom assignment onto 3D molecular graphs is to be optimized, an efficient scheme for placement must be developed. The strategy adopted in this paper is to analyze the molecular graphs in terms of cyclical and non-cyclical nodes; the latter are further divided into terminal and non-terminal nodes. Molecular fragments, from a fragments database, are described in a similar way. A canonical numbering scheme for the fragments and the local subgraph of the molecular graph enables fragments to be placed efficiently onto the molecular graph. Further optimization is achieved by placing similar fragments into bins using a hashing scheme based on the canonical numbering. The graph perception algorithm is illustrated in detail.

7.
8.
An extension of the Kick program developed by Bera et al. (J Phys Chem A 2006, 110, 4287) is described in which chemically sensible molecular fragments are used in an automated stochastic search algorithm. This results in a vastly reduced region of the potential energy surface, which can be explored very quickly. We apply this modified algorithm to the search for low-lying isomers and present candidates for the global energy minimum for a range of chemical systems. We highlight the usefulness of this procedure for exploring reactions of molecules with transition metal clusters and the microsolvation of a small dipeptide.

9.
10.
A procedure for monitoring the identification performance of GC–MS instrumentation as applied to herbal remedies was established, using eugenol (extracted from cloves) as a control compound. The following parameters were monitored: retention time (acceptable variability 0.5%); signal-to-noise ratio (at least 40% of the initial value); ion intensity ratio (acceptable variability 20%); and identity search result (reverse match, with a minimum match value of 850 for quadrupole instruments and 800 for ion trap instruments). Other candidates for control compounds (pulegone, caffeine, and methoxsalen) as well as other parameters (relative retention time, second ion intensity ratio, peak area, and direct match) did not give any additional information concerning variability, observed trends, and sensitivity.
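
The four monitored parameters above can be read as simple pass/fail checks. The sketch below is one possible interpretation, not the published procedure: it assumes that "acceptable variability" means relative deviation from a reference (initial) value, and the function name `qc_check` and the dictionary keys are hypothetical.

```python
# Minimum reverse-match values quoted in the abstract, per instrument type.
MIN_REVERSE_MATCH = {"quadrupole": 850, "ion_trap": 800}


def qc_check(reference, run, instrument="quadrupole"):
    """Compare one control-compound run against reference values.

    `reference` and `run` are dicts with the keys 'retention_time',
    'signal_to_noise' and 'ion_ratio'; `run` also carries 'reverse_match'.
    Returns a dict mapping each parameter to True (pass) or False (fail).
    """
    def rel_dev(key):
        # Relative deviation from the reference value (assumed meaning of "variability").
        return abs(run[key] - reference[key]) / reference[key]

    return {
        "retention_time": rel_dev("retention_time") <= 0.005,   # within 0.5%
        "signal_to_noise": run["signal_to_noise"] >= 0.40 * reference["signal_to_noise"],
        "ion_ratio": rel_dev("ion_ratio") <= 0.20,               # within 20%
        "reverse_match": run["reverse_match"] >= MIN_REVERSE_MATCH[instrument],
    }
```

A run with a reverse match of 820, for instance, would fail this check on a quadrupole instrument but pass it on an ion trap instrument.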

11.
GANDI (Genetic Algorithm-based de Novo Design of Inhibitors) is a computational tool for automatic fragment-based design of molecules within a protein binding site of known structure. A genetic algorithm and a tabu search act in concert to join predocked fragments with a user-supplied list of fragments. A novel feature of GANDI is the simultaneous optimization of force field energy and a term enforcing 2D-similarity to known inhibitor(s) or 3D-overlap to known binding mode(s). Scaffold hopping can be promoted by tuning the relative weights of these terms. The performance of GANDI is tested on cyclin-dependent kinase 2 (CDK2) using a library of about 14 000 fragments and the binding mode of a known oxindole inhibitor to bias the design. Top ranking GANDI molecules are involved in one to three hydrogen bonds with the backbone polar groups in the hinge region of CDK2, an interaction pattern observed in potent kinase inhibitors. Notably, a GANDI molecule with very favorable predicted binding affinity shares a 2-N-phenyl-1,3-thiazole-2,4-diamine moiety with a known nanomolar inhibitor of CDK2. Importantly, molecules with a favorable GANDI score are synthetically accessible. In fact, eight of the 1809 molecules designed by GANDI for CDK2 are found in the ZINC database of commercially available compounds, which also contains about 600 compounds with scaffolds identical to those in the top ranking GANDI molecules.

12.
13.
Summary: We present a system, FLOG (Flexible Ligands Oriented on Grid), that searches a database of 3D coordinates to find molecules complementary to a macromolecular receptor of known 3D structure. The philosophy of FLOG is similar to that reported for DOCK [Shoichet, B.K. et al., J. Comput. Chem., 13 (1992) 380]. In common with that system, we use a match center representation of the volume of the binding cavity and we use a clique-finding algorithm to generate trial orientations of each candidate ligand in the binding site. Also we use a grid representation of the receptor to assess the fit of each orientation. We have introduced a number of novel features within this paradigm. First, we address ligand flexibility by including up to 25 explicit conformations of each structure in our databases. Nonhydrogen atoms in each database entry are assigned one of seven atom types (anion, cation, donor, acceptor, polar, hydrophobic and other) based on their local bonded chemical environments. Second, we have devised a new grid-based scoring function compatible with this heavy atom representation of the ligands. This includes several potentials (electrostatic, hydrogen bonding, hydrophobic and van der Waals) calculated from the location of the receptor atoms. Third, we have improved the fitting stage of the search. Initial dockings are generated with a more efficient clique-finding algorithm. This new algorithm includes the concept of essential points, match centers that must be paired with a ligand atom. Also, we introduce the use of a rapid simplex-based rigid-body optimizer to refine the orientations. We demonstrate, using dihydrofolate reductase as a sample receptor, that the FLOG system can select known inhibitors from a large database of drug-like compounds.

14.
StrucEluc is an expert system that allows the computer-assisted elucidation of chemical structures based on the inputs of a series of spectral data including 1D and 2D NMR and mass spectra. The system has been enabled to allow a chemist to utilize fragments stored in a fragment database as well as user-defined fragments submitted by the chemist in the structure elucidation process. The association of fragments in this way has been shown to dramatically speed up the process of structure generation from 2D NMR data and has helped to minimize or eliminate the need for user intervention thereby further enabling the vision of automated elucidation. The use of fragments has frequently transformed very difficult 2D NMR elucidation challenges into easily solvable tasks. A strategy to utilize molecular fragments has been developed and optimized based on specific challenging examples. This strategy will be described here using real world examples. Experience gained by solving more than 150 structure elucidation problems from a variety of literature sources is also reviewed in this work.

15.
A method is proposed, on the basis of a recently developed algorithm, Band Target Entropy Minimization (BTEM), to reconstruct mass spectra of pure components from mixture spectra. This method is particularly useful in dealing with spectral data with discrete features (like mass spectra). Compared to the original BTEM, which has been applied to differentiable spectroscopies such as Fourier-transform infrared spectroscopy (FTIR), ultraviolet (UV), Raman, and nuclear magnetic resonance (NMR), the latest modifications were obtained through: (1) reformulating the objective function using the peak heights instead of their derivatives; (2) weighting the abstract vector V^T to reduce the effect of noise; (3) using a two-peak targeting strategy (tBTEM) to deal with strongly overlapping peaks; and (4) using exhaustive search to locate all the component spectra. A set of 50 multi-component mass spectra was generated from ten reference experimental pure component spectra. Many of the compounds chosen have common MS fragments and therefore, many of the pure component spectra have considerable intensity in the same data channels. In addition, a set of MS spectra from a real system with four components was used to examine the newly developed algorithm. Successful reconstruction of the ten component spectra of the simulated system and the four component spectra of the real system was rapidly achieved using the new tBTEM algorithm. The advantages of the new algorithm and its implications for rapid system identification of unknown mixtures are readily apparent.

16.
A new method for the computerized search and identification of infrared spectra has been developed and evaluated. Based on cross-correlation, the search system utilizes all spectral information in a digitized spectrum when it attempts to match an unknown spectrum to one in a small library of known spectra. To evaluate a spectral match, the search program calculates the cross-correlation function between the unknown and known (library) spectra which indicates their degree of similarity and allows library spectra to be ranked in order of probability of match to the unknown spectrum. In this study, several small infrared spectral libraries of structurally similar compounds were searched under conditions which examined the sensitivity of the search method to chemical and instrumental variations. Because the correlation technique is slower than conventional file-searching methods, it will probably find greatest use in the search of small collections of similar spectra or as a match-ranking procedure following preliminary selection by a faster search method.
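
The ranking idea in the abstract above can be sketched as follows. This is an illustrative version only: it assumes spectra are equal-length lists of intensities sampled on a common axis, and it scores a match by the peak of a mean-centered, normalized cross-correlation within a small lag window. The exact correlation function and scoring used by the authors are not given in the abstract, and the names `normalized_cross_correlation`, `rank_library` and `max_lag` are hypothetical.

```python
import math


def normalized_cross_correlation(a, b, max_lag=0):
    """Peak normalized cross-correlation of two equal-length spectra within ±max_lag points."""
    n = len(a)
    if n != len(b):
        raise ValueError("spectra must be sampled on the same axis")
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    norm = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    if norm == 0:
        return 0.0  # a flat spectrum carries no shape information
    best = -1.0
    for lag in range(-max_lag, max_lag + 1):
        # Sum the products over the overlapping region for this shift.
        s = sum(da[i] * db[i + lag] for i in range(n) if 0 <= i + lag < n)
        best = max(best, s / norm)
    return best


def rank_library(unknown, library, max_lag=0):
    """Return (name, score) pairs sorted from best to worst match."""
    scores = {name: normalized_cross_correlation(unknown, spectrum, max_lag)
              for name, spectrum in library.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

Every library spectrum is compared point by point, so the cost grows with both library size and spectrum length, which is consistent with the abstract's remark that the method suits small libraries or a second-pass ranking step.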

17.
Summary: If a method is to be developed to assemble putative ligand structures in site-directed drug design, from molecular graphs generated in the site, then basic building blocks are needed. Structure assembly is a combinatoric process that needs to be optimised if it is to be tractable. What has to be determined is whether small molecular fragments can have transferable properties from one molecule to another. In this paper we determine all possible combinations of 3-, 4- and 5-atom aliphatic fragments from a small set of atoms H, C, N, O, F or Cl. The frequency of occurrence of these candidate fragments is searched for in the Cambridge Structural Database. A similar analysis is performed on charged fragments. A more restricted search is carried out for P and S and aromatic structures. A basic set of fragments can be derived that have a significant frequency in known crystal structures. The transferability of fragment properties is discussed in subsequent papers.

18.
19.
A tandem mass spectral database system consists of a library of reference spectra and a search program. State‐of‐the‐art search programs show a high tolerance for variability in compound‐specific fragmentation patterns produced by collision‐induced decomposition and enable sensitive and specific ‘identity search’. In this communication, performance characteristics of two search algorithms combined with the ‘Wiley Registry of Tandem Mass Spectral Data, MSforID’ (Wiley Registry MSMS, John Wiley and Sons, Hoboken, NJ, USA) were evaluated. The search algorithms tested were the MSMS search algorithm implemented in the NIST MS Search program 2.0g (NIST, Gaithersburg, MD, USA) and the MSforID algorithm (John Wiley and Sons, Hoboken, NJ, USA). Sample spectra were acquired on different instruments and, thus, covered a broad range of possible experimental conditions or were generated in silico. For each algorithm, more than 30 000 matches were performed. Statistical evaluation of the library search results revealed that, in principle, both search algorithms can be combined with the Wiley Registry MSMS to create a reliable identification tool. It appears, however, that a higher degree of spectral similarity is necessary to obtain a correct match with the NIST MS Search program. This characteristic of the NIST MS Search program has a positive effect on specificity as it helps to avoid false positive matches (type I errors), but reduces sensitivity. Thus, particularly with sample spectra acquired on instruments differing in their setup from tandem‐in‐space type fragmentation, a comparatively higher number of false negative matches (type II errors) were observed by searching the Wiley Registry MSMS. Copyright © 2013 John Wiley & Sons, Ltd.
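
The trade-off between false positives (type I errors) and false negatives (type II errors) described above can be made concrete with the standard definitions of sensitivity and specificity. The helper below is a generic sketch. The function name is hypothetical, and any counts passed to it are the reader's own; none are taken from the study.

```python
def search_performance(tp, fp, tn, fn):
    """Summarize library-search outcomes from confusion-matrix counts.

    tp: correct matches reported; fn: correct matches missed (type II errors);
    fp: wrong matches reported (type I errors); tn: wrong matches correctly rejected.
    """
    return {
        "sensitivity": tp / (tp + fn),          # fraction of true matches found
        "specificity": tn / (tn + fp),          # fraction of non-matches rejected
        "false_positive_rate": fp / (fp + tn),  # type I error rate
        "false_negative_rate": fn / (fn + tp),  # type II error rate
    }
```

A stricter similarity threshold typically moves counts from `fp` to `tn` and from `tp` to `fn`, raising specificity while lowering sensitivity, which is the pattern the abstract reports for the NIST MS Search program.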

20.
The DOCK program explores possible orientations of a molecule within a macromolecular active site by superimposing atoms onto precomputed site points. Here we compare a number of different search methods, including an exhaustive matching algorithm based on a single docking graph. We evaluate the performance of each method by screening a small database of molecules to a variety of macromolecular targets. By varying the amount of sampling, we can monitor the time convergence of scores and rankings. We not only show that the site point–directed search is tenfold faster than a random search, but that the single graph matching algorithm boosts the speed of database screening up to 60-fold. The new algorithm, in fact, outperforms the bipartite graph matching algorithm currently used in DOCK. The results indicate that a critical issue for rapid database screening is the extent to which a search method biases run time toward the highest-ranking molecules. The single docking graph matching algorithm will be incorporated into DOCK version 4.0. © 1997 John Wiley & Sons, Inc. J Comput Chem 18: 1175–1189
