Similar literature
 20 similar articles found
1.
We have systematically enumerated graph representations of scaffold topologies for up to eight-ring molecules and four-valence atoms, thus providing coverage of the lower portion of the chemical space of small molecules (Pollock et al. J. Chem. Inf. Model., this issue). Here, we examine scaffold topology distributions for several databases: ChemNavigator and PubChem for commercially available chemicals, the Dictionary of Natural Products, a set of 2742 launched drugs, WOMBAT, a database of medicinal chemistry compounds, and two subsets of PubChem, "actives" and DSSTox comprising toxic substances. We also examined a virtual database of exhaustively enumerated small organic molecules, GDB (Fink et al. Angew. Chem., Int. Ed. 2005, 44, 1504-1508), and we contrast the scaffold topology distribution from these collections to the complete coverage of up to eight-ring molecules. For reasons related, perhaps, to synthetic accessibility and complexity, scaffolds exhibiting six rings or more are poorly represented. Among all collections examined, PubChem has the greatest scaffold topological diversity, whereas GDB is the most limited. More than 50% of all entries (13 000 000+ actual and 13 000 000+ virtual compounds) exhibit only eight distinct topologies, one of which is the nonscaffold topology that represents all treelike structures. However, most of the topologies are represented by a single or very small number of examples. Within topologies, we found that three-way scaffold connections (3-nodes) are much more frequent than four-way (4-node) connections. Fused rings have a slightly higher frequency in biologically oriented databases. Scaffold topologies can be the first step toward an efficient coarse-grained classification scheme of the molecules found in chemical databases.
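As a rough illustration of the terms used in this abstract, the sketch below strips the treelike parts of a molecular graph and then reports the ring count and the number of three-way and four-way connections in what remains. It is not the authors' enumeration code: the edge-list input, the function names, and the use of the cyclomatic number (bonds minus atoms plus components) as the ring count are assumptions made for illustration.

```python
from collections import Counter

def prune_to_scaffold(edges):
    """Repeatedly remove atoms with at most one neighbour (treelike parts)."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    leaves = [n for n, nb in adj.items() if len(nb) <= 1]
    while leaves:
        n = leaves.pop()
        if n not in adj:  # already removed earlier
            continue
        for m in adj.pop(n):
            adj[m].discard(n)
            if len(adj[m]) <= 1:
                leaves.append(m)
    return adj

def describe_scaffold(edges):
    """Ring count and 3-node / 4-node counts of the pruned graph."""
    adj = prune_to_scaffold(edges)
    n_atoms = len(adj)
    n_bonds = sum(len(nb) for nb in adj.values()) // 2
    # Count connected components with a depth-first search.
    seen, components = set(), 0
    for start in adj:
        if start in seen:
            continue
        components += 1
        seen.add(start)
        stack = [start]
        while stack:
            x = stack.pop()
            for y in adj[x]:
                if y not in seen:
                    seen.add(y)
                    stack.append(y)
    degrees = Counter(len(nb) for nb in adj.values())
    return {
        "rings": n_bonds - n_atoms + components,
        "3-nodes": degrees[3],
        "4-nodes": degrees[4],
    }

# Hypothetical inputs: heavy-atom skeletons given as lists of bonded atom pairs.
naphthalene = [(i, (i + 1) % 10) for i in range(10)] + [(0, 5)]
toluene = [(i, (i + 1) % 6) for i in range(6)] + [(0, 6)]
butane = [(0, 1), (1, 2), (2, 3)]
spiropentane = [(0, 1), (1, 2), (2, 0), (0, 3), (3, 4), (4, 0)]
```

In this sketch a molecule such as butane prunes to an empty graph, which corresponds to the nonscaffold (treelike) case, while the fused ring system of naphthalene keeps two 3-nodes and the spiro centre of spiropentane is a 4-node.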

2.
3.
4.
Generative topographic mapping (GTM) has been used to visualize and analyze the chemical space of antimalarial compounds as well as to build predictive models linking structure of molecules with their antimalarial activity. For this, a database, including ~3000 molecules tested in one or several of 17 anti-Plasmodium activity assessment protocols, has been compiled by assembling experimental data from in-house and ChEMBL databases. GTM classification models built on subsets corresponding to individual bioassays perform similarly to the earlier reported SVM models. Zones preferentially populated by active and inactive molecules, respectively, clearly emerge in the class landscapes supported by the GTM model. Their analysis resulted in identification of privileged structural motifs of potential antimalarial compounds. Projection of marketed antimalarial drugs on this map allowed us to delineate several areas in the chemical space corresponding to different mechanisms of antimalarial activity. This helped us to suggest the mode of action of the molecules populating these zones.

5.
Consideration of stereochemistry early in the identification and optimization of lead compounds can improve the efficiency and efficacy of the drug discovery process and reduce the time spent on subsequent drug development. These improvements can result by focusing on specific enantiomers that have the desired potential therapeutic effect (eutomers), while removing from consideration enantiomers that may have no, or even undesirable, effects (distomers). A virtual screening campaign that correctly takes stereochemical information into account can, in theory, be utilized to provide information about the relative binding affinities of enantiomers. Thus, the proper enumeration of the relevant stereoisomers in general, and enantiomeric pairs in particular, of chiral compounds is crucial if one is to use virtual screening as an effective drug discovery tool. Clearly, when no stereochemical information is provided for chiral compounds in a 2D chemical database, each possible stereoisomer should be generated for construction of the subsequent 3D database to be used for virtual screening. However, acute problems can arise in 3D database construction when relative stereochemistry is encoded in a 2D database for a chiral compound containing multiple stereogenic atoms but absolute stereochemistry is not implied. In this case, we report that generation of enantiomeric pairs is imperative in database development if one is to obtain accurate docking results. A study is described on the impact of the neglect of enantiomeric pairs on virtual screening using the human homolog of murine double minute 2 (MDM2) protein, the product of a proto-oncogene, as the target. Docking in MDM2 with GLIDE 4.0 was performed using the NCI Diversity Set 3D database and, for comparison, a set of enantiomers we created corresponding to mirror image structures of the single enantiomers of chiral compounds present in the NCI Diversity Set. Our results demonstrate that potential lead candidates may be overlooked when databases contain 3D structures representing only a single enantiomer of racemic chiral compounds.
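The two enumeration cases described in this abstract can be sketched with plain R/S labels. This is only an illustration under simplifying assumptions: stereocentres are treated as independent tetrahedral centres, the mirror image is obtained by flipping every label, and symmetry (for example meso forms) is ignored. The function names are hypothetical and no cheminformatics toolkit is used.

```python
from itertools import product

def mirror(labels):
    """Mirror image of a label tuple: every R becomes S and vice versa."""
    return tuple("S" if x == "R" else "R" for x in labels)

def enumerate_stereoisomers(n_centres):
    """No stereo information at all: generate all 2**n label assignments."""
    return list(product("RS", repeat=n_centres))

def enantiomeric_pairs(n_centres):
    """Group the full enumeration into mirror-image pairs."""
    pairs = {frozenset((iso, mirror(iso))) for iso in enumerate_stereoisomers(n_centres)}
    return sorted(tuple(sorted(p)) for p in pairs)

def expand_relative(labels):
    """Relative stereochemistry known, absolute unknown:
    keep the drawn form and add its mirror image."""
    labels = tuple(labels)
    return [labels, mirror(labels)]
```

For a compound with three stereocentres and no stereo annotation, `enumerate_stereoisomers(3)` yields eight candidates in four enantiomeric pairs; when only relative stereochemistry is recorded, `expand_relative` adds the single missing mirror image rather than the full enumeration.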

6.
Although chemical phenomena are primarily associated with electrons in atoms, ions, and molecules, the masses, charges, spins, and other properties of the nuclei in these species contribute significantly as well. Isotopes, for instance, have proven invaluable in chemistry, in particular the elucidation of reaction mechanisms. Elements with unstable nuclei, for example carbon-14 undergoing beta decay, have enriched chemistry and many other scientific disciplines. The nuclei of all elements have a much more subtle and largely unknown effect on chemical phenomena. All nuclei are innately chiral and, because electrons can penetrate nuclei, all atoms and molecules are likewise chiral. This article describes in considerable detail the discovery of chiral nuclei, how this unusual chirality may influence the chemical behavior of atoms and molecules, and how atomic chirality may have been responsible for the synthesis of optically active molecules in the pre-biotic world.

7.
An approximate method for calculating molecular electrostatic potential (MEP) maps and atomic point charge models for large molecules in a reduced computational time is proposed and tested for two widely used basis sets (STO-3G and 6-31G*). The method avoids the molecular orbital calculation of the whole system by expressing its first order electronic density matrix in terms of transferable localized orbitals (TLO), previously determined on model molecules, via a localization process followed by the cutting of the tails, and stored in two databases (one for each basis set). For systems with a canonic electronic structure TLO are made of a single vector, involving either two nuclei (to describe the covalent bond between those atoms) or one nucleus (to describe lone pairs and inner shells). Conversely, delocalized systems require many-center TLO, formed by a suitable number of vectors. Density functions of large chemical compounds can thus be built up automatically from a code that recognizes which fragments are contained in the system of interest, extracts them from the chosen database, reorders the atoms consistently with the pertinent TLO and places them in the correct position and orientation on the relevant atoms. A great number of chemical groups were parameterized and the efficiency of the method was evaluated on different systems, including aliphatic hydrocarbons. Numerical calculations on several molecules revealed that this approximation brought no significant loss of accuracy with respect to the corresponding Hartree-Fock (HF) values for the examined properties. Although the method is specifically designed to produce approximate wavefunctions, the point charge models obtained by fitting the corresponding MEP represent a viable alternative when ab initio HF calculations are not affordable, and can be used in connection with any popular force field. From the Proceedings of the 28th Congreso de Químicos Teóricos de Expresión Latina (QUITEL 2002)

8.
Protein-protein interactions are central to most biological processes and represent a large and important class of targets for human therapeutics. Small molecules containing peptide substituents may mimic regions of interacting proteins and inhibit their interactions. We set out to develop efficient methods to screen for similarities between known peptide structures within proteins and small molecules. We developed a method to rank peptide-compound similarities that is restricted to small linear motifs in proteins and to compounds containing amino acid substituents. Application to a search of the PubChem database (5.4 million compounds) using all short motifs on accessible surface areas in a nonredundant set of 11 488 peptides from the protein structure database PDB demonstrated the feasibility of the method for high throughput comparisons and the availability of compounds with comparable substituents: over 6 million compound-peptide pairs shared at least three amino acid substituents, approximately 100 000 of which had an rmsd score of less than 1 Å. A Z-score function was developed that compares matches of a compound to different instances of the peptide motif in PDB, providing an appropriate scoring function for comparison among peptide-compound similarities involving different numbers of atoms (while simultaneously enriching for similarities that are likely to be more specific for the protein of interest). We applied the method to searches of known short protein motifs against the National Cancer Institute Developmental Therapeutic Program compound database, identifying a known true positive.

9.
10.
A new type of molecular representation is introduced that is based on activity class characteristic substructures extracted from random fragment populations. Mapping of characteristic substructures is used to determine atom match rates in active molecules. Comparison of match rates of bonded atoms defines a hierarchical molecular fragmentation scheme. Active compounds are encoded as fragmentation pathways isolated from core trees. These paths are amenable to biological sequence alignment methods in combination with substructure-based scoring functions. From multiple core path alignments, consensus fragment sequences are derived that represent compound activity classes. Consensus fragment sequences weighted by increasing structural specificity can also be used to map molecules and search databases for active compounds.

11.
An efficient structure filtration method for working with chemical databases containing information on the structures and properties of organic molecules was proposed. The technique involves the use of electronegativity indices for generation of identification keys and for isomorphism tests of the molecular graphs corresponding to the structural formulas. The test set for the proposed method included a total of 95,000,000 molecules containing up to sixty carbon atoms. Tests revealed a high discriminating capability of the electronegativity indices and high efficiency of the method for solving both general problems (recognition of chemical structures, chemical database management systems) and specific tasks (generation of molecular graphs, etc.) in chemical informatics. Dedicated to Academician N. S. Zefirov on the occasion of his 70th birthday. Published in Russian in Izvestiya Akademii Nauk. Seriya Khimicheskaya, No. 9, pp. 2166–2176, September, 2005.

12.
13.
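Building on highly selective topological indices for molecules and atoms, a highly selective topological index for chemical bonds, bATID, is proposed. The uniqueness of bATID was tested on a virtual data set and a real data set of more than 3 million chemical bonds; no degeneracy was found, which indicates that bATID discriminates chemical bonds well. bATID was further applied to the recognition of chemical bonds in organic compounds, with good results. For example, bATID identifies the 90 chemical bonds of fullerene C60 as 30 6:6 bonds and 60 5:6 bonds. The study also shows that bond recognition with bATID can be applied to the automatic assignment of chiral centers and to the vertex permutations used in the exhaustive generation of automorphism groups.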
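The C60 bond classes quoted in this abstract (90 bonds: 30 of type 6:6 and 60 of type 5:6) can be checked with simple counting. The sketch below does not compute bATID; it only assumes the standard facts that every carbon in C60 has three bonded neighbours, that the cage has 12 pentagons and 20 hexagons, and that no two pentagons share an edge. The function name is hypothetical.

```python
def c60_bond_classes(atoms=60, pentagons=12, hexagons=20):
    """Count 5:6 and 6:6 bonds in C60 from face counts alone."""
    total_bonds = atoms * 3 // 2          # each atom has 3 bonds, each bond joins 2 atoms
    bonds_56 = pentagons * 5              # every pentagon edge borders a hexagon
    hexagon_sides = hexagons * 6          # edge slots contributed by hexagons
    bonds_66 = (hexagon_sides - bonds_56) // 2  # remaining slots pair up between two hexagons
    return total_bonds, bonds_56, bonds_66
```

The two classes add up to the total bond count, which is consistent with the split reported in the abstract.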

14.
15.
16.
Target identification is a critical step following the discovery of small molecules that elicit a biological phenotype. The present work seeks to provide an in silico correlate of experimental target fishing technologies in order to rapidly fish out potential targets for compounds on the basis of chemical structure alone. A multiple-category Laplacian-modified naïve Bayesian model was trained on extended-connectivity fingerprints of compounds from 964 target classes in the WOMBAT (World Of Molecular BioAcTivity) chemogenomics database. The model was employed to predict the top three most likely protein targets for all MDDR (MDL Drug Database Report) database compounds. On average, the correct target was found 77% of the time for compounds from 10 MDDR activity classes with known targets. For MDDR compounds annotated with only therapeutic or generic activities such as "antineoplastic", "kinase inhibitor", or "anti-inflammatory", the model was able to systematically deconvolute the generic activities to specific targets associated with the therapeutic effect. Examples of successful deconvolution are given, demonstrating the usefulness of the tool for improving knowledge in chemogenomics databases and for predicting new targets for orphan compounds.
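A minimal sketch of a multiple-category Laplacian-modified naïve Bayesian scorer is given below. It assumes the commonly cited form of the Laplacian correction, in which a feature seen in N_f training compounds, A_f of them belonging to the class, contributes log((A_f + 1) / (N_f * P + 1)), where P is the class base rate. Fingerprints are represented as plain sets of integer feature IDs standing in for extended-connectivity features; the function names and the toy data are hypothetical, and this is not the authors' implementation.

```python
import math
from collections import defaultdict

def train(samples):
    """samples: list of (feature_set, target_label) pairs.
    Returns per-(feature, class) log weights and the sorted class labels."""
    n_total = len(samples)
    class_count = defaultdict(int)
    feat_count = defaultdict(int)
    feat_class_count = defaultdict(int)
    for feats, label in samples:
        class_count[label] += 1
        for f in feats:
            feat_count[f] += 1
            feat_class_count[(f, label)] += 1
    weights = {}
    for label, n_class in class_count.items():
        base_rate = n_class / n_total
        for f, n_f in feat_count.items():
            a_f = feat_class_count.get((f, label), 0)
            # Laplacian-corrected relative estimate, then log so scores add up.
            weights[(f, label)] = math.log((a_f + 1) / (n_f * base_rate + 1))
    return weights, sorted(class_count)

def predict_top(weights, labels, feats, k=3):
    """Rank classes by summed feature weights; unseen features contribute 0."""
    scores = {lab: sum(weights.get((f, lab), 0.0) for f in feats) for lab in labels}
    return sorted(labels, key=lambda lab: scores[lab], reverse=True)[:k]

# Toy training set with made-up feature IDs and target classes.
toy = [
    ({1, 2}, "kinase"),
    ({1, 3}, "kinase"),
    ({4, 5}, "GPCR"),
    ({4, 6}, "GPCR"),
    ({7}, "protease"),
]
weights, labels = train(toy)
```

Calling `predict_top(weights, labels, {1, 2})` ranks all classes for a query fingerprint, and the default `k=3` mirrors the top-three prediction described in the abstract.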

17.
18.
Chemical fingerprints are used to represent chemical molecules by recording the presence or absence, or by counting the number of occurrences, of particular features or substructures, such as labeled paths in the 2D graph of bonds, of the corresponding molecule. These fingerprint vectors are used to search large databases of small molecules, currently containing millions of entries, using various similarity measures, such as the Tanimoto or Tversky's measures and their variants. Here, we derive simple bounds on these similarity measures and show how these bounds can be used to considerably reduce the subset of molecules that need to be searched. We consider both the case of single-molecule and multiple-molecule queries, as well as queries based on fixed similarity thresholds or aimed at retrieving the top K hits. We study the speedup as a function of query size and distribution, fingerprint length, similarity threshold, and database size |D| and derive analytical formulas that are in excellent agreement with empirical values. The theoretical considerations and experiments show that this approach can provide linear speedups of one or more orders of magnitude in the case of searches with a fixed threshold, and achieve sublinear speedups in the range of O(|D|^0.6) for the top K hits in current large databases. This pruning approach yields subsecond search times across the 5 million compounds in the ChemDB database, without any loss of accuracy.
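One simple bound of the kind described here follows from set sizes alone: for binary fingerprints with a and b bits set, the Tanimoto similarity cannot exceed min(a, b) / max(a, b). The sketch below uses that bound to skip whole groups of database fingerprints in a fixed-threshold search. It is an illustration under stated assumptions, not the authors' code: fingerprints are modelled as Python sets of bit positions, the database is a small dictionary with made-up names, and only the single-query, fixed-threshold case is covered.

```python
from collections import defaultdict

def tanimoto(a, b):
    """Tanimoto similarity of two sets of 'on' bit positions."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0

def build_index(db):
    """Group fingerprint names by their number of set bits."""
    index = defaultdict(list)
    for name, fp in db.items():
        index[len(fp)].append(name)
    return index

def search(query, db, index, threshold):
    """Return (hits sorted by similarity, number of fingerprints compared)."""
    a = len(query)
    hits, compared = [], 0
    for count, names in index.items():
        largest = max(a, count)
        # Upper bound on the similarity for any fingerprint with `count` bits.
        if largest == 0 or min(a, count) / largest < threshold:
            continue  # the whole bin can be skipped without losing hits
        for name in names:
            compared += 1
            score = tanimoto(query, db[name])
            if score >= threshold:
                hits.append((name, score))
    hits.sort(key=lambda h: -h[1])
    return hits, compared

# Hypothetical four-molecule database.
db = {
    "m1": frozenset({1, 2, 3, 4}),
    "m2": frozenset({1, 2, 3}),
    "m3": frozenset({1}),
    "m4": frozenset(range(1, 20)),
}
index = build_index(db)
hits, compared = search(frozenset({1, 2, 3, 4}), db, index, threshold=0.7)
```

Because the bound is an upper limit on the true similarity, skipping a bin never discards a fingerprint that would have passed the threshold, which is why the pruning causes no loss of accuracy.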

19.
A theoretical approach is developed to pre-select individual polycyclic aromatic hydrocarbons (PAHs) as possible carriers of the diffuse interstellar bands (DIBs). In this approach, a computer program is used to enumerate all PAH molecules with up to a specific number of fused benzene rings. Fast quantum chemical calculations are then employed to calculate the electronic transition energies, oscillator strengths, and rotational constants of these molecules. An electronic database of all PAHs with up to any specific number of benzene rings can be constructed this way. Comparison of the electronic transition energies, oscillator strengths, and rotational band contours of all PAHs in the database with astronomical spectra allows one to constrain the identities of individual PAHs as possible carriers of some of the intense narrow DIBs. Using the current database containing up to 10 benzene rings we have pre-selected 8 closed-shell PAHs as possible carriers of the famous λ6614 DIB.

20.
31P-NMR chemical shifts and coupling constants of nine inorganic phosphorus compounds composed of different structural units or oxidation numbers P(V), P(IV), P(III), and P(I) were measured in the pH range 3 to 11. A concise map of NMR data providing the pH dependence of the chemical shift (δ-pH map) was set up to be used for identifying phosphorus compounds under varying pH conditions. Chemical shifts of monofluorophosphate, as well as most phosphorus compounds of oxidation numbers 5 and 4, were greatly dependent on pH, in contrast to the small or negligible pH dependence of phosphorus compounds of oxidation numbers 1 and 3. Monofluorophosphate gave the parameters δ = +1.3 ± 0.2 ppm and 1J(PF) = 870 ± 0.2 Hz, which remained unchanged at pH > 6 but varied at pH < 6. The practical use of the δ-pH map was shown with a few kinetic experiments in which monofluorophosphate was enzymatically hydrolyzed by alkaline phosphatase (EC 3.1.3.1) at pH 7.2 and non-enzymatically at pH 3.

