Similar literature
 20 similar articles found (search time: 31 ms)
1.
2.
In most cheminformatics workflows, chemical information is stored in files which provide the necessary data for subsequent calculations. The correct interpretation of the file formats is an important prerequisite to obtain meaningful results. Consistent reading of molecules from files, however, is not an easy task. Each file format implicitly represents an underlying chemical model, which has to be taken into consideration when the input data is processed. Additionally, many data sources contain invalid molecules. These have to be identified and either corrected or discarded. We present the chemical file format converter NAOMI, which provides efficient procedures for reliable handling of molecules from the common chemical file formats SDF, MOL2, and SMILES. These procedures are based on a consistent chemical model which has been designed for the appropriate representation of molecules relevant in the context of drug discovery. NAOMI's functionality is tested by round robin file IO exercises with public data sets, which we believe should become a standard test for every cheminformatics tool.
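A round-trip check of the kind this abstract recommends can be sketched in a few lines. The sketch below is not NAOMI and does not interpret chemistry: it only splits an SD file into records, reads the data items that follow each molblock, writes them back, and compares the result with the input. The function names and the one-atom sample record are hypothetical and exist only for illustration.

```python
def parse_sd_records(text):
    """Split SD file text into a list of (molblock, fields) pairs."""
    records = []
    for chunk in text.split("$$$$\n"):
        if not chunk.strip():
            continue
        lines = chunk.split("\n")
        # The molblock ends at the "M  END" line; data items follow it.
        end = lines.index("M  END")
        molblock = "\n".join(lines[: end + 1])
        fields, name, value = {}, None, []
        for line in lines[end + 1:]:
            if line.startswith(">"):
                # Data header, e.g. ">  <NAME>": take the text inside <...>.
                name = line[line.index("<") + 1 : line.rindex(">")]
                value = []
            elif line == "" and name is not None:
                # A blank line terminates the current data item.
                fields[name] = "\n".join(value)
                name = None
            elif name is not None:
                value.append(line)
        records.append((molblock, fields))
    return records


def write_sd_records(records):
    """Serialize (molblock, fields) pairs back to SD file text."""
    out = []
    for molblock, fields in records:
        out.append(molblock)
        for name, value in fields.items():
            out.extend([f">  <{name}>", value, ""])
        out.append("$$$$")
    return "\n".join(out) + "\n"


# Hypothetical one-record sample (a single oxygen atom), for illustration only.
SAMPLE = (
    "water\n"
    "  sketch\n"
    "\n"
    "  1  0  0  0  0  0  0  0  0  0999 V2000\n"
    "    0.0000    0.0000    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0\n"
    "M  END\n"
    ">  <NAME>\n"
    "water\n"
    "\n"
    ">  <SOURCE>\n"
    "toy example\n"
    "\n"
    "$$$$\n"
)

records = parse_sd_records(SAMPLE)
round_trip_ok = write_sd_records(records) == SAMPLE
```

A fuller test in the spirit of the abstract would chain several formats (for example SDF to MOL2 to SMILES and back) and compare molecules rather than text, which requires a chemistry-aware toolkit.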

3.
Computer-aided drug discovery (CADD) started at Albany Molecular Research, Inc. (AMRI) in 1997. Over nearly 20 years, the role of cheminformatics and computational chemistry has grown throughout the pharmaceutical industry and at AMRI. This paper will describe the infrastructure and roles of CADD throughout drug discovery and some of the lessons learned regarding the success of several methods. Various contributions provided by computational chemistry and cheminformatics in chemical library design, hit triage, hit-to-lead and lead optimization are discussed. Some frequently used computational chemistry techniques are described. The ways in which they may contribute to discovery projects are presented based on a few examples from recent publications.

4.
The recent advances in laboratory technologies have resulted in a wealth of chemical and biological data. The rapid proliferation of a vast amount of data has led to a set of cheminformatics and bioinformatics applications that manipulate dynamic, heterogeneous, and massive data. An example of such an application in the pharmaceutical industry is the computational process involved in the early discovery of lead drug candidates for a given target disease. In this paper, an efficient implementation of a drug candidate database is presented and evaluated. This study shows that high performance data access can be achieved through proper choices of data representation, database schema design, and parallel processing techniques.

5.
Since 2009 the Royal Society of Chemistry (RSC) has been delivering access to chemistry data and cheminformatics tools via the ChemSpider database and has garnered a significant community following in terms of usage and contribution to the platform. ChemSpider has focused only on those chemical entities that can be represented as molecular connection tables or, to be more specific, on structures from which an InChI can be generated. As a structure-centric hub, ChemSpider is built around the molecular structure, with other data and links being associated with this structure. As a result, the platform has been limited in terms of the types of data that can be managed and the flexibility of its searches, and it is constrained by the data model. New technologies and approaches, specifically a shift from relational to NoSQL databases and the growing importance of the semantic web, have motivated RSC to rearchitect and create a more generic data repository utilizing these new technologies. This article will provide an overview of our activities in delivering data sharing platforms for the chemistry community, including the development of the new data repository expanding into more extensive domains of chemistry data.

6.
Most data structures used to represent molecular entities for cheminformatics are underspecified for purposes of representing nonorganic chemical species. Two extensions are proposed: allowing bond orders of 0 and adding an atom property to control the number of inferred attached hydrogen atoms. The case for these two extensions is made by demonstrating the effective representation of a number of unconventional bonding types that cannot be effectively represented by data structures currently in common use. A set of enhancements to the industry standard MDL CTfile format is proposed, which includes a backward compatibility mechanism to maximize interpretability by software that has not been updated to make use of the extensions.
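The two proposed extensions can be illustrated with a small in-memory data structure. This is a minimal sketch under stated assumptions, not the CTfile encoding proposed in the article: the class names, the simplified valence table, and the ammonia borane example are all chosen here for illustration. A bond of order 0 records connectivity without consuming valence, and an optional per-atom hydrogen count overrides inference.

```python
from dataclasses import dataclass, field
from typing import Optional

# Simplified default valences for a few elements (illustrative, not exhaustive).
DEFAULT_VALENCE = {"H": 1, "B": 3, "C": 4, "N": 3, "O": 2, "F": 1, "Cl": 1}


@dataclass
class Atom:
    element: str
    # None means "infer hydrogens from valence"; an integer overrides inference.
    explicit_h: Optional[int] = None


@dataclass
class Bond:
    a: int      # index of the first atom
    b: int      # index of the second atom
    order: int  # 0 is allowed: the atoms are connected but no valence is used


@dataclass
class Molecule:
    atoms: list = field(default_factory=list)
    bonds: list = field(default_factory=list)

    def hydrogen_count(self, idx):
        """Return the number of hydrogens attached to atom idx."""
        atom = self.atoms[idx]
        if atom.explicit_h is not None:
            return atom.explicit_h
        valence = DEFAULT_VALENCE.get(atom.element)
        if valence is None:
            return 0  # no inference for elements without a default valence
        used = sum(b.order for b in self.bonds if idx in (b.a, b.b))
        return max(0, valence - used)


# Ammonia borane, H3N-BH3: with a zero-order N-B bond both atoms keep 3 hydrogens.
zero_order = Molecule([Atom("N"), Atom("B")], [Bond(0, 1, 0)])

# With a conventional single bond and no charges, inference loses one H per atom.
single_bond = Molecule([Atom("N"), Atom("B")], [Bond(0, 1, 1)])

# An explicit hydrogen count overrides inference, e.g. a CH2 carbene centre.
carbene = Molecule([Atom("C", explicit_h=2)], [])
```

Comparing `zero_order` with `single_bond` shows why the extension matters: the conventional representation needs formal charges to recover the correct hydrogen count, whereas the zero-order bond does not.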

7.
(1) Background: The research aims to find new treatments for neurodegenerative diseases, in particular, Alzheimer’s disease. (2) Methods: This article presents a bioinformatics and pathology study of new Schiff bases, (EZ)-N′-benzylidene-(2RS)-2-(6-chloro-9H-carbazol-2-yl)propanehydrazide derivatives, and aims to evaluate the drug-like, pharmacokinetic, pharmacodynamic and pharmacogenomic properties, as well as to predict the binding to therapeutic targets by applying bioinformatics, cheminformatics and computational pharmacological methods. (3) Results: We obtained these Schiff bases by condensing (2RS)-2-(6-chloro-9H-carbazol-2-yl)propanehydrazide with aromatic aldehydes, using the advantages of microwave irradiation. The newly synthesized compounds were characterized spectrally, using FT-IR and NMR spectroscopy, which confirmed their structure. Using bioinformatics tools, we found that all new compounds have drug-like features and may be proposed as potential neuropsychiatric drugs. (4) Conclusions: Using bioinformatics tools, we determined that the new compound 1e has a high potential to be a good candidate for the treatment of neurodegenerative disorders.

8.
Integration of flexible data-analysis tools with cheminformatics methods is a prerequisite for successful identification and validation of “hits” in high-throughput screening (HTS) campaigns. We have designed, developed, and implemented a suite of robust yet flexible cheminformatics tools to support HTS activities at the Broad Institute, three of which are described herein. The “hit-calling” tool allows a researcher to set a hit threshold that can be varied during downstream analysis. The results from the hit-calling exercise are reported to a database for record keeping and further data analysis. The “cherry-picking” tool enables creation of an optimized list of hits for confirmatory and follow-up assays from an HTS hit list. This tool allows filtering by computed chemical property and by substructure. In addition, similarity searches can be performed on hits of interest and sets of related compounds can be selected. The third tool, an “S/SAR viewer,” has been designed specifically for the Broad Institute’s diversity-oriented synthesis (DOS) collection. The compounds in this collection are rich in chiral centers, and the full complement of all possible stereoisomers of a given compound is present in the collection. The S/SAR viewer allows rapid identification of both structure/activity relationships and stereo-structure/activity relationships present in HTS data from the DOS collection. Together, these tools enable the prioritization and analysis of hits from diverse compound collections, and enable informed decisions for follow-up biology and chemistry efforts.
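The idea of a hit threshold that stays adjustable during downstream analysis can be sketched as follows. The abstract does not say how the Broad Institute tool scores compounds, so the z-score normalization, the function names, and the toy readouts below are assumptions made only for illustration. The point is that scores are computed once and stored, while the threshold is applied afterwards and can be changed freely.

```python
from statistics import mean, stdev


def z_scores(values):
    """Standardize raw readouts to z-scores (sample standard deviation)."""
    mu, sigma = mean(values), stdev(values)
    if sigma == 0:
        raise ValueError("all readouts are identical; z-scores are undefined")
    return [(v - mu) / sigma for v in values]


def call_hits(compound_ids, scores, threshold):
    """Return the IDs whose score meets or exceeds the threshold."""
    return [cid for cid, s in zip(compound_ids, scores) if s >= threshold]


# Toy readouts, invented for illustration only.
ids = ["A", "B", "C", "D", "E"]
scores = z_scores([1.0, 2.0, 3.0, 4.0, 10.0])

# The same stored scores can be re-thresholded without re-running the assay.
strict_hits = call_hits(ids, scores, threshold=1.5)
relaxed_hits = call_hits(ids, scores, threshold=-0.3)
```

Keeping the scores separate from the hit decision is what makes it possible to record both in a database and revisit the cut-off later.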

9.
Peptide research has increased in recent years due to the applications of peptides as biomarkers, therapeutic alternatives or antigenic sub-units in vaccines. The implementation of computational resources has facilitated the identification of novel sequences, the prediction of properties, and the modelling of structures. However, there is still a lack of open source protocols that enable their straightforward analysis. Here, we present PepFun, a compilation of bioinformatics and cheminformatics functionalities that are easy to implement and customize for studying peptides at different levels: sequence, structure and their interactions with proteins. PepFun enables calculating multiple characteristics for massive sets of peptide sequences, and obtaining different structural observables derived from protein-peptide complexes. In addition, random or guided library design of peptide sequences can be customized for screening campaigns. The package has been written in Python, based on built-in functions and methods available in the open source projects BioPython and RDKit. We present two tutorials where we tested peptide binders of the MHC class II and the Granzyme B protease.

10.
High throughput screening (HTS) campaigns, where laboratory automation is used to expose biological targets to large numbers of materials from corporate compound collections, have become commonplace within the lead generation phase of pharmaceutical discovery. Advances in genomics and related fields have afforded a wealth of targets such that screening facilities at larger organizations routinely execute over 100 hit-finding campaigns per year. Often, 10^5 or 10^6 molecules will be tested within a campaign/cycle to locate a large number of actives requiring follow-up investigation. Due to resource constraints at every organization, traditional chemistry methods for validating hits and developing structure-activity relationships (SAR) become untenable when challenged with hundreds of hits in multiple chemical families per target. To compound the issue, comparison and prioritization of hits versus multiple screens, or physical chemical property criteria, is made more complex by the informatics issues associated with handling large data sets. This article describes a collaborative research project designed to simultaneously leverage the medicinal chemistry and drug development expertise of the Novartis Institutes for Biomedical Research Inc. (NIBRI) and ArQule Inc.'s high throughput library design, synthesis and purification capabilities. The work processes developed by the team to efficiently design, prepare, purify, assess and prioritize multiple chemical classes that were identified during high throughput screening, cheminformatics and molecular modeling activities will be detailed.

11.
Computer-assisted chemical structure searching plays a critical role for efficient structure screening in cheminformatics. We designed a high-performance chemical structure & data search engine called DCAIKU, built on CouchDB and ElasticSearch engines. DCAIKU converts the chemical structure similarity search problem into a general text search problem to utilize off-the-shelf full-text search engines. DCAIKU also supports flexible document structures and heterogeneous datasets with the help of a schema-less document database. Our evaluations show that DCAIKU can handle both keyword search and structural search against millions of records with both high accuracy and low latency. We expect that DCAIKU will lay the foundation towards large-scale and cost-effective structural search in materials science and chemistry research.
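Recasting similarity search as text search can be sketched with a toy inverted index. The abstract does not describe how DCAIKU tokenizes structures, so this sketch makes a loud simplifying assumption: overlapping character trigrams of a SMILES string stand in for fingerprint features. These trigrams are not chemically meaningful, and a real system would index substructure fingerprint bits instead. The class and function names are hypothetical.

```python
from collections import defaultdict


def tokens(smiles, n=3):
    """Return the set of overlapping character n-grams of a SMILES string."""
    if len(smiles) < n:
        return {smiles}
    return {smiles[i : i + n] for i in range(len(smiles) - n + 1)}


def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity of two token sets."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


class TokenIndex:
    """Toy inverted index: token -> IDs of the documents containing it."""

    def __init__(self):
        self.docs = {}
        self.postings = defaultdict(set)

    def add(self, doc_id, smiles):
        toks = tokens(smiles)
        self.docs[doc_id] = toks
        for t in toks:
            self.postings[t].add(doc_id)

    def search(self, smiles, min_score=0.0):
        """Return (score, doc_id) pairs sharing a token with the query, best first."""
        query = tokens(smiles)
        candidates = set()
        for t in query:
            candidates |= self.postings.get(t, set())
        scored = [(tanimoto(query, self.docs[d]), d) for d in candidates]
        return sorted(((s, d) for s, d in scored if s >= min_score), reverse=True)


index = TokenIndex()
index.add("ethanol", "CCO")
index.add("propanol", "CCCO")
index.add("benzene", "c1ccccc1")

results = index.search("CCCO")
```

The inverted index is the same structure a full-text engine maintains for words, which is why token-based fingerprints can be delegated to an off-the-shelf search engine.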

12.
Chemoinformatics: a new field with a long tradition   (total citations: 2; self-citations: 0; citations by others: 2)
Chemoinformatics is the application of informatics methods to solve chemical problems. Although this term was introduced only a few years ago, this field has a long history with its roots going back more than 40 years. Work on chemical structure representation and searching, quantitative structure–activity relationships, chemometrics, molecular modeling as well as computer-assisted structure elucidation and synthesis design was initiated in the 1960s. These different origins have now merged into a discipline of its own that is in full bloom. All areas of chemistry from analytical chemistry to drug design can benefit from chemoinformatics methods. And there are still many challenging chemical problems waiting for solutions through the further development of chemoinformatics.

13.
The “Cheminformatics aspects of high throughput screening (HTS): from robots to models” symposium was part of the computers in chemistry technical program at the American Chemical Society National Meeting in Denver, Colorado during the fall of 2011. This symposium brought together researchers from high throughput screening centers and molecular modelers from academia and industry to discuss the integration of currently available high throughput screening data and assays with computational analysis. The topics discussed at this symposium covered the data-infrastructure at various academic, hospital, and National Institutes of Health-funded high throughput screening centers, the cheminformatics and molecular modeling methods used in real world examples to guide screening and hit-finding, and how academic and non-profit organizations can benefit from current high throughput screening cheminformatics resources. Specifically, this article also covers the remarks and discussions in the open panel discussion of the symposium and summarizes the following talks on “Accurate Kinase virtual screening: biochemical, cellular and selectivity”, “Selective, privileged and promiscuous chemical patterns in high-throughput screening” and “Visualizing and exploring relationships among HTS hits using network graphs”.

14.
This paper proposes a method for molecular activity prediction in QSAR studies using ensembles of classifiers constructed by means of two supervised subspace projection methods, namely nonparametric discriminant analysis (NDA) and hybrid discriminant analysis (HDA). We studied the performance of the proposed ensembles compared to classical ensemble methods using four molecular datasets and eight different models for the representation of the molecular structure. Using several measures and statistical tests for classifier comparison, we observe that our proposal improves the classification results with respect to classical ensemble methods. Therefore, we show that ensembles constructed using supervised subspace projections offer an effective way of creating classifiers in cheminformatics.

15.
Biomass is an abundant source of chemically diverse macromolecules, including polysaccharides, polypeptides, and polyaromatics. Many of these biological polymers (biopolymers) are highly evolved for specific functions through optimized chain length, functionalization, and monomer sequence. As biopolymers are a chemical resource, much current effort is focused on the breakdown of these molecules into fuels or platform chemicals. However, there is growing interest in using biopolymers directly to create functional materials. This Minireview uses recent examples to show how biopolymers are providing new directions in the synthesis of nanostructured materials.

16.
Even though NMR has found countless applications in the field of small molecule characterization, there is no standard file format available for the NMR data relevant to structure characterization of small molecules. A new format is therefore introduced to associate the NMR parameters extracted from 1D and 2D spectra of organic compounds to the proposed chemical structure. These NMR parameters, which we shall call NMReDATA (for nuclear magnetic resonance extracted data), include chemical shift values, signal integrals, intensities, multiplicities, scalar coupling constants, lists of 2D correlations, relaxation times, and diffusion rates. The file format is an extension of the existing Structure Data Format, which is compatible with the commonly used MOL format. The association of an NMReDATA file with the raw and spectral data from which it originates constitutes an NMR record. This format is easily readable by humans and computers and provides a simple and efficient way for disseminating results of structural chemistry investigations, allowing automatic verification of published results, and for assisting the constitution of highly needed open‐source structural databases.

17.
Selecting a suitable solvent is essential for the crystallization of pharmaceuticals. Based on the chemical structures of 6397 compounds and the 15 single solvents that were used to obtain their single crystals, correlations between the molecular characteristics and the solvents have been investigated by cheminformatics methods. Decision-tree and Bayesian-probability methods have been applied to build classification models. These two models are complementary in character in the present case. It has been proven that the prediction of the solvent rankings for particular compounds by use of the classification models is satisfactory from the practical point of view. The present study has demonstrated that cheminformatics methods could greatly help rational crystallization of small organic molecules such as pharmaceuticals.

18.
The sequencing of biopolymers such as proteins and DNA is among the most significant scientific achievements of the 20th century. Indeed, modern chemical methods for sequence analysis allow reading and understanding the codes of life. Thus, sequencing methods currently play a major role in applications as diverse as genomics, gene therapy, biotechnology, and data storage. However, in terms of fundamental science, sequencing is not really a question of molecular biology but rather a more general topic in macromolecular chemistry. Broadly speaking, it can be defined as the analysis of comonomer sequences in copolymers. However, relatively different approaches have been used in the past to study monomer sequences in biological and manmade polymers. Yet, these “cultural” differences are slowly fading away with the recent development of synthetic sequence‐controlled polymers. In this context, the aim of this Minireview is to present an overview of the tools that are currently available for sequence analysis in macromolecular science.

19.
For over a decade, cheminformatics has contributed to a wide array of scientific tasks from analytical chemistry and biochemistry to pharmacology and drug discovery; and although its contributions to decision making are recognized, the challenge is how it would contribute to faster development of novel, better products. Here we address the future of cheminformatics with primary focus on innovation. Cheminformatics developers often need to choose between “mainstream” (i.e., accepted, expected) and novel, leading-edge tools, with an increasing trend for open science. Possible futures for cheminformatics include the worst-case scenario (lack of funding, no creative usage), as well as the best-case scenario (complete integration, from systems biology to virtual physiology). As “-omics” technologies advance, and computer hardware improves, compounds will no longer be profiled only at the molecular level, but also in terms of genetic and clinical effects. Among potentially novel tools, we anticipate machine learning models based on free text processing, an increased performance in environmental cheminformatics, significant decision-making support, as well as the emergence of robot scientists conducting automated drug discovery research. Furthermore, cheminformatics is anticipated to expand the frontiers of knowledge and evolve in an open-ended, extensible manner, allowing us to explore multiple research scenarios in order to avoid the epistemological “local information minimum trap”.

20.
The topic of this article is the development and the present state of the art of computer chemistry, the computer-assisted solution of chemical problems. Initially the problems in computer chemistry were confined to structure elucidation on the basis of spectroscopic data, then programs for synthesis design based on libraries of reaction data for relatively narrow classes of target compounds were developed, and now computer programs for the solution of a great variety of chemical problems are available or are under development. Previously it was an achievement when any solution of a chemical problem could be generated by computer assistance. Today, the main task is the efficient, transparent, and non-arbitrary selection of meaningful results from the immense set of potential solutions, which may also contain innovative proposals. Chemistry has two aspects, constitutional chemistry and stereochemistry, which are interrelated, but still require different approaches. As a result, about twenty years ago, an algebraic model of the logical structure of chemistry was presented that consisted of two parts: the constitution-oriented algebra of be- and r-matrices, and the theory of the stereochemistry of the chemical identity group. New chemical definitions, concepts, and perspectives are characteristic of this logic-oriented model, as well as the direct mathematical representation of chemical processes. This model enables the implementation of formal reaction generators that can produce conceivable solutions to chemical problems (including unprecedented solutions) without detailed empirical chemical information. New formal selection procedures for computer-generated chemical information are also possible through the above model. It is expedient to combine these with interactive methods of selection. In this review, the Munich project is presented and discussed in detail. It encompasses the further development and implementation of the mathematical model of the logical structure of chemistry as well as the experimental verification of the computer-generated results. The article concludes with a review of new reactions, reagents, and reaction mechanisms that have been found with the PC-programs IGOR and RAIN.


Copyright © 北京勤云科技发展有限公司 (Beijing Qinyun Technology Development Co., Ltd.)  京ICP备09084417号