Similar articles
Found 20 similar articles; search took 31 ms.
1.
2.
Collecting, organizing, and reviewing chemical information associated with screening hits is time-consuming manual work. The task depends heavily on the individual, and human error may result in missed leads or wasted resources. To overcome these hurdles, we have developed a decision support system, Hits Analysis Database (HAD). HAD is a software tool that automatically generates an ISIS database file containing compound structures, biological activities, calculated properties such as clogP, hazard fragment labels, structure classifications, etc. All data are processed by available software and packed into a single SD file. In addition to search capabilities, HAD provides an overview of structural classes and associated activity statistics. Chemical structures can be organized by maximum common substructure clustering. The ease of use and customized features make HAD a chief tool in lead selection processes.

3.
4.
This paper describes ArQiologist, a Web-based tool that integrates chemical, analytical, biological, and computational data to facilitate decision support for lead optimization at ArQule. It features an easy-to-use graphical query builder that allows queries to be saved, reused, and shared by researchers. Query results can be viewed with built-in data browsers or exported with structures to external applications such as Microsoft Excel or Spotfire for further analysis.

5.
The scientific literature is an important source of experimental and chemical structure data. Very often these data have been harvested into data collections of varying size, leaving data quality and curation issues on the shoulders of users. The current research presents a systematic and reproducible workflow for collecting series of data points from the scientific literature and assembling a database that is suitable for high-quality modelling and decision support. The quality assurance aspect of the workflow is concerned with the curation of both chemical structures and associated toxicity values at (1) the level of a single data point and (2) the level of a collection of data points. The assembly of a database employs a novel “timeline” approach. The workflow is implemented as a software solution and its applicability is demonstrated on the example of the Tetrahymena pyriformis acute aquatic toxicity endpoint. A literature collection of 86 primary publications for T. pyriformis was found to contain 2,072 chemical compounds and 2,498 unique toxicity values, which divide into 2,440 numerical and 58 textual values. Every chemical compound was assigned a preferred toxicity value. Examples of the most common chemical and toxicological data curation scenarios are discussed.

6.
7.
The chromosome aberration test is frequently used to assess the potential of chemicals and drugs to elicit genetic damage in mammalian cells in vitro. Due to the limitations of experimental genotoxicity testing in early drug discovery phases, a model that predicts the outcome of the chromosome aberration test with high accuracy and provides guidance for structure optimization is urgently needed. In this paper, we describe a machine learning approach for predicting the outcome of this assay based on the structure of the investigated compound. The novelty of the proposed method consists in combining a maximum common subgraph kernel for measuring the similarity of two chemical graphs with the potential support vector machine for classification. In contrast to standard support vector machine classifiers, the proposed approach does not provide a black box model but rather allows the visualization of structural elements with a high positive or negative contribution to the class decision. In order to compare the performance of different methods for predicting the outcome of the chromosome aberration test, we compiled a large data set exhibiting high quality, reliability, and consistency from public sources and configured a fixed cross-validation protocol, which we make publicly available. In a comparison with standard methods currently used in the pharmaceutical industry as well as with other graph kernel approaches, the proposed method achieved significantly better performance.

8.
9.
We describe a method for modeling chemical mutagenicity in terms of simple rules based on molecular features. A classification model was built using a rule-based ensemble method called RuleFit, developed by Friedman and Popescu. We show that its performance compares favorably with literature methods. Performance was measured through the use of cross-validation and testing on external test sets. All data sets used are publicly available. The method automatically generated transparent rules in terms of molecular structure that agree well with known toxicology. While we have focused on chemical mutagenicity in demonstrating this method, we anticipate that it may be more generally useful in modeling other molecular properties such as other types of chemical toxicity.
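For orientation, the general form of a RuleFit model as published by Friedman and Popescu can be written as below. The symbols are notation chosen here and do not come from the abstract: each r_k is a binary rule (a conjunction of conditions on molecular features, typically extracted from a tree ensemble), each l_j is an optional linear term, and the coefficients are fitted with a lasso penalty so that only a small set of rules remains. Whether the authors used linear terms in addition to rules is not stated.

```latex
F(x) \;=\; a_0 \;+\; \sum_{k=1}^{K} a_k\, r_k(x) \;+\; \sum_{j=1}^{p} b_j\, l_j(x_j),
\qquad r_k(x) \in \{0, 1\}
```

Because each retained rule carries its own coefficient, the sign and size of a_k indicate whether the corresponding structural pattern pushes the prediction toward or away from the mutagenic class, which is what makes the rules readable.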

10.
11.
12.
13.
14.
15.
The aim of computational molecular design is the identification of promising hypothetical molecules with a predefined set of desired properties. We address the issue of accelerating materials discovery with state-of-the-art machine learning techniques. The method involves two different types of prediction: forward and backward. The objective of the forward prediction is to create a set of machine learning models on various properties of a given molecule. Inverting the trained forward models through Bayes’ law, we derive a posterior distribution for the backward prediction, which is conditioned on a desired property requirement. By exploring high-probability regions of the posterior with a sequential Monte Carlo technique, molecules that exhibit the desired properties can be created computationally. One major difficulty in the computational creation of molecules is excluding chemically unfavorable structures. To circumvent this issue, we derive a chemical language model that acquires commonly occurring patterns of chemical fragments through natural language processing of ASCII strings of existing compounds, which follow the SMILES chemical language notation. In the backward prediction, the trained language model is used to refine chemical strings such that the properties of the resulting structures fall within the desired property region while chemically unfavorable structures are successfully removed. The present method is demonstrated through the design of small organic molecules with property requirements on the HOMO-LUMO gap and internal energy. The R package iqspr is available at the CRAN repository.
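The inversion through Bayes’ law described in this abstract can be sketched as follows. The notation is an assumption made here for illustration: S denotes a chemical structure (for example a SMILES string), Y its property vector, and U the desired property region. Reading the chemical language model as the prior p(S) is one plausible interpretation of the abstract, not a detail it confirms.

```latex
p(S \mid Y \in U)
\;=\; \frac{p(Y \in U \mid S)\, p(S)}{\sum_{S'} p(Y \in U \mid S')\, p(S')}
\;\propto\; p(Y \in U \mid S)\, p(S)
```

Here p(Y ∈ U | S) is supplied by the trained forward models. The normalizing sum over all structures is intractable, which is consistent with the abstract's use of a sequential Monte Carlo technique to explore high-probability regions of the posterior rather than computing it exactly.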

16.
Hazard assessments of chemicals have been limited by the availability of test data and the time needed to evaluate the test data. While available data may be inadequate for the majority of industrial chemicals, the body of existing knowledge for most hazards is large enough to permit reliable estimates to be made for untested chemicals without additional animal testing. We provide a summary of the growing use by regulatory agencies of the chemical categories approach, which groups chemicals based on their similar toxicological behaviour and fills in the data gaps in animal test data such as genotoxicity and aquatic toxicity. Although the categories approach may be distinguished from the use of quantitative structure–activity relationships (QSARs) for specific hazard endpoints, robust chemical categories are founded on quantifying the chemical structure with parameters that control chemical behaviour in conventional hazard assessment. The dissemination of the QSAR Application Toolbox by the Organisation for Economic Cooperation and Development (OECD) is an effort to facilitate the use of the categories approach and reduce the need for additional animal testing.
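The data gap filling step of the categories approach can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the function name `fill_data_gaps`, the record layout, the placeholder chemical names, and the made-up endpoint values. Averaging over tested category members is only one simple option, and the sketch does not reproduce the QSAR Application Toolbox or any regulatory procedure.

```python
from statistics import mean


def fill_data_gaps(category, endpoint):
    """Estimate missing endpoint values from tested members of one category.

    `category` is a list of dicts describing chemicals already judged to
    behave similarly; a missing measurement is stored as None.
    """
    measured = [c[endpoint] for c in category if c.get(endpoint) is not None]
    if not measured:
        raise ValueError("no tested member in this category for " + endpoint)
    estimate = mean(measured)  # simplest possible read-across: the category mean

    filled = []
    for chemical in category:
        record = dict(chemical)  # copy so the input records stay unchanged
        missing = record.get(endpoint) is None
        if missing:
            record[endpoint] = estimate
        record["estimated"] = missing  # flag estimated values for reviewers
        filled.append(record)
    return filled


# Hypothetical category with made-up values (not real test data).
category = [
    {"name": "chemical_A", "log_toxicity": 1.0},
    {"name": "chemical_B", "log_toxicity": 2.0},
    {"name": "chemical_C", "log_toxicity": None},  # untested member
]
filled = fill_data_gaps(category, "log_toxicity")
```

Flagging each estimated value keeps measured and inferred data distinguishable, which matters when the filled table is later used in a hazard assessment.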

17.
The article presents a simple and general methodology intended especially for the optimization of complex, strongly nonlinear systems for which no extensive knowledge or precise models are available. The optimization problem is solved by means of a simple genetic algorithm, and the results are interpreted from both the mathematical point of view (the minimization of the objective function) and the technological point of view (the estimation of the achievement of individual objectives in multiobjective optimization). The use of a scalar objective function is supported by the fact that the genetic algorithm also computes the weights attached to the individual objectives along with the optimal values of the decision variables. The optimization strategy is accomplished in three stages: (1) the design and training of the neural model by a new method based on a genetic algorithm in which information about the network is coded into the chromosomes; (2) the actual optimization based on genetic algorithms, which involves testing different values for parameters and different variants of the algorithm, computing the weights of the individual objectives, and determining the optimal values for the decision variables; (3) the decision of the user, who chooses a solution based on technological criteria. © 2007 Wiley Periodicals, Inc. Int J Quantum Chem, 2008
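The idea of minimizing a scalar, weighted objective with a simple genetic algorithm can be sketched as below. This is a toy under stated assumptions, not the article's method: the two objectives, the function names, and all parameter values are hypothetical, there is a single decision variable, no neural model is involved, and the weights are fixed in advance, whereas the article lets the genetic algorithm compute them.

```python
import random


def weighted_objective(x, weights):
    """Scalarize two hypothetical objectives into one value to minimize."""
    f1 = (x - 1.0) ** 2  # objective 1: stay close to 1
    f2 = (x - 3.0) ** 2  # objective 2: stay close to 3
    return weights[0] * f1 + weights[1] * f2


def run_ga(objective, bounds, pop_size=30, generations=60,
           mutation_scale=0.1, seed=0):
    """Minimize `objective` over one real variable with a simple GA."""
    rng = random.Random(seed)  # private generator for reproducibility
    lo, hi = bounds
    population = [rng.uniform(lo, hi) for _ in range(pop_size)]
    history = []  # best objective value seen in each generation

    for _ in range(generations):
        population.sort(key=objective)
        history.append(objective(population[0]))
        parents = population[: pop_size // 2]  # truncation selection
        children = [population[0]]  # elitism: always keep the current best
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            alpha = rng.random()
            child = alpha * a + (1.0 - alpha) * b  # arithmetic crossover
            child += rng.gauss(0.0, mutation_scale)  # Gaussian mutation
            children.append(min(hi, max(lo, child)))  # clip to the bounds
        population = children

    best = min(population, key=objective)
    history.append(objective(best))
    return best, history


best, history = run_ga(lambda x: weighted_objective(x, (0.5, 0.5)),
                       (-10.0, 10.0))
```

With equal weights the scalar objective simplifies to (x - 2)^2 + 1, so the search should settle near x = 2. Because the best individual is always carried over, the recorded best value never gets worse from one generation to the next.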

18.
In 2001, the European Commission published a policy statement ("White Paper") on future chemicals regulation and risk reduction that proposed the use of non-animal test systems and tailor-made testing approaches, including (Q)SARs, to reduce financial costs and the number of test animals employed. The authors have compiled a database containing data submitted within the EU chemicals notification procedure. From these data, (Q)SARs for the prediction of local irritation/corrosion and/or sensitisation potential were developed and published. These (Q)SARs, together with an expert system supporting their use, will be submitted for official validation and application within regulatory hazard assessment strategies. The main features are: two sets of structural alerts for the prediction of skin sensitisation hazard classification as defined by the European risk phrase R43, comprising 15 rules for chemical substructures deemed to be sensitising by direct action with cells or proteins, and three rules for substructures acting indirectly, i.e., requiring biochemical transformation; and a decision support system (DSS) for the prediction of skin and/or eye lesion potential built from information extracted from our database. This DSS combines SARs defining reactive chemical substructures relevant for local lesions to be classified, and QSARs for the prediction of the absence of such a potential. The role of the BfR database, and (Q)SARs derived from it, in the use of current and future (EU) testing strategies for irritation and sensitisation is discussed.

19.
The goal of computational protein structure prediction is to provide three-dimensional (3D) structures with resolution comparable to experimental results. Comparative modeling, which predicts the 3D structure of a protein based on its sequence similarity to homologous structures, is the most accurate computational method for structure prediction. In the last two decades, significant progress has been made on comparative modeling methods. Using the large number of protein structures deposited in the Protein Data Bank (~65,000), automatic prediction pipelines are generating a tremendous number of models (~1.9 million) for sequences whose structures have not been experimentally determined. Accurate models are suitable for a wide range of applications, such as prediction of protein binding sites, prediction of the effect of protein mutations, and structure-guided virtual screening. In particular, comparative modeling has enabled structure-based drug design against protein targets with unknown structures. In this review, we describe the theoretical basis of comparative modeling, the available automatic methods and databases, and the algorithms to evaluate the accuracy of predicted structures. Finally, we discuss relevant applications in the prediction of important drug target proteins, focusing on the G protein-coupled receptor (GPCR) and protein kinase families.

20.
