Similar literature
 20 similar articles found (search time: 15 ms)
1.

An activity cliff (AC) is formed by a pair of structurally similar compounds with a large difference in potency. Accordingly, ACs reveal structure–activity relationship (SAR) discontinuity and provide SAR information for compound optimization. Herein, we have investigated whether ACs can be predicted from image data. To this end, pairs of structural analogs that formed or did not form ACs were extracted from different compound activity classes. From these compound pairs, consistently formatted images were generated. Image sets were used to train and test convolutional neural network (CNN) models to systematically distinguish between ACs and non-ACs. The CNN models were found to predict ACs with overall high accuracy, as assessed using alternative performance measures, hence establishing proof-of-principle. Moreover, gradient weights from convolutional layers were mapped to test compounds and identified characteristic structural features that contributed to successful predictions. Weight-based feature visualization revealed the ability of CNN models to learn chemistry from images at a high level of resolution and aided in the interpretation of model decisions with intrinsic black box character.
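The AC definition in this abstract can be illustrated with a small sketch. The abstract gives no numerical criteria, so the similarity cutoff (0.8) and the potency gap (2 log units, i.e. 100-fold) used below are assumptions chosen only for illustration, and `CompoundPair` and `is_activity_cliff` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CompoundPair:
    """A pair of analogs with a precomputed structural similarity."""
    similarity: float  # similarity in [0, 1], e.g. a Tanimoto coefficient
    potency_a: float   # logarithmic potency of compound A (e.g. pKi)
    potency_b: float   # logarithmic potency of compound B (e.g. pKi)


def is_activity_cliff(pair, min_similarity=0.8, min_potency_gap=2.0):
    """Return True if the pair is structurally similar but differs strongly in potency.

    Both thresholds are illustrative assumptions, not values from the abstract.
    """
    similar = pair.similarity >= min_similarity
    large_gap = abs(pair.potency_a - pair.potency_b) >= min_potency_gap
    return similar and large_gap


if __name__ == "__main__":
    pairs = [
        CompoundPair(0.90, 8.5, 6.0),  # similar, 2.5 log units apart
        CompoundPair(0.90, 7.0, 6.5),  # similar, small potency difference
        CompoundPair(0.50, 9.0, 5.0),  # large gap, but not similar
    ]
    for pair in pairs:
        print(pair, is_activity_cliff(pair))
```

Labels produced this way would only define the AC and non-AC classes; the image generation and CNN training described in the abstract are not covered by this sketch.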


2.
3.
Inferring molecular structure from Nuclear Magnetic Resonance (NMR) measurements requires an accurate forward model that can predict chemical shifts from 3D structure. Current forward models are limited to specific molecules like proteins and state-of-the-art models are not differentiable. Thus they cannot be used with gradient methods like biased molecular dynamics. Here we use graph neural networks (GNNs) for NMR chemical shift prediction. Our GNN can model chemical shifts accurately and capture important phenomena like hydrogen bonding induced downfield shift between multiple proteins, secondary structure effects, and predict shifts of organic molecules. Previous empirical NMR models of protein NMR have relied on careful feature engineering with domain expertise. These GNNs are trained from data alone with no feature engineering yet are as accurate and can work on arbitrary molecular structures. The models are also efficient, able to compute one million chemical shifts in about 5 seconds. This work enables a new category of NMR models that have multiple interacting types of macromolecules.

This model can predict chemical shifts on proteins and small molecules purely from atom elements and coordinates. It can capture important phenomena like hydrogen bonding induced downfield shift and can thus be used to infer intermolecular interactions.

4.
5.
A multilayered feed-forward ANN architecture trained using the error-back-propagation (EBP) algorithm has been developed for predicting whether a given nucleotide sequence is a mycobacterial promoter sequence. Owing to the high prediction capability (97%) of the developed network model, it has been further used in conjunction with the caliper randomization (CR) approach for determining the structurally/functionally important regions in the promoter sequences. The results obtained thereby indicate that (i) the upstream region of the −35 box, (ii) the −35 region, (iii) the spacer region, and (iv) the −10 box are important for mycobacterial promoters. The CR approach also suggests that the −38 to −29 region plays a significant role in determining whether a given sequence is a mycobacterial promoter. In essence, the present study establishes ANNs as a tool for predicting mycobacterial promoter sequences and determining structurally/functionally important sub-regions therein.

6.
In this article, an artificial neural network was implemented to predict the flash point of 95 esters. Four input variables were used for its development. A neural network with a 4‐5‐8‐5‐1 topology was found to give the best agreement between experimental and predicted results (the squared correlation coefficient (R²) and root mean square error were 0.99 and 5.46 K for the training phase and 0.96 and 13.02 K for the testing set). © 2012 Wiley Periodicals, Inc.
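To give a sense of the size of a 4‐5‐8‐5‐1 topology, the following sketch counts its trainable parameters. It assumes fully connected layers with one bias per unit, which the abstract does not state; `count_parameters` is a hypothetical helper.

```python
def count_parameters(layer_sizes):
    """Count weights and biases of a fully connected feed-forward network.

    Assumes every unit is connected to all units of the previous layer
    and has its own bias term.
    """
    return sum(
        (n_in + 1) * n_out  # n_in weights plus one bias for each of n_out units
        for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
    )


if __name__ == "__main__":
    topology = [4, 5, 8, 5, 1]  # inputs, three hidden layers, one output
    print(count_parameters(topology))
```

Under these assumptions the layers contribute 25, 48, 45, and 6 parameters respectively.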

7.
Aqueous/organic phase partition coefficients of organic acids were predicted using an artificial neural network (ANN) algorithm taking benzoic acid derivatives as examples. The partition coefficients were determined by extraction of the acids from aqueous salt solutions with hydrophilic solvents (n-BuOH, i-BuOH, and t-BuOH). Using the ANN approach makes it possible to obtain quantitative information on the values of the title parameters. Published in Russian in Izvestiya Akademii Nauk. Seriya Khimicheskaya, No. 2, pp. 207–212, February, 2006.

8.
9.
A broad collection of technologies, including e.g. drug metabolism, biofuel combustion, photochemical decontamination of water, and interfacial passivation in energy production/storage systems rely on chemical processes that involve bond-breaking molecular reactions. In this context, a fundamental thermodynamic property of interest is the bond dissociation energy (BDE) which measures the strength of a chemical bond. Fast and accurate prediction of BDEs for arbitrary molecules would lay the groundwork for data-driven projections of complex reaction cascades and hence a deeper understanding of these critical chemical processes and, ultimately, how to reverse design them. In this paper, we propose a chemically inspired graph neural network machine learning model, BonDNet, for the rapid and accurate prediction of BDEs. BonDNet maps the difference between the molecular representations of the reactants and products to the reaction BDE. Because of the use of this difference representation and the introduction of global features, including molecular charge, it is the first machine learning model capable of predicting both homolytic and heterolytic BDEs for molecules of any charge. To test the model, we have constructed a dataset of both homolytic and heterolytic BDEs for neutral and charged (−1 and +1) molecules. BonDNet achieves a mean absolute error (MAE) of 0.022 eV for unseen test data, significantly below chemical accuracy (0.043 eV). Besides the ability to handle complex bond dissociation reactions that no previous model could consider, BonDNet distinguishes itself even in only predicting homolytic BDEs for neutral molecules; it achieves an MAE of 0.020 eV on the PubChem BDE dataset, a 20% improvement over the previous best performing model. 
We gain additional insight into the model's predictions by analyzing the patterns in the features representing the molecules and the bond dissociation reactions, which are qualitatively consistent with chemical rules and intuition. BonDNet is just one application of our general approach to representing and learning chemical reactivity, and it could be easily extended to the prediction of other reaction properties in the future.

Prediction of bond dissociation energies for charged molecules with a graph neural network enabled by global molecular features and reaction difference features between products and reactants.
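The "reaction difference features" idea can be sketched in a few lines. The sketch assumes that each molecule has already been encoded as a fixed-length vector and that multiple reactants or products are pooled by summation; both choices, the toy numbers, and the name `reaction_difference` are assumptions for illustration and are not taken from the BonDNet implementation.

```python
def reaction_difference(reactants, products):
    """Return the elementwise difference (products minus reactants).

    `reactants` and `products` are non-empty lists of equal-length
    feature vectors, one vector per molecule. Vectors on each side are
    summed before subtraction (an assumed pooling choice).
    """
    dim = len(reactants[0])

    def pooled(vectors):
        return [sum(vec[i] for vec in vectors) for i in range(dim)]

    reactant_total = pooled(reactants)
    product_total = pooled(products)
    return [p - r for p, r in zip(product_total, reactant_total)]


if __name__ == "__main__":
    # Toy example: one parent molecule dissociating into two fragments.
    parent = [[1.0, 2.0, 3.0]]
    fragments = [[0.5, 1.0, 1.0], [0.25, 0.5, 1.5]]
    print(reaction_difference(parent, fragments))
```

In a model of this kind, the resulting difference vector would be passed to a regression head that outputs the BDE; that part is omitted here.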

10.
Journal of Thermal Analysis and Calorimetry - This paper deals with the recognition of selected burning liquids by convolutional neural networks (CNNs). Three CNNs (AlexNet, GoogLeNet and...

11.
In recent years, deep learning has become a modern machine learning technique used in a variety of fields with state‐of‐the‐art performance. Using deep learning to enhance performance is therefore also an important approach for the bioinformatics field. In this study, we use deep learning via convolutional neural networks and position specific scoring matrices to identify electron transport proteins, which carry out an important molecular function among transmembrane proteins. Our deep learning method yields a precise model for identifying electron transport proteins, achieving a sensitivity of 80.3%, a specificity of 94.4%, an accuracy of 92.3%, and an MCC of 0.71 on an independent dataset. The proposed technique can serve as a powerful tool for identifying electron transport proteins and can help biologists understand the function of the electron transport proteins. Moreover, this study provides a basis for further research that can enrich the field of applying deep learning in bioinformatics. © 2017 Wiley Periodicals, Inc.
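The four measures quoted in this abstract all derive from a binary confusion matrix. The sketch below shows the standard formulas; the counts in the example are made up and are not the study's data, and `binary_metrics` is a hypothetical helper name.

```python
import math


def binary_metrics(tp, fn, tn, fp):
    """Compute sensitivity, specificity, accuracy, and MCC from counts.

    Assumes at least one positive and one negative sample. MCC is set to
    0.0 when its denominator is zero, a common convention.
    """
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denominator if denominator else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }


if __name__ == "__main__":
    # Hypothetical counts for illustration only.
    for name, value in binary_metrics(tp=8, fn=2, tn=18, fp=2).items():
        print(f"{name}: {value:.3f}")
```

MCC is worth reporting next to accuracy on imbalanced test sets, because accuracy alone can stay high even when the minority class is predicted poorly.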

12.
This article provides a systematic study of several important parameters of the Associative Neural Network (ASNN), such as the number of networks in the ensemble, distance measures, neighbor functions, selection of smoothing parameters, and strategies for the user-training feature of the algorithm. The performance of the different methods is assessed with several training/test sets used to predict lipophilicity of chemical compounds. The Spearman rank-order correlation coefficient and Parzen-window regression methods provide the best performance of the algorithm. If additional user data is available, the prediction of the lipophilicity of chemicals can be improved by a factor of up to 2-5 when the appropriate smoothing parameters for the neural network are selected. The best combinations of parameters and strategies found are implemented in the ALOGPS 2.1 program that is publicly available at http://www.vcclab.org/lab/alogps.
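The Spearman rank-order correlation coefficient named in this abstract can be computed as the Pearson correlation of ranks. The sketch below is a generic, dependency-free version with average ranks for ties; how ASNN applies the coefficient internally is not described in the abstract, so no claim is made about that here, and the function names are hypothetical.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman rank-order correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


if __name__ == "__main__":
    print(spearman([0.1, 0.4, 0.9, 1.6], [10, 20, 30, 40]))
    print(spearman([1, 2, 3], [3, 2, 1]))
```

Because only ranks enter the calculation, the coefficient is insensitive to any monotonic rescaling of either input.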

13.
A general purpose computational paradigm using neural networks is shown to be capable of efficiently predicting properties of polymeric compounds based on the structure and composition of the monomeric repeat unit. Results are discussed for the prediction of the heat capacity, glass transition temperature, melting temperature, change in the heat capacity at the glass transition temperature, degradation temperature, tensile strength and modulus, ultimate elongation, and compressive strength for 11 different families of polymers. The accuracies of the predictions range from 1–13% average absolute error. The worst results were obtained for the mechanical properties (tensile strength and modulus: 13%, 7%; elongation: 12%; and compressive strength: 8%) and the best results for the thermal properties (heat capacity, glass transition temperature, and melting point: <4%). A simple modification to the overall method is devised to better take into account the fact that the mechanical properties are experimentally determined with a fairly large range (due to variability in measurement procedures and especially the sample). This modification treats the bounds on the range for the mechanical properties as complex numbers (complex, modular neural networks) and leads to more rapid optimization with a smaller average error (reduced by 3%).

Dedicated to Professor Bernhard Wunderlich on the occasion of his 65th birthday.

This research was sponsored by the Division of Materials Sciences, Office of Basic Energy Sciences, U.S. Department of Energy, under Contract No. DE-AC05-84R21400 with Lockheed Martin Energy Systems, Inc. We would like to express our gratitude for the continued collaboration, support, and interest of Prof. Wunderlich in our research. We would also like to thank participants of the 1st DOE Workshop on Applications of Neural Networks in Materials Sciences for useful discussion on materials properties and neural networks.

14.
Predicting drug–target affinity (DTA) is beneficial for accelerating drug discovery. Graph neural networks (GNNs) have been widely used in DTA prediction. However, existing shallow GNNs are insufficient to capture the global structure of compounds. Besides, the interpretability of the graph-based DTA models highly relies on the graph attention mechanism, which cannot reveal the global relationship between each atom of a molecule. In this study, we proposed a deep multiscale graph neural network based on chemical intuition for DTA prediction (MGraphDTA). We introduced a dense connection into the GNN and built a super-deep GNN with 27 graph convolutional layers to capture the local and global structure of the compound simultaneously. We also developed a novel visual explanation method, gradient-weighted affinity activation mapping (Grad-AAM), to analyze a deep learning model from the chemical perspective. We evaluated our approach using seven benchmark datasets and compared the proposed method to the state-of-the-art deep learning (DL) models. MGraphDTA outperforms other DL-based approaches significantly on various datasets. Moreover, we show that Grad-AAM creates explanations that are consistent with pharmacologists, which may help us gain chemical insights directly from data beyond human perception. These advantages demonstrate that the proposed method improves the generalization and interpretation capability of DTA prediction modeling.

MGraphDTA is designed to capture the local and global structure of a compound simultaneously for drug–target affinity prediction and can provide explanations that are consistent with pharmacologists.

15.
The hydrocarbon in-adamantane (1), a high-energy adamantane isomer in which one methine hydrogen atom is inside the cage, is predicted by ab initio calculations to be isolable at dry ice temperature. It has 440 kJ/mol of hydrogenic strain but appears to be stable against dimerization, moisture, and air. The inverted CH bond is compressed, and the IR and NMR spectra are unusual. The symmetrical pentadecafluoro derivative (2) has an estimated half-life of 100 years at room temperature.

16.
The rates of liquid-phase, acid-catalyzed reactions relevant to the upgrading of biomass into high-value chemicals are highly sensitive to solvent composition and identifying suitable solvent mixtures is theoretically and experimentally challenging. We show that the complex atomistic configurations of reactant–solvent environments generated by classical molecular dynamics simulations can be exploited by 3D convolutional neural networks to enable accurate predictions of Brønsted acid-catalyzed reaction rates for model biomass compounds. We develop a 3D convolutional neural network, which we call SolventNet, and train it to predict acid-catalyzed reaction rates using experimental reaction data and corresponding molecular dynamics simulation data for seven biomass-derived oxygenates in water–cosolvent mixtures. We show that SolventNet can predict reaction rates for additional reactants and solvent systems an order of magnitude faster than prior simulation methods. This combination of machine learning with molecular dynamics enables the rapid, high-throughput screening of solvent systems and identification of improved biomass conversion conditions.

Solvent-mediated, acid-catalyzed reaction rates relevant to the upgrading of biomass into high-value chemicals are accurately predicted using a combination of molecular dynamics simulations and 3D convolutional neural networks.

17.
Drug–drug interactions (DDIs) can trigger unexpected pharmacological effects on the body, and the causal mechanisms are often unknown. Graph neural networks (GNNs) have been developed to better understand DDIs. However, identifying key substructures that contribute most to the DDI prediction is a challenge for GNNs. In this study, we presented a substructure-aware graph neural network, a message passing neural network equipped with a novel substructure attention mechanism and a substructure–substructure interaction module (SSIM) for DDI prediction (SA-DDI). Specifically, the substructure attention was designed to capture size- and shape-adaptive substructures based on the chemical intuition that the sizes and shapes are often irregular for functional groups in molecules. DDIs are fundamentally caused by chemical substructure interactions. Thus, the SSIM was used to model the substructure–substructure interactions by highlighting important substructures while de-emphasizing the minor ones for DDI prediction. We evaluated our approach in two real-world datasets and compared the proposed method with the state-of-the-art DDI prediction models. The SA-DDI surpassed other approaches on the two datasets. Moreover, the visual interpretation results showed that the SA-DDI was sensitive to the structure information of drugs and was able to detect the key substructures for DDIs. These advantages demonstrated that the proposed method improved the generalization and interpretation capability of DDI prediction modeling.

SA-DDI is designed to learn size-adaptive molecular substructures for drug–drug interaction prediction and can provide explanations that are consistent with pharmacologists.

18.
Proteins are among the most important molecules governing cellular processes in most living organisms. Understanding the various functions of proteins is of paramount importance to understanding the basics of life. Several supervised learning approaches have been applied in this field to predict the functionality of proteins. In this paper, we propose a convolutional neural network based approach, ProtConv, to predict the functionality of proteins by converting the amino-acid sequences to a two dimensional image. We have used a protein embedding technique using transfer learning to generate the feature vector. The feature vector is then converted into a square sized single channel image to be fed into a convolutional network. The neural network architecture used here is a combination of convolutional filters and average pooling layers followed by dense fully connected layers to predict a binary function. We have performed experiments on standard benchmark datasets taken from two very important protein function prediction tasks: proinflammatory cytokines and anticancer peptides. Our experiments show that the proposed method, ProtConv, achieves state-of-the-art performance on both datasets. All necessary details about implementation with source code and datasets are made available at: https://github.com/swakkhar/ProtConv.
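The step of turning a feature vector into a square single-channel image can be sketched as follows. The abstract does not say how vectors whose length is not a perfect square are handled, so the zero padding used here is an assumption, and `vector_to_square_image` is a hypothetical name rather than a function from the ProtConv repository.

```python
import math


def vector_to_square_image(vector, pad_value=0.0):
    """Reshape a 1D feature vector into a square 2D grid (list of rows).

    The side length is the smallest integer whose square is at least
    len(vector); trailing cells are filled with `pad_value` (an assumed
    padding scheme).
    """
    side = math.isqrt(len(vector))
    if side * side < len(vector):
        side += 1
    padded = list(vector) + [pad_value] * (side * side - len(vector))
    return [padded[row * side:(row + 1) * side] for row in range(side)]


if __name__ == "__main__":
    embedding = [0.1, 0.2, 0.3, 0.4, 0.5]  # toy embedding, not real data
    for row in vector_to_square_image(embedding):
        print(row)
```

The resulting grid has a single channel, so a convolutional layer would receive it with shape (1, side, side) or (side, side, 1) depending on the framework's convention.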

19.
20.
As several structural proteomic projects are producing an increasing number of protein structures with unknown function, methods that can reliably predict protein functions from protein structures are in urgent need. In this paper, we present a method to explore the clustering patterns of amino acids on the 3-dimensional space for protein function prediction. First, amino acid residues on a protein structure are clustered into spatial groups using hierarchical agglomerative clustering, based on the distance between them. Second, the protein structure is represented using a graph, where each node denotes a cluster of amino acids. The nodes are labeled with an evolutionary profile derived from the multiple alignment of homologous sequences. Then, a shortest-path graph kernel is used to calculate similarities between the graphs. Finally, a support vector machine using this graph kernel is used to train classifiers for protein function prediction. We applied the proposed method to two separate problems, namely, prediction of enzymes and prediction of DNA-binding proteins. In both cases, the results showed that the proposed method outperformed other state-of-the-art methods.
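The shortest-path graph kernel step can be illustrated with a simplified sketch. It assumes unweighted, undirected graphs and discrete node labels compared by exact match, whereas the abstract describes nodes labeled with evolutionary profiles, which would need a continuous label comparison. The graph encoding and all function names are hypothetical.

```python
from collections import Counter
from itertools import combinations

INF = float("inf")


def all_pairs_shortest_paths(n_nodes, edges):
    """Floyd-Warshall on an unweighted, undirected graph."""
    dist = [[0 if i == j else INF for j in range(n_nodes)] for i in range(n_nodes)]
    for u, v in edges:
        dist[u][v] = dist[v][u] = 1
    for k in range(n_nodes):
        for i in range(n_nodes):
            for j in range(n_nodes):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist


def path_features(n_nodes, edges, labels):
    """Count (label, label, shortest-path length) triples over connected node pairs."""
    dist = all_pairs_shortest_paths(n_nodes, edges)
    features = Counter()
    for i, j in combinations(range(n_nodes), 2):
        if dist[i][j] != INF:
            first, second = sorted((labels[i], labels[j]))
            features[(first, second, dist[i][j])] += 1
    return features


def shortest_path_kernel(graph_a, graph_b):
    """Dot product of the two graphs' path-feature counts."""
    feats_a = path_features(*graph_a)
    feats_b = path_features(*graph_b)
    return sum(count * feats_b[key] for key, count in feats_a.items())


if __name__ == "__main__":
    # Each graph is (number of nodes, edge list, node labels); toy data only.
    chain = (3, [(0, 1), (1, 2)], ["A", "B", "A"])
    pair = (2, [(0, 1)], ["A", "B"])
    print(shortest_path_kernel(chain, chain), shortest_path_kernel(chain, pair))
```

A kernel matrix built from such pairwise values could then be supplied to a support vector machine that accepts precomputed kernels, which corresponds to the final classification step in the abstract.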
