Similar articles
 Found 20 similar articles (search time: 15 ms)
1.
Recently, machine learning has emerged as an alternative, powerful approach for predicting quantum‐mechanical properties of molecules and solids. Here, using kernel ridge regression and atomic fingerprints representing local environments of atoms, we trained a machine‐learning model on a crystalline silicon system to directly predict the atomic forces at a wide range of temperatures. Our idea is to construct a machine‐learning model using a quantum‐mechanical dataset taken from canonical‐ensemble simulations at a higher temperature, or an upper bound of the temperature range. With our model, the force prediction errors were about 2% or smaller with respect to the corresponding force ranges, in the temperature region between 300 K and 1650 K. We also verified the applicability to a larger system, ensuring the transferability with respect to system size.
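The kernel ridge regression step named in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the two-component "fingerprints", the scalar force targets, the Gaussian kernel, and the values of `lam` and `sigma` are all assumptions chosen for illustration.

```python
import math

def gaussian_kernel(x, y, sigma=1.0):
    """Similarity between two fingerprint vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def krr_fit(X, y, lam=1e-8, sigma=1.0):
    """Regression weights alpha = (K + lam * I)^-1 y."""
    n = len(X)
    K = [[gaussian_kernel(X[i], X[j], sigma) + (lam if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    return solve(K, y)

def krr_predict(X_train, alpha, x, sigma=1.0):
    """Prediction is a kernel-weighted sum over the training environments."""
    return sum(a * gaussian_kernel(xt, x, sigma) for a, xt in zip(alpha, X_train))

# Hypothetical toy data: four local environments and one force component each.
X = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
y = [0.0, 0.5, -0.5, 0.1]
alpha = krr_fit(X, y)
pred = krr_predict(X, alpha, [1.0, 0.0])
```

With a small regularizer the model nearly interpolates its training data, so `pred` lies close to the stored target for that environment; a larger `lam` trades that fit for smoothness.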

2.
The aim of computational molecular design is the identification of promising hypothetical molecules with a predefined set of desired properties. We address the issue of accelerating material discovery with state-of-the-art machine learning techniques. The method involves two types of prediction: forward and backward. The objective of the forward prediction is to create a set of machine learning models on various properties of a given molecule. Inverting the trained forward models through Bayes’ law, we derive a posterior distribution for the backward prediction, which is conditioned on a desired property requirement. By exploring high-probability regions of the posterior with a sequential Monte Carlo technique, molecules that exhibit the desired properties can be created computationally. One major difficulty in the computational creation of molecules is excluding chemically unfavorable structures. To circumvent this issue, we derive a chemical language model that acquires commonly occurring patterns of chemical fragments through natural language processing of ASCII strings of existing compounds, which follow the SMILES chemical language notation. In the backward prediction, the trained language model is used to refine chemical strings such that the properties of the resulting structures fall within the desired property region while chemically unfavorable structures are successfully removed. The present method is demonstrated through the design of small organic molecules with property requirements on the HOMO-LUMO gap and internal energy. The R package iqspr is available at the CRAN repository.
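The Bayesian inversion described in this abstract can be written compactly. The symbols are notation assumed here, not taken from the abstract: S is a candidate structure (a SMILES string), Y its property vector, and U the desired property region.

```latex
p(S \mid Y \in U) \;\propto\; p(Y \in U \mid S)\, p(S)
```

Read this way, the forward models supply the likelihood p(Y ∈ U | S), and the chemical language model plausibly plays the role of the prior p(S), which is what keeps chemically unfavorable strings at low posterior probability.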

3.
4.
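Metal oxide semiconductor (MOS) gas sensors, which offer small size, low power consumption, high sensitivity, and good compatibility with silicon processing, are now widely used in military, scientific research, and many sectors of the national economy. However, the low selectivity of MOS sensors hinders their application prospects in the Internet of Things (IoT) era. This review therefore surveys research progress on improving the selectivity of MOS sensors, focusing on three technical approaches: enhancing the performance of sensing materials, electronic noses, and thermal modulation. It discusses the current problems of each approach and their future development trends. It also compares the mainstream pattern recognition/machine learning algorithms in machine olfaction: principal component analysis (PCA), linear discriminant analysis (LDA), and neural networks (NN). Finally, the review looks ahead to the application prospects in gas identification of convolutional neural network (CNN) deep learning algorithms, which offer dimensionality reduction, feature extraction, and robust recognition and classification. Improved sensing materials, the combination of multiple modulation methods with array technology, and recent advances in deep learning algorithms from artificial intelligence (AI) will greatly enhance the ability of non-selective MOS sensors to identify volatile organic compound (VOC) molecules.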

5.
6.
7.
8.
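Nested prediction models for metabolites were built on the support vector machine (SVM) and KStar methods: a molecular shape discrimination model and a metabolic reaction site discrimination model. The molecular shape discrimination model used 272 molecules, for which 1280 molecular descriptors were calculated, including molecular topology, 2D autocorrelation, and geometric structure descriptors; the accuracy of classification models built with four machine learning methods (support vector machine, decision tree, Bayesian network, and k-nearest neighbors) was examined. The results show that the support vector machine outperforms the other methods, and this model can be used to predict whether a molecule can undergo O-dealkylation catalyzed by cytochrome P450 enzymes. The metabolic reaction site discrimination model used 538 O-dealkylation metabolic sites, for which 26 quantum chemical features characterizing atomic energy, valence state, charge, and so on were calculated; the accuracies of decision tree, Bayesian network, KStar, and artificial neural network models were compared. The results show that the accuracy, sensitivity, and specificity of the KStar model all exceed 90%; for molecules selected by the molecular shape discrimination model, this model identifies well which C−O bond is cleaved. Fifteen traditional Chinese medicine molecules with well-defined metabolic reactions were used as a validation set to verify model accuracy. The results indicate that the nested prediction model based on SVM and KStar is reasonably accurate and can support research on predicting O-dealkylation metabolites of traditional Chinese medicine molecules.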

9.
《中国化学快报》2023,34(2):107514
From the ZINC database, with a total of 1.8 million small molecules, four compounds are identified as prolyl hydroxylase 2 inhibitors through a virtual screening workflow that sequentially incorporates machine learning, molecular docking, and molecular dynamics. Among them, compound 103, (E)-5-(5-((2-(1H-tetrazol-5-yl)hydrazineylidene)methyl)furan-2-yl)isoindoline-1,3-dione, promotes the migration and capillary tube formation capacity of human umbilical vein endothelial cells by enhancing the stability of hypoxia inducible factor-1α and increasing the level of vascular endothelial growth factor.

10.
Modern functional materials consist of large molecular building blocks with significant chemical complexity which limits spectroscopic property prediction with accurate first-principles methods. Consequently, a targeted design of materials with tailored optoelectronic properties by high-throughput screening is bound to fail without efficient methods to predict molecular excited-state properties across chemical space. In this work, we present a deep neural network that predicts charged quasiparticle excitations for large and complex organic molecules with a rich elemental diversity and a size well out of reach of accurate many body perturbation theory calculations. The model exploits the fundamental underlying physics of molecular resonances as eigenvalues of a latent Hamiltonian matrix and is thus able to accurately describe multiple resonances simultaneously. The performance of this model is demonstrated for a range of organic molecules across chemical composition space and configuration space. We further showcase the model capabilities by predicting photoemission spectra at the level of the GW approximation for previously unseen conjugated molecules.

A physically-inspired machine learning model for orbital energies is developed that can be augmented with delta learning to obtain photoemission spectra, ionization potentials, and electron affinities with experimental accuracy.
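The idea of reading molecular resonances off as eigenvalues of a latent Hamiltonian can be illustrated with a toy case. The sketch below assumes a 2×2 symmetric matrix whose three entries stand in for a network's output; the values and the `latent_h` name are hypothetical, and the actual model presumably uses a larger matrix.

```python
import math

def eigenvalues_2x2(h11, h12, h22):
    """Eigenvalues of the symmetric matrix [[h11, h12], [h12, h22]], ascending."""
    mean = 0.5 * (h11 + h22)
    radius = math.hypot(0.5 * (h11 - h22), h12)
    return (mean - radius, mean + radius)

# Hypothetical network output for one molecule, in eV.
latent_h = {"h11": -9.0, "h12": 0.5, "h22": -7.0}
energies = eigenvalues_2x2(**latent_h)
```

Because both energies come from one matrix, they are predicted together, and the off-diagonal entry lets the model express coupling between the two resonances rather than fitting each level separately.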

11.
Radical C−H bond functionalization provides a versatile approach for elaborating heterocyclic compounds. The synthetic design of this transformation relies heavily on the knowledge of regioselectivity, while a quantified and efficient regioselectivity prediction approach is still elusive. Herein, we report the feasibility of using a machine learning model to predict the transition state barrier from the computed properties of isolated reactants. This enables rapid and reliable regioselectivity prediction for radical C−H bond functionalization of heterocycles. The Random Forest model with physical organic features achieved 94.2 % site accuracy and 89.9 % selectivity accuracy in the out‐of‐sample test set. The prediction performance was further validated by comparing the machine learning results with additional substituents, heteroarene scaffolds and experimental observations. This work revealed that the combination of mechanism‐based computational statistics and machine learning model can serve as a useful strategy for selectivity prediction of organic transformations.

12.
Many chemoinformatics applications, including high-throughput virtual screening, benefit from being able to rapidly predict the physical, chemical, and biological properties of small molecules to screen large repositories and identify suitable candidates. When training sets are available, machine learning methods provide an effective alternative to ab initio methods for these predictions. Here, we leverage rich molecular representations including 1D SMILES strings, 2D graphs of bonds, and 3D coordinates to derive efficient machine learning kernels to address regression problems. We further expand the library of available spectral kernels for small molecules developed for classification problems to include 2.5D surface and 3D kernels using Delaunay tetrahedrization and other techniques from computational geometry, 3D pharmacophore kernels, and 3.5D or 4D kernels capable of taking into account multiple molecular configurations, such as conformers. The kernels are comprehensively tested using cross-validation and redundancy-reduction methods on regression problems using several available data sets to predict boiling points, melting points, aqueous solubility, octanol/water partition coefficients, and biological activity with state-of-the-art results. When sufficient training data are available, 2D spectral kernels in general tend to yield the best and most robust results, better than the state of the art. On data sets containing thousands of molecules, the kernels achieve a squared correlation coefficient of 0.91 for aqueous solubility prediction and 0.94 for octanol/water partition coefficient prediction. Averaging over conformations improves the performance of kernels based on the three-dimensional structure of molecules, especially on challenging data sets. Kernel predictors for aqueous solubility (kSOL), LogP (kLOGP), and melting point (kMELT) are available over the Web through: http://cdb.ics.uci.edu.

13.
Radical C−H bond functionalization provides a versatile approach for elaborating heterocyclic compounds. The synthetic design of this transformation relies heavily on the knowledge of regioselectivity, while a quantified and efficient regioselectivity prediction approach is still elusive. Herein, we report the feasibility of using a machine learning model to predict the transition state barrier from the computed properties of isolated reactants. This enables rapid and reliable regioselectivity prediction for radical C−H bond functionalization of heterocycles. The Random Forest model with physical organic features achieved 94.2 % site accuracy and 89.9 % selectivity accuracy in the out-of-sample test set. The prediction performance was further validated by comparing the machine learning results with additional substituents, heteroarene scaffolds and experimental observations. This work revealed that the combination of mechanism-based computational statistics and machine learning model can serve as a useful strategy for selectivity prediction of organic transformations.

14.
Molecular “fingerprints” encoding structural information are the workhorse of cheminformatics and machine learning in drug discovery applications. However, fingerprint representations necessarily emphasize particular aspects of the molecular structure while ignoring others, rather than allowing the model to make data-driven decisions. We describe molecular graph convolutions, a machine learning architecture for learning from undirected graphs, specifically small molecules. Graph convolutions use a simple encoding of the molecular graph (atoms, bonds, distances, etc.), which allows the model to take greater advantage of information in the graph structure. Although graph convolutions do not outperform all fingerprint-based methods, they (along with other graph-based methods) represent a new paradigm in ligand-based virtual screening with exciting opportunities for future improvement.
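A single graph-convolution step of the kind this abstract alludes to might look like the sketch below. It is a deliberately simplified stand-in, not the architecture from the paper: each atom carries one scalar feature, and the weights `w_self` and `w_nbr`, the atomic-number features, and the sum readout are all assumptions.

```python
def graph_conv_step(features, neighbors, w_self=1.0, w_nbr=0.5):
    """One simplified convolution: mix each atom's feature with the sum of its neighbors'."""
    updated = []
    for atom, h in enumerate(features):
        nbr_sum = sum(features[j] for j in neighbors[atom])
        updated.append(max(0.0, w_self * h + w_nbr * nbr_sum))  # ReLU activation
    return updated

# Ethanol heavy atoms C0-C1-O2; the feature is the atomic number (a toy choice).
features = [6.0, 6.0, 8.0]
neighbors = {0: [1], 1: [0, 2], 2: [1]}
h1 = graph_conv_step(features, neighbors)
molecule_vector = sum(h1)  # order-independent readout over atoms
```

The update depends only on which atoms are bonded, not on how they are numbered, so the summed readout is unchanged if the atoms are relabeled. That is the property that lets the model learn from the graph directly instead of from a fixed fingerprint.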

15.
陈乐添  张旭  陈安  姚赛  胡绪  周震 《催化学报》2022,43(1):11-32
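As the conflict between growing energy demand and the depletion of fossil fuel resources becomes increasingly prominent, and as burning non-renewable resources such as oil and natural gas causes environmental problems and global warming, clean renewable energy is receiving more and more attention. Consequently, sustainable technologies, including energy conversion and reversible energy use, have attracted wide interest. Among them, electrocatalysis is regarded as an important method for clean energy conversion. At present, catalysts for electrocatalytic reactions are still mainly noble metals. However, the high price of noble metals greatly...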

16.
A number of properties of liquids with polar molecules are considered using water and ammonia as examples in the discrete approach, i.e., placing the centers of molecules on nodes of a particular lattice while assuming different orientations of the molecules. Different experimental properties are explained by different assumptions concerning the average values of pair orientations of adjacent molecules. Rather than using a uniform thermodynamic distribution over orientations to study fast processes related to the formation and migration of solvated electrons, a quasi-chemical approximation of the lattice gas model is used to calculate the thermodynamic properties of a liquid.

17.
Small molecule targeting of RNA has emerged as a new frontier in medicinal chemistry, but compared to the protein targeting literature our understanding of chemical matter that binds to RNA is limited. In this study, we reported the Repository Of BInders to Nucleic acids (ROBIN), a new library of nucleic acid binders identified by small molecule microarray (SMM) screening. The complete results of 36 individual nucleic acid SMM screens against a library of 24 572 small molecules were reported (including a total of 1 627 072 interactions assayed). A set of 2 003 RNA-binding small molecules was identified, representing the largest fully public, experimentally derived library of its kind to date. Machine learning was used to develop highly predictive and interpretable models to characterize RNA-binding molecules. This work demonstrates that machine learning algorithms applied to experimentally derived sets of RNA binders are a powerful method to inform RNA-targeted chemical space.

18.
19.
In the past few years, there has been considerable activity in both academic and industrial research to develop innovative machine learning approaches to locate novel, high-performing molecules in chemical space. Here we describe a new and fundamentally different type of approach that provides a holistic overview of how high-performing molecules are distributed throughout a search space. Based on an open-source, graph-based implementation [J. H. Jensen, Chem. Sci., 2019, 10, 3567–3572] of a traditional genetic algorithm for molecular optimisation, and influenced by state-of-the-art concepts from soft robot design [J. B. Mouret and J. Clune, Proceedings of the Artificial Life Conference, 2012, pp. 593–594], we provide an algorithm that (i) produces a large diversity of high-performing, yet qualitatively different molecules, (ii) illuminates the distribution of optimal solutions, and (iii) improves search efficiency compared to both machine learning and traditional genetic algorithm approaches.

We report a novel algorithm that produces a large diversity of high-performing molecules, illuminates the distribution of optimal solutions, and improves search efficiency compared to both machine learning and genetic algorithm approaches.
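One way to picture "illuminating" a search space is an archive that keeps the best candidate found in each region of a descriptor space, in the spirit of MAP-Elites. That reading is an assumption here, since the abstract does not name the scheme. In the sketch below, candidates are plain numbers in [0, 1] rather than molecular graphs, and `fitness`, `descriptor`, the mutation width, and the 90% mutation rate are hypothetical stand-ins.

```python
import random

def illuminate(fitness, descriptor, n_bins=5, n_iter=2000, seed=0):
    """Keep the best candidate found in each descriptor bin (niche)."""
    rng = random.Random(seed)
    archive = {}  # bin index -> (fitness, candidate)
    for _ in range(n_iter):
        if archive and rng.random() < 0.9:
            # Mutate a randomly chosen elite from the archive.
            _, parent = rng.choice(list(archive.values()))
            child = min(1.0, max(0.0, parent + rng.gauss(0.0, 0.1)))
        else:
            # Occasionally sample a fresh random candidate.
            child = rng.random()
        niche = min(n_bins - 1, int(descriptor(child) * n_bins))
        score = fitness(child)
        # A child replaces the elite only within its own niche.
        if niche not in archive or score > archive[niche][0]:
            archive[niche] = (score, child)
    return archive

# Toy stand-ins: fitness peaks at 0.3, and the descriptor is the candidate itself.
archive = illuminate(fitness=lambda x: -(x - 0.3) ** 2, descriptor=lambda x: x)
```

Because candidates compete only within their own niche, the archive ends up holding one locally best candidate per region instead of many near-copies of a single optimum. That is what yields diverse yet high-performing results.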

20.
Hundreds of catalytic methods are developed each year to meet the demand for high-purity chiral compounds. The computational design of enantioselective organocatalysts remains a significant challenge, as catalysts are typically discovered through experimental screening. Recent advances in combining quantum chemical computations and machine learning (ML) hold great potential to propel the next leap forward in asymmetric catalysis. Within the context of quantum chemical machine learning (QML, or atomistic ML), the ML representations used to encode the three-dimensional structure of molecules and evaluate their similarity cannot easily capture the subtle energy differences that govern enantioselectivity. Here, we present a general strategy for improving molecular representations within an atomistic machine learning model to predict the DFT-computed enantiomeric excess of asymmetric propargylation organocatalysts solely from the structure of catalytic cycle intermediates. Mean absolute errors as low as 0.25 kcal mol⁻¹ were achieved in predictions of the activation energy with respect to DFT computations. By virtue of its design, this strategy is generalisable to other ML models, to experimental data and to any catalytic asymmetric reaction, enabling the rapid screening of structurally diverse organocatalysts from available structural information.

A machine learning model for enantioselectivity prediction using reaction-based molecular representations.
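To see why sub-kcal accuracy in activation energies matters for enantioselectivity, a barrier difference can be converted to an enantiomeric excess. The sketch below assumes a simple two-pathway picture under transition-state theory, where the rate ratio is exp(ΔΔG‡/RT) and therefore ee = tanh(ΔΔG‡/2RT). The function name and the example barrier values are illustrative, and the paper's own workflow may weight several transition states.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal mol^-1 K^-1

def ee_from_ddg(ddg_kcal, temperature=298.15):
    """Enantiomeric excess (as a fraction) from the barrier difference between two pathways."""
    return math.tanh(ddg_kcal / (2.0 * R_KCAL * temperature))

ee_reference = ee_from_ddg(1.0)   # hypothetical 1.0 kcal/mol barrier difference
ee_shifted = ee_from_ddg(1.25)    # same case with a 0.25 kcal/mol error added
```

Comparing `ee_reference` with `ee_shifted` shows how an error of the size quoted in the abstract propagates into the predicted selectivity at room temperature.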
