Similar articles
1.
2.
Journal of the Indian Chemical Society, 2023, 100(1): 100815
The right combination of surfactants and stabilizers in a detergent formulation plays a significant role in its cleaning performance. However, choosing that combination becomes a complex optimization problem when the formulation is composed of multiple ingredients and has to be optimized for competing performance metrics. In recent times, machine learning techniques have been used extensively to study such processes. In this research, a detergent pre-formulation has been designed using an aqueous solution of Tween-20, ethanol and 1-octanol. To determine the optimal values of the ingredients of the formulation, supervised machine learning models were developed and optimized for the Ross Miles Index 30 ml (RMI 30) and cleaning time (CT). A full factorial experimental design was performed, and three regression models (linear, two-factor interaction (2FI), and quadratic) were developed for each of RMI 30 and CT. ANOVA of the trained models reported a p-value of 0.0018 for RMI 30 and less than 0.0001 for CT. The optimal values for RMI 30 and CT obtained through the regression models are 72.32 ml and 17.67 s. For multi-objective optimization, grey relational analysis was performed. Two pairs of optimal values corresponding to Rank 1 were recorded as 88.9 ml, 20 s (RMI 30, CT) and 81.2 ml, 14 s (RMI 30, CT). As a result, the optimal combination of Tween-20, ethanol and 1-octanol for maximizing RMI 30 and minimizing CT is reported. The obtained optimal values were experimentally validated.
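The grey relational analysis step mentioned in this abstract can be sketched roughly as follows. This is a minimal illustration of the textbook procedure (normalise each response, measure the deviation from the ideal, convert to grey relational coefficients, average into a grade), not the authors' implementation. The function name, the distinguishing coefficient `zeta = 0.5`, and the four example runs are assumptions made up for this sketch; they are not data from the paper.

```python
def grey_relational_grades(runs, larger_is_better, zeta=0.5):
    """Return one grey relational grade per run (a higher grade is better).

    runs: list of tuples, one tuple of response values per experimental run.
    larger_is_better: one bool per response (True = maximise, False = minimise).
    zeta: distinguishing coefficient; 0.5 is a common default (assumption here).
    """
    n_resp = len(larger_is_better)
    columns = [[run[j] for run in runs] for j in range(n_resp)]

    # Step 1: normalise each response to [0, 1] so that 1 is always the ideal.
    normalised = []
    for col, larger in zip(columns, larger_is_better):
        lo, hi = min(col), max(col)
        span = hi - lo
        if span == 0:
            normalised.append([1.0] * len(col))
        elif larger:
            normalised.append([(x - lo) / span for x in col])
        else:
            normalised.append([(hi - x) / span for x in col])

    # Step 2: deviation of every value from the ideal value 1.
    deviations = [[1.0 - v for v in col] for col in normalised]
    flat = [d for col in deviations for d in col]
    d_min, d_max = min(flat), max(flat)
    if d_max == 0:
        # Every run is identical, so all runs are equally ideal.
        return [1.0] * len(runs)

    # Step 3: grey relational coefficient per response, averaged into a grade.
    grades = []
    for i in range(len(runs)):
        coeffs = [
            (d_min + zeta * d_max) / (deviations[j][i] + zeta * d_max)
            for j in range(n_resp)
        ]
        grades.append(sum(coeffs) / n_resp)
    return grades


# Made-up illustrative runs: (foam volume in ml, cleaning time in s).
example_runs = [(50.0, 30.0), (90.0, 20.0), (80.0, 10.0), (60.0, 25.0)]
example_grades = grey_relational_grades(example_runs, larger_is_better=[True, False])
best_run = max(range(len(example_runs)), key=lambda i: example_grades[i])
print(best_run, [round(g, 3) for g in example_grades])
```

The run with the highest grade is ranked first. Equal weighting of the two responses is a design choice of this sketch; a weighted mean could be substituted if one metric matters more.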

3.
We examine the kinetic model D. The differential equations describing this reaction scheme are cast in a nondimensional form and analyzed in four basic approximation regimes: a ‘pseudo-first order’ approximation valid for small values of the ratio of the initial concentrations of the reactants; an asymptotic solution valid for large values of k3; the standard steady state (Bodenstein) approximation; and an approximation to a second order system without intermediate. Interconnecting relationships between the various approximations derived are examined, and the approximations are compared to numerical solutions to the full equations. The results are assessed from the standpoint of the experimental kineticist, and it is suggested that the reaction studied, and consequently many other more complex reactions, may under certain circumstances be subject to non-unique interpretation.

4.
Predicting the stereochemical outcome of chemical reactions is challenging in mechanistically ambiguous transformations. The stereoselectivity of glycosylation reactions is influenced by at least eleven factors across four chemical participants and temperature. A random forest algorithm was trained using a highly reproducible, concise dataset to accurately predict the stereoselective outcome of glycosylations. The steric and electronic contributions of all chemical reagents and solvents were quantified by quantum mechanical calculations. The trained model accurately predicts stereoselectivities for unseen nucleophiles, electrophiles, acid catalyst, and solvents across a wide temperature range (overall root mean square error 6.8%). All predictions were validated experimentally on a standardized microreactor platform. The model helped to identify novel ways to control glycosylation stereoselectivity and accurately predicts previously unknown means of stereocontrol. By quantifying the degree of influence of each variable, we begin to gain a better general understanding of the transformation, for example that environmental factors influence the stereoselectivity of glycosylations more than the coupling partners in this area of chemical space.

A random forest algorithm, trained on a concise dataset and validated experimentally, accurately predicts the stereoselectivity of a complex organic coupling varying all reaction parameters as well as previously unknown mechanistic influences.

5.
In this paper, we study the classification of unbalanced data sets of drugs. As an example we chose a data set of 2D6 inhibitors of cytochrome P450. The human cytochrome P450 2D6 isoform plays a key role in the metabolism of many drugs in the preclinical drug discovery process. We have collected a data set from annotated public data and calculated physicochemical properties with chemoinformatics methods. On top of this data, we have built classifiers based on machine learning methods. When the classes in a data set are unequally represented, conventional machine learning methods become biased toward the larger class. To overcome this problem and to obtain sensitive but also accurate classifiers, we combine machine learning and feature selection methods with techniques addressing the problem of unbalanced classification, such as oversampling and threshold moving. We have used our own implementation of a support vector machine algorithm as well as the maximum entropy method. Our feature selection is based on the unsupervised McCabe method. The classification results from our test set are compared structurally with compounds from the training set. We show that the applied algorithms enable the effective high-throughput in silico classification of potential drug candidates.
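Threshold moving, one of the imbalance techniques named in this abstract, can be illustrated with a small sketch. It assumes a classifier that already outputs a score per compound and simply picks the decision threshold that maximises balanced accuracy on held-out data. The function names, the selection criterion, and the scores and labels below are all hypothetical; the abstract does not say which criterion the authors used.

```python
def balanced_accuracy(scores, labels, threshold):
    """Mean of sensitivity and specificity when predicting 1 for score >= threshold."""
    tp = fp = tn = fn = 0
    for score, label in zip(scores, labels):
        predicted = 1 if score >= threshold else 0
        if predicted == 1 and label == 1:
            tp += 1
        elif predicted == 1 and label == 0:
            fp += 1
        elif predicted == 0 and label == 0:
            tn += 1
        else:
            fn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return 0.5 * (sensitivity + specificity)


def best_threshold(scores, labels, candidates):
    """Return the candidate threshold with the highest balanced accuracy."""
    return max(candidates, key=lambda t: balanced_accuracy(scores, labels, t))


# Made-up scores for an unbalanced set: eight non-inhibitors (0), two inhibitors (1).
example_scores = [0.05, 0.10, 0.15, 0.20, 0.25, 0.30, 0.35, 0.45, 0.40, 0.60]
example_labels = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
chosen = best_threshold(example_scores, example_labels, [0.1, 0.2, 0.3, 0.4, 0.5, 0.6])
print(chosen, balanced_accuracy(example_scores, example_labels, chosen))
```

With the default threshold of 0.5 the sketch misses one of the two minority-class examples; lowering the threshold recovers it at the cost of one false positive, which is the trade-off threshold moving is meant to expose.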

6.
Efficient target selection methods are an important prerequisite for increasing the success rate and reducing the cost of high-throughput structural genomics efforts. There is a high demand for sequence-based methods capable of predicting experimentally tractable proteins and filtering out potentially difficult targets at different stages of the structural genomics pipeline. Simple empirical rules based on anecdotal evidence are being increasingly superseded by rigorous machine-learning algorithms. Although the simplicity of less advanced methods makes them easier for humans to understand, more sophisticated formalized algorithms possess superior classification power. The quickly growing corpus of experimental success and failure data gathered by structural genomics consortia creates a unique opportunity for retrospective data mining using machine learning techniques and results in increased quality of classifiers. For example, current solubility prediction methods are reaching accuracies of over 70%. Furthermore, automated feature selection leads to better insight into the nature of the correlation between amino acid sequence and experimental outcome. In this review we summarize methods for predicting experimental success in cloning, expression, soluble expression, purification and crystallization of proteins with a special focus on publicly available resources. We also describe experimental data repositories and machine learning techniques used for classification and feature selection.

7.
Photochemical reactions are widely used by academic and industrial researchers to construct complex molecular architectures via mechanisms that often require harsh reaction conditions. Photodynamics simulations provide time-resolved snapshots of molecular excited-state structures required to understand and predict reactivities and chemoselectivities. Molecular excited states are often nearly degenerate and require computationally intensive multiconfigurational quantum mechanical methods, especially at conical intersections. Non-adiabatic molecular dynamics require thousands of these computations per trajectory, which limits simulations to ∼1 picosecond for most organic photochemical reactions. Westermayr et al. recently introduced a neural-network-based method to accelerate the predictions of electronic properties and pushed the simulation limit to 1 ns for the model system, methylenimmonium cation (CH2NH2+). We have adapted this methodology to develop the Python-based software Python Rapid Artificial Intelligence Ab Initio Molecular Dynamics (PyRAI2MD) for the cis-trans isomerization of trans-hexafluoro-2-butene and the 4π-electrocyclic ring-closing of a norbornyl cyclohexadiene. We performed a 10 ns simulation for trans-hexafluoro-2-butene in just 2 days. The same simulation would take approximately 58 years with traditional multiconfigurational photodynamics simulations. We generated training data by combining Wigner sampling, geometrical interpolations, and short-time quantum chemical trajectories to adaptively sample sparse data regions along reaction coordinates. The final data sets of the cis-trans isomerization and the 4π-electrocyclic ring-closing models have 6207 and 6267 data points, respectively. The training errors in energy using feedforward neural networks achieved chemical accuracy (0.023–0.032 eV). The neural network photodynamics simulations of trans-hexafluoro-2-butene agree with the quantum chemical calculations showing the formation of the cis-product and reactive carbene intermediate. The neural network trajectories of the norbornyl cyclohexadiene corroborate the low-yielding syn-product, which was absent in the quantum chemical trajectories, and revealed subsequent thermal reactions in 1 ns.

Photochemical reactions are widely used by academia and industry to construct complex molecular architectures via mechanisms that are often inaccessible by other means.

8.
Mass spectrometry imaging (MSI) is widely used for the label-free molecular mapping of biological samples. The identification of co-localized molecules in MSI data is crucial to the understanding of biochemical pathways. One of the key challenges in molecular colocalization is that complex MSI data are too large for manual annotation but too small for training deep neural networks. Herein, we introduce a self-supervised clustering approach based on contrastive learning, which shows excellent performance in the clustering of MSI data. We train a deep convolutional neural network (CNN) using MSI data from a single experiment without manual annotations to effectively learn high-level spatial features from ion images and classify them based on molecular colocalizations. We demonstrate that contrastive learning generates ion image representations that form well-resolved clusters. Subsequent self-labeling is used to fine-tune both the CNN encoder and linear classifier based on confidently classified ion images. This new approach enables autonomous and high-throughput identification of co-localized species in MSI data, which will dramatically expand the application of spatial lipidomics, metabolomics, and proteomics in biological research.

Contrastive learning is used to train a deep convolutional neural network to identify high-level features in mass spectrometry imaging data. These features enable self-supervised clustering of ion images without manual annotation.

9.
Machine learning (ML) methods have been present in the field of NMR for decades, but they have experienced tremendous growth in the last few years, especially thanks to the emergence of deep learning (DL) techniques that take advantage of the increased amounts of data and available computing power. These algorithms are successfully employed for classification, regression, clustering, or dimensionality reduction tasks on large data sets and have been intensively applied in different areas of NMR including metabonomics, clinical diagnosis, or relaxometry. In this article, we concentrate on the various applications of ML/DL in the areas of NMR signal processing and analysis of small molecules, including automatic structure verification and prediction of NMR observables in solution.

10.
Reactions that occur too rapidly to be monitored by rapid reaction methods at temperatures at or close to ambient can be investigated kinetically by retarding their reaction rates at very low temperatures. A selection of reactions studied by this approach (low-temperature stopped-flow spectrophotometry) is reported. Details of the reaction mechanisms have been revealed for peroxide activation involving iron(III) porphyrins and cytochrome P450, superoxide activation involving manganese(II) complexes and iron porphyrin complexes, dioxygen activation and binding by model mono- and dinuclear copper(I) complexes, and dioxygen activation at mono- and dinuclear non-heme iron complexes. A final section covers progress in unravelling the mechanism of carbon–hydrogen bond activation by platinum complexes.

11.
We have performed a large‐scale evaluation of current computational methods, including conventional small‐molecule force fields; semiempirical, density functional, ab initio electronic structure methods; and current machine learning (ML) techniques to evaluate relative single‐point energies. Using up to 10 local minima geometries across ~700 molecules, each optimized by B3LYP‐D3BJ with single‐point DLPNO‐CCSD(T) triple‐zeta energies, we consider over 6500 single points to compare the correlation between different methods for both relative energies and ordered rankings of minima. We find that the current ML methods have potential and recommend methods at each tier of the accuracy‐time tradeoff, particularly the recent GFN2 semiempirical method, the B97‐3c density functional approximation, and RI‐MP2 for accurate conformer energies. The ANI family of ML methods shows promise, particularly the ANI‐1ccx variant trained in part on coupled‐cluster energies. Multiple methods suggest continued improvements should be expected in both performance and accuracy.

12.
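Compared with traditional non-steroidal anti-inflammatory drugs, selective cyclooxygenase-2 (COX-2) inhibitors do not cause serious side effects such as gastrointestinal mucosal damage, ulcers, and renal dysfunction, so the design of selective COX-2 inhibitors is of great importance. In this work, two machine learning methods, support vector machines (SVM) and neural networks, were used to build activity prediction models for selective COX-2 inhibitors, with the aim of providing lead compounds for the synthesis of selective COX-2 inhibitor drugs. A set of 467 COX-2 inhibitors was divided into a training set, a validation set, and an independent test set using the Kennard-Stone method. For each inhibitor molecule, 463 molecular descriptors, comprising constitutional and topological descriptors, were calculated to characterize its molecular structure, and the most important descriptors were selected by the F-score method for building the classification models. The results show that the SVM method has good predictive ability after variable selection, with a prediction accuracy of 93.30%.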

13.
Machine learning (ML) methods have great potential to transform chemical discovery by accelerating the exploration of chemical space and drawing scientific insights from data. However, modern chemical reaction ML models, such as those based on graph neural networks (GNNs), must be trained on a large amount of labelled data in order to avoid overfitting the data and thus possessing low accuracy and transferability. In this work, we propose a strategy to leverage unlabelled data to learn accurate ML models for small labelled chemical reaction data. We focus on an old and prominent problem—classifying reactions into distinct families—and build a GNN model for this task. We first pretrain the model on unlabelled reaction data using unsupervised contrastive learning and then fine-tune it on a small number of labelled reactions. The contrastive pretraining learns by making the representations of two augmented versions of a reaction similar to each other but distinct from other reactions. We propose chemically consistent reaction augmentation methods that protect the reaction center and find they are the key for the model to extract relevant information from unlabelled data to aid the reaction classification task. The transfer learned model outperforms a supervised model trained from scratch by a large margin. Further, it consistently performs better than models based on traditional rule-driven reaction fingerprints, which have long been the default choice for small datasets, as well as those based on reaction fingerprints derived from masked language modelling. In addition to reaction classification, the effectiveness of the strategy is tested on regression datasets; the learned GNN-based reaction fingerprints can also be used to navigate the chemical reaction space, which we demonstrate by querying for similar reactions. 
The strategy can be readily applied to other predictive reaction problems to uncover the power of unlabelled data for learning better models with a limited supply of labels.

Contrastive pretraining of chemical reactions by matching augmented reaction representations to improve machine learning performance on small reaction datasets.
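The contrastive objective described in this abstract (two augmented views of the same reaction should be similar, and distinct from other reactions) can be illustrated with a generic InfoNCE-style loss. The abstract does not state the exact loss, temperature, or embedding model, so the function names, the cosine similarity, the `temperature` value, and the toy two-dimensional embeddings below are all assumptions for illustration rather than the authors' method.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce_loss(view_a, view_b, temperature=0.5):
    """Mean InfoNCE-style loss over a batch.

    view_a[i] and view_b[i] are embeddings of two augmentations of reaction i
    (the positive pair); view_b[j] for j != i act as negatives for view_a[i].
    """
    n = len(view_a)
    total = 0.0
    for i in range(n):
        logits = [cosine(view_a[i], view_b[j]) / temperature for j in range(n)]
        # Numerically stable log-sum-exp over all candidates in the batch.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(x - peak) for x in logits))
        # Negative log-probability assigned to the true partner of view_a[i].
        total += log_denominator - logits[i]
    return total / n


# Toy embeddings: matched views give a low loss, mismatched views a high one.
anchors = [[1.0, 0.0], [0.0, 1.0]]
matched = [[1.0, 0.0], [0.0, 1.0]]
mismatched = [[0.0, 1.0], [1.0, 0.0]]
print(info_nce_loss(anchors, matched), info_nce_loss(anchors, mismatched))
```

In an actual pretraining loop the embeddings would come from the GNN encoder and the loss would be minimised by gradient descent; the sketch only shows what quantity is being minimised.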

14.
The rapid detection of microparticles has a broad range of applications in science and technology. The proposed method differentiates and identifies 2 μm and 5 μm particles using laser light scattering. The detection method is based on measuring forward light scattering from the particles and then classifying the acquired data using support vector machines. The device is composed of a microfluidic chip linked with photosensors and a laser device using optical fiber....

15.
The interpretation of two-dimensional gel electrophoresis (2-DGE) profiles can be facilitated by artificial intelligence and machine learning programs. We have incorporated into our 2-DGE computer analysis system (termed MELANIE-Medical Electrophoresis Analysis Interactive Expert system) a program which automatically classifies 2-DGE patterns using heuristic clustering analysis. This program is a step toward machine learning. In this publication, we describe the classification method and the preliminary results obtained with liver biopsy electrophoretograms. Heuristic clustering is also compared to other classification techniques.

16.
17.
In the nitroaldol reaction, condensation between a nitroalkane and an aldehyde yields a nitroalcohol that can undergo dehydration to yield a nitroalkene. Amine-functionalized, MCM-41-type mesoporous silica nanosphere (MSN) materials have been shown to selectively catalyze this reaction. Gas-phase reaction paths for the several competing mechanisms for the nitroaldol reaction have been mapped out using second-order perturbation theory (MP2). Improved relative energies were determined using singles and doubles coupled cluster theory with perturbative triples, CCSD(T). The mechanism in the absence of a catalyst was used to provide a baseline against which to assess the impact of the catalyst on both the mechanism and the related energetics. Catalyzed mechanisms can either pass through a nitroalcohol intermediate as in the classical mechanism or an imine intermediate.

18.
Membrane transporters are expressed in various bodily tissues and play essential roles in the homeostasis of endogenous substances and the absorption, distribution and/or excretion of xenobiotics. For transporter assays, radioisotope‐labeled compounds have been mainly used. However, commercially available radioisotope‐labeled compounds are limited in number and relatively expensive. Chromatographic analyses such as high‐performance liquid chromatography with ultraviolet absorptiometry and liquid chromatography with tandem mass spectrometry have also been applied for transport assays. To elucidate the transport properties of endogenous substrates, although there is no difficulty in performing assays using radioisotope‐labeled probes, the endogenous background and the metabolism of the compound after its translocation across cell membranes must be considered when the intact compound is assayed. In this review, the current state of knowledge about the transport of endogenous substrates via membrane transporters as determined by chromatographic techniques is summarized. Chromatographic techniques have contributed to our understanding of the transport of endogenous substances including amino acids, catecholamines, bile acids, prostanoids and uremic toxins via membrane transporters.

19.
Heteronuclear NMR spectroscopy provides a unique way to obtain site-specific information about protein-ligand interactions. Usually, such studies rely on the availability of isotopically labeled proteins, thereby allowing both editing of the spectra and ligand signals to be filtered out. Herein, we report that the use of the methyl SOFAST correlation experiment enables the determination of site-specific equilibrium binding constants by using unlabeled proteins. By using the binding of L- and D-tryptophan to serum albumin as a test case, we determined very accurate dissociation constants for both the high- and low-affinity sites present at the protein surface. The values of site-specific dissociation constants were closer to those obtained by isothermal titration calorimetry than those obtained from ligand-observed methods, such as saturation transfer difference. The possibility of measuring ligand binding to serum albumin at physiological concentrations with unlabeled proteins may open up new perspectives in the field of drug discovery.

20.
A programme, 'SIMKIN', is developed using a semi-implicit extrapolation method (SIEM), which uses the implicit midpoint rule and extrapolation to simulate complex mechanisms based on the kinetics of homogeneous chemical systems. The chemical kinetics pre-processor code is designed to translate a user-specified system of chemical rate equations into a system of chemical kinetic differential equations. The developed programme is applied to the 13-step mechanism of the reaction between Nile blue and acidic bromate. The results obtained compare well with the curves obtained by another method reported in the literature.
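To give a feel for the implicit midpoint rule that this abstract names as a building block, here is a minimal sketch on a hypothetical consecutive scheme A → B → C with first-order rate constants `k1` and `k2`. It is not the programme described above: there is no extrapolation step, no pre-processor, and the mechanism and parameter values are assumptions chosen so that each implicit update can be solved in closed form.

```python
import math


def simulate_a_to_b_to_c(k1, k2, a0, h, n_steps):
    """Integrate dA/dt = -k1*A, dB/dt = k1*A - k2*B, dC/dt = k2*B.

    Uses the implicit midpoint rule y_new = y + h * f((y + y_new) / 2).
    Because this scheme is linear and triangular, each update is solved
    exactly for the new value instead of by iteration.
    Returns the concentrations (A, B, C) after n_steps steps of size h.
    """
    a, b, c = a0, 0.0, 0.0
    for _ in range(n_steps):
        # A: a_new = a - h*k1*(a + a_new)/2, rearranged for a_new.
        a_new = a * (1.0 - 0.5 * h * k1) / (1.0 + 0.5 * h * k1)
        a_mid = 0.5 * (a + a_new)
        # B: b_new = b + h*(k1*a_mid - k2*(b + b_new)/2), rearranged for b_new.
        b_new = (b * (1.0 - 0.5 * h * k2) + h * k1 * a_mid) / (1.0 + 0.5 * h * k2)
        b_mid = 0.5 * (b + b_new)
        # C only accumulates what leaves B over the step.
        c_new = c + h * k2 * b_mid
        a, b, c = a_new, b_new, c_new
    return a, b, c


# Hypothetical parameters: k1 = 2.0, k2 = 1.0, A(0) = 1.0, integrated to t = 1.
final_a, final_b, final_c = simulate_a_to_b_to_c(2.0, 1.0, 1.0, 0.01, 100)
print(final_a, final_b, final_c, final_a + final_b + final_c)
```

For this linear scheme the analytical solution is known, so the numerical result can be checked against A(t) = exp(-k1 t); the update also conserves the total A + B + C up to rounding. A general nonlinear mechanism would need a Newton or linearised solve at each step instead of the closed-form rearrangement used here.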
