Similar literature
1.
An adsorption process was simulated in this study for the removal of Hg and Ni from water using nanocomposite materials. The nanostructured adsorbent was a combined MOF and layered double hydroxide, referred to as MOF-LDH in this work. The data were obtained from published sources, and different machine learning models were trained on them. We selected three regression models (elastic net, decision tree, and gradient boosting) to fit the small data set, which has two inputs and two outputs. The inputs are the ion type (Hg or Ni) and the initial ion concentration in the feed solution (C0); the outputs are the equilibrium concentration (Ce) and the equilibrium capacity of the adsorbent (Qe). After tuning their hyper-parameters, the final models were implemented and assessed using different metrics. In terms of the R² score, all models exceed 0.97 for Ce and 0.88 for Qe. Gradient boosting has an R² score of 0.994 for Qe. Considering RMSE and MAE as well, gradient boosting shows acceptable errors and is the best of the three models. Finally, the optimum for Ce found with the gradient boosting (GB) model is identical to the dataset optimum: (Ion = Ni, C0 = 250, Ce = 206.0). For Qe, however, it differs and equals (Ion = Hg, C0 = 121.12, Ce = 606.15). The results show that the developed simulation methods have a high capacity for predicting the adsorptive removal of heavy metals with nanostructured materials.
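
The three metrics named in this abstract (R², RMSE, MAE) can be computed as in the following minimal sketch. The function name and the sample numbers are illustrative assumptions and are not taken from the study.

```python
import math

def regression_metrics(y_true, y_pred):
    """Return (R^2, RMSE, MAE) for two equal-length sequences of numbers."""
    n = len(y_true)
    mean_true = sum(y_true) / n
    # Residual and total sums of squares.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n
    return r2, rmse, mae

# Hypothetical observed and predicted values, for illustration only.
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 39.0]
r2, rmse, mae = regression_metrics(observed, predicted)
```

R² is dimensionless, while RMSE and MAE carry the units of the target, which is why scores for Ce and Qe are usually reported separately.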

2.
Organic light-emitting diode (OLED) materials have found a wide range of applications. However, the further development and commercialization of OLEDs require higher-quality OLED materials, including materials with high thermal stability. Thermal stability is associated with the glass transition temperature (Tg) and the decomposition temperature (Td), but experimental determination of these two important properties is generally time-consuming and laborious. Thus, a quick and accurate prediction tool is highly desirable. Motivated by this challenge, we explored machine learning (ML) by constructing a new dataset of more than 1,000 samples collected from a wide range of literature, on which ensemble learning models were explored. Models trained with the LightGBM algorithm exhibited the best prediction performance, with a mean absolute error, root mean squared error, and R² of 17.15 K, 24.63 K, and 0.77 for Tg prediction and 24.91 K, 33.88 K, and 0.78 for Td prediction. The prediction performance and generalization of the ML models were further tested in two applications, which also gave satisfactory results. Experimental validation further demonstrated the reliability and practical potential of the ML-based models. To extend the practical application of the ML-based models, an online prediction platform was constructed. This platform includes the optimal prediction models and all the thermal stability data under study, and it is freely available at http://www.oledtppxmpugroup.com. We expect that this platform will become a useful tool for experimental investigation of Tg and Td, accelerating the design of OLED materials with desired properties.

3.
The protein disulfide bond is a covalent bond that forms during post-translational modification by the oxidation of a pair of cysteines. In proteins, the disulfide bond is the most frequent covalent link between amino acids after the peptide bond. It plays a significant role in three-dimensional (3D) ab initio protein structure prediction (aiPSP), stabilizing protein conformation, post-translational modification, and protein folding. In aiPSP, the location of disulfide bonds can strongly reduce the conformational search space by imposing geometrical constraints. Existing experimental techniques for the determination of disulfide bonds are time-consuming and expensive. Thus, developing sequence-based computational methods for disulfide bond prediction becomes indispensable. This study proposes a stacking-based machine learning approach for disulfide bond prediction (diSBPred). Various useful sequence- and structure-based features are extracted for effective training, including conservation profile, residue solvent accessibility, torsion angle flexibility, disorder probability, the sequential distance between cysteines, and more. The prediction of disulfide bonds is carried out in two stages: first, individual cysteines are predicted as either bonding or non-bonding; second, the cysteine pairs are predicted as either bonding or non-bonding by including the results from cysteine bonding prediction as a feature. A comparison of the features employed in this study with those used in the existing nearest neighbor algorithm (NNA) method shows that the features used here improve jackknife-validation balanced accuracy by about 7.39%. Moreover, for individual cysteine bonding prediction and cysteine-pair bonding prediction, diSBPred provides a 10-fold cross-validation balanced accuracy of 82.29% and 94.20%, respectively. Altogether, our predictor achieves an improvement of 43.25% in balanced accuracy compared to the existing NNA-based approach. Thus, diSBPred can be used to annotate the cysteine bonding residues of protein sequences whose structures are unknown as well as to improve the accuracy of the aiPSP method, which can further aid experimental studies of the disulfide bond and structure determination.
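
Balanced accuracy, the headline metric of this abstract, is the mean of sensitivity and specificity. The sketch below shows why it is preferred over plain accuracy when non-bonding examples dominate; the function name and labels are hypothetical and are not diSBPred outputs.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity for binary labels (1 = bonding)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    sensitivity = tp / (tp + fn)  # recall on the positive class
    specificity = tn / (tn + fp)  # recall on the negative class
    return (sensitivity + specificity) / 2

# Hypothetical imbalanced example: 2 bonding and 8 non-bonding cysteines.
labels = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
preds = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
score = balanced_accuracy(labels, preds)
plain_accuracy = sum(t == p for t, p in zip(labels, preds)) / len(labels)
```

Here nine of ten predictions are correct, yet half of the bonding cysteines are missed, and the balanced score reflects that.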

4.
5.
陈乐添  张旭  陈安  姚赛  胡绪  周震 《催化学报》2022,43(1):11-32
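As the conflict between growing energy demand and the depletion of fossil-fuel resources becomes increasingly prominent, and as the burning of non-renewable resources such as oil and natural gas brings environmental problems and global warming, clean renewable energy is receiving more and more attention. Sustainable technologies, including energy conversion and reversible energy use, have therefore attracted wide interest. Among them, electrocatalysis is regarded as an important route to clean energy conversion. At present, the catalysts for electrocatalytic reactions are still dominated by noble metals. However, the high price of noble metals greatly...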

6.
In this work, we developed artificial intelligence-based models for the prediction and correlation of CO2 solubility in amino acid solutions for the purpose of CO2 capture. The models were used to correlate the process parameters with the CO2 loading in the solvent; CO2 loading (solubility) in the solvent was the sole model output. The solvents studied were potassium- and sodium-based amino acid salt solutions. For the predictions, we tried three candidate models: Multi-layer Perceptron (MLP), Decision Tree (DT), and AdaBoost-DT. To find the best hyperparameters for each model, we ran the method multiple times. After optimization, the R² scores of all three models reached 0.9 or higher, confirming good predictive capability. AdaBoost-DT achieved the highest R² score, 0.998. With an R² of 0.98, Decision Tree was the second most accurate, followed by MLP with an R² of 0.9.

7.
This study unites six popular machine learning approaches to enhance the prediction of molecular binding affinity between receptors (large protein molecules) and ligands (small organic molecules). Here we examine a scheme in which the affinity of ligands is predicted against a single receptor, human thrombin; thus, the models consider ligand features only. However, the suggested approach can be repurposed for other receptors. The methods are Support Vector Machine, Random Forest, CatBoost, feed-forward neural network, graph neural network, and Bidirectional Encoder Representations from Transformers. The first five methods use input features based on physico-chemical properties of molecules, while the last one is based on textual molecular representations. None of the approaches relies on atomic spatial coordinates, which avoids a potential bias from known structures and allows generalization to compounds with unknown conformations. Within each method, we trained two models, one for a classification task and one for a regression task. All models are then grouped into a pipeline of two successive ensembles. The first ensemble aggregates the six classification models, which vote on whether a ligand binds to the receptor. If a ligand is classified as active (i.e., it binds), the second ensemble predicts its binding affinity in terms of the inhibition constant Ki.
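
The two-stage pipeline described in this abstract can be sketched as follows. The abstract does not say how votes or regression outputs are aggregated, so strict majority voting and a simple mean are assumptions here, and the toy threshold classifiers and the `logp` feature are hypothetical stand-ins for the six trained models.

```python
from statistics import mean

def two_stage_predict(features, classifiers, regressors):
    """Stage 1: majority vote on activity. Stage 2: mean predicted affinity."""
    votes = [bool(clf(features)) for clf in classifiers]
    # Assumption: a strict majority is needed, so a tie counts as inactive.
    is_active = sum(votes) > len(votes) / 2
    if not is_active:
        return False, None
    return True, mean(reg(features) for reg in regressors)

def make_threshold_classifier(threshold):
    """Build a toy classifier that votes 'active' above a feature threshold."""
    def classify(features):
        return features["logp"] > threshold
    return classify

# Hypothetical stand-ins for trained models.
classifiers = [make_threshold_classifier(t) for t in (1.0, 2.0, 3.0)]
regressors = [lambda f: 2.0 * f["logp"], lambda f: f["logp"] + 1.0]

active_result = two_stage_predict({"logp": 2.5}, classifiers, regressors)
inactive_result = two_stage_predict({"logp": 0.5}, classifiers, regressors)
```

Gating the regressors behind the vote means an affinity is only reported for ligands the classifiers consider binders.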

8.
9.
ANFIS (adaptive neuro-fuzzy inference system) modeling of CO2 capture using a chemical absorbent was carried out in this study to correlate the solubility of CO2 with the solvent and operational parameters. In the ANFIS model, the input parameters were temperature, pressure, and the physicochemical properties of the solvent, while the loading of CO2 in the absorbent was the sole target output to be predicted by the model. We thus developed a machine learning-based model for predicting the CO2 loading capacity of amino acid salt solutions as the chemical absorbent of carbon dioxide. This model uses a metaheuristic-optimized ANFIS based on a wide range of amino acids. The novel part of this study is the use of the Differential Evolution (DE) and Firefly Algorithm (FA) metaheuristics to solve hyper-parameter tuning of ANFIS as an optimization problem based on differential evolution. The optimized ANFIS model has an R² score of 0.9520 on the test data and 0.9841 on the training data. This indicates that the proposed model is both general and accurate in its predictions of CO2 loading in amino acid salt solutions. The MAPE and RMSE error rates are also 1.17E-01, respectively, while the MAPE error rate is 1.14E-01.

10.
An artificial intelligence-based predictive model was developed using a support vector machine to investigate the solubility of the drug Busulfan in supercritical carbon dioxide. The data for the simulations were collected from the literature. The model was trained and implemented to determine the correlation between the solubility values and the input parameters, namely temperature and pressure. These parameters were used as inputs because they are known to have a significant effect on the solubility of Busulfan in supercritical carbon dioxide. In the artificial intelligence model, a polynomial model with a kernel function was applied to the data, and the model's findings were compared with measured data for fitting. Good agreement was observed between the model's outputs and the measured data, with a coefficient of determination greater than 0.99.
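
Assuming that the "polynomial model with kernel function" refers to the standard polynomial kernel used with support vector machines, k(x, z) = (γ·x·z + c)^d, a minimal sketch looks like this. The parameter values and the two-component input vectors are hypothetical and are not taken from the study.

```python
def polynomial_kernel(x, z, degree=2, gamma=1.0, coef0=1.0):
    """Polynomial kernel k(x, z) = (gamma * <x, z> + coef0) ** degree."""
    dot = sum(a * b for a, b in zip(x, z))
    return (gamma * dot + coef0) ** degree

# Hypothetical scaled (temperature, pressure) pairs.
k_value = polynomial_kernel([1.0, 2.0], [3.0, 4.0])
```

In practice the inputs would be scaled first, since temperature and pressure differ by orders of magnitude and the dot product would otherwise be dominated by one of them.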

11.
In this work, a systematic method to support the building of bioprocess models through the use of different optimization techniques is presented. The method was applied to a tower bioreactor for bioethanol production with immobilized cells of Saccharomyces cerevisiae. Specifically, a step-by-step procedure for the parameter estimation problem is proposed. As the first step, the global search capability of a real-coded genetic algorithm (RGA) was used for simultaneous estimation of the parameters. Subsequently, the most significant parameters were identified using the Plackett–Burman (PB) design. Finally, the quasi-Newton algorithm (QN) was used to optimize the most significant parameters near the global optimum region, since the initial values had already been determined by the RGA global search. The results show that the fit of a detailed deterministic model to the experimental data is improved by the proposed method (RGA–PB–QN) in comparison with a model whose parameters were optimized by RGA alone.

12.
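The complexity of the chemical sciences and their massive volumes of data create an opening for artificial intelligence. Artificial intelligence, machine learning, and deep learning identify new compounds from massive data, build new models, and propose new theories; they are changing the research paradigm for the discovery, transformation, and functional study of chemical substances and are helping to solve major problems. This paper reviews important international progress in recent years in applying artificial intelligence to chemical research and analyzes the main development trends of AI-driven chemistry. By supporting the mining of massive chemical data, making chemistry laboratories intelligent and automated, and strengthening the ability of computational chemistry to solve practical problems, artificial intelligence is driving chemistry forward by leaps.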

13.
With the application of machine learning to large materials data sets, models are being developed that allow us to better predict novel materials with designed properties. Advances in artificial intelligence and its subclasses, as well as in compute infrastructure, are making it possible to rapidly compute material properties, to access time/length scales and chemical spaces beyond the current capabilities of density functional theory, and to outperform humans in the interpretation and characterization of data. This review highlights the latest developments in the field, with special attention to energy storage materials.

14.
Multi-instance multi-label (MIML) learning has proven effective for genome-wide protein function prediction problems, where each training example is associated with not only multiple instances but also multiple class labels. To find an appropriate MIML learning method for genome-wide protein function prediction, many studies in the literature have attempted to optimize objective functions in which the dissimilarity between instances is measured using the Euclidean distance. In many real applications, however, the Euclidean distance may be unable to capture the intrinsic similarity or dissimilarity in feature space and label space. Unlike previous approaches, in this paper we propose a multi-instance multi-label distance metric learning framework (MIMLDML) for genome-wide protein function prediction. Specifically, we learn a Mahalanobis distance to preserve and utilize the intrinsic geometric information of both feature space and label space for MIML learning. In addition, we deal with sparsely labeled data by giving weight to the labeled data. Extensive experiments on seven real-world organisms covering the biological three-domain system (i.e., archaea, bacteria, and eukaryote; Woese et al., 1990) show that the MIMLDML algorithm is superior to most state-of-the-art MIML learning algorithms.
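
A Mahalanobis distance of the kind mentioned in this abstract has the form d_M(x, y) = sqrt((x − y)ᵀ M (x − y)) for a positive semidefinite matrix M. The sketch below only evaluates the distance for a given M; learning M from data, as MIMLDML does, is not shown, and the matrices are hypothetical.

```python
import math

def mahalanobis(x, y, M):
    """Distance sqrt((x - y)^T M (x - y)) for a positive semidefinite matrix M."""
    diff = [a - b for a, b in zip(x, y)]
    size = len(diff)
    # Matrix-vector product M @ diff.
    m_diff = [sum(M[i][j] * diff[j] for j in range(size)) for i in range(size)]
    return math.sqrt(sum(d * m for d, m in zip(diff, m_diff)))

identity = [[1.0, 0.0], [0.0, 1.0]]
stretched = [[4.0, 0.0], [0.0, 0.25]]  # hypothetical learned metric

euclidean = mahalanobis([1.0, 2.0], [4.0, 6.0], identity)
reweighted = mahalanobis([1.0, 2.0], [4.0, 6.0], stretched)
```

With the identity matrix the function reduces to the Euclidean distance; a learned M rescales and rotates the space so that distances better match label similarity.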

15.
A DNA microarray can track the expression levels of thousands of genes simultaneously. Previous research has demonstrated that this technology can be useful in the classification of cancers. Cancer microarray data normally contain a small number of samples, each with a large number of gene expression levels as features. Selecting the relevant genes involved in different types of cancer remains a challenge. In order to extract useful gene information from cancer microarray data and reduce dimensionality, feature selection algorithms were systematically investigated in this study. Using a correlation-based feature selector combined with machine learning algorithms such as decision trees, naive Bayes, and support vector machines, we show that classification performance at least as good as published results can be obtained on acute leukemia and diffuse large B-cell lymphoma microarray data sets. We also demonstrate that a combined use of different classification and feature selection approaches makes it possible to select relevant genes with high confidence. This is also the first paper to discuss both computational and biological evidence for the involvement of zyxin in leukaemogenesis.

16.
Nowadays, cancer is considered a global pandemic, and millions of people die every year because this disease remains a challenge for the world scientific community. Even with the efforts made to combat it, there is a growing need to discover and design new drugs and vaccines. Among these alternatives, antitumor peptides are a promising therapeutic solution to reduce the number of deaths caused by cancer. In the present study, we developed TTAgP, an accurate bioinformatic tool that uses the random forest algorithm to predict antitumor peptides presented in the context of MHC class I. The predictive model of TTAgP was trained and validated on several features of 922 peptides. During model validation we achieved a sensitivity of 0.89, a specificity of 0.92, an accuracy of 0.90, and a Matthews correlation coefficient of 0.79, which are indicative of a robust model. TTAgP is fast, accurate, and intuitive software focused on the prediction of tumor T cell antigens.
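
The four validation measures quoted in this abstract all derive from a binary confusion matrix. The sketch below uses hypothetical counts, since the abstract does not report the underlying confusion matrix, and the function name is illustrative.

```python
import math

def classification_report(tp, fn, tn, fp):
    """Sensitivity, specificity, accuracy, and MCC from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    # Matthews correlation coefficient: ranges from -1 to 1, 0 means chance level.
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denominator
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }

# Hypothetical counts, not the TTAgP validation results.
report = classification_report(tp=90, fn=10, tn=80, fp=20)
```

MCC uses all four cells of the matrix, so it stays informative when the two classes differ in size.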

17.
18.
A high-performance liquid chromatography (HPLC) system was used to determine the antioxidants tert-butyl-hydroquinone (TBHQ), tert-butylhydroxyanisole (BHA), and 3,5-di-tert-butylhydroxytoluene (BHT) simultaneously in oils. The paper presents a new methodology for the optimized separation of antioxidants in oils based on the coupling of experimental design and artificial neural networks. An orthogonal design and artificial neural networks with the extended delta-bar-delta (EDBD) learning algorithm were employed to design the experiments and optimize the variables. The response function (Rf) was a weighted linear combination of two variables related to separation efficiency and retention time, from which the optimized conditions were obtained. The above-mentioned antioxidants in rapeseed oils were separated and determined simultaneously under the optimized conditions by HPLC with UV detection at 280 nm. Linearity was obtained over the range of 10-200 µg/mL, with recoveries of 98.3% (TBHQ), 98.1% (BHT), and 96.2% (BHA).

19.
Journal of the Indian Chemical Society, 2023, 100(1): 100815
The right combination of surfactants and stabilizers in a detergent formulation plays a significant role in its cleaning performance. However, finding it becomes a complex optimization problem when the formulation is composed of multiple ingredients and has to be optimized for competing performance metrics. In recent times, machine learning techniques have been used extensively to study such processes. In this research, a detergent pre-formulation was designed using an aqueous solution of Tween-20, ethanol, and 1-octanol. To determine the optimal amounts of the ingredients, supervised machine learning models were developed and optimized for the Ross Miles Index 30 ml (RMI30) and cleaning time (CT). A full factorial experimental design was performed, and three regression models based on linear, 2FI, and quadratic designs were developed for each of RMI30 and CT. ANOVA of the trained models reported an optimal p-value of 0.0018 for RMI30 and less than 0.0001 for CT. The optimal values for RMI30 and CT obtained through the regression models are 72.32 ml and 17.67 s. For multi-objective optimization, grey relational analysis was performed. Two pairs of optimal (RMI30, CT) values corresponding to Rank 1 were recorded: 88.9 ml, 20 s and 81.2 ml, 14 s. As a result, the optimal combination of Tween-20, ethanol, and 1-octanol for maximizing RMI30 and minimizing CT is reported. The optimal values were validated experimentally.
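
Grey relational analysis, used in this abstract for the multi-objective step, can be sketched as below. This follows one common textbook formulation (min-max normalization, distinguishing coefficient ζ = 0.5, equal weights); the abstract does not state which variant or weights the authors used, and the three runs are hypothetical.

```python
def grey_relational_grades(rows, larger_is_better, zeta=0.5):
    """Grey relational grade per run; rows hold one tuple of responses per run."""
    deltas = []
    for column, larger in zip(zip(*rows), larger_is_better):
        low, high = min(column), max(column)  # assumes high > low in each column
        if larger:
            normalized = [(v - low) / (high - low) for v in column]
        else:
            normalized = [(high - v) / (high - low) for v in column]
        # Deviation of each run from the ideal normalized value of 1.
        deltas.append([1.0 - v for v in normalized])
    flat = [d for column in deltas for d in column]
    d_min, d_max = min(flat), max(flat)
    coefficients = [
        [(d_min + zeta * d_max) / (d + zeta * d_max) for d in column]
        for column in deltas
    ]
    # Equal-weight average of the coefficients across responses.
    return [sum(run) / len(run) for run in zip(*coefficients)]

# Hypothetical (RMI30 in ml, CT in s) results for three runs.
runs = [(60.0, 30.0), (80.0, 20.0), (75.0, 10.0)]
grades = grey_relational_grades(runs, larger_is_better=[True, False])
best_run = grades.index(max(grades))
```

The run with the highest grade is ranked first; here the third run wins because its short cleaning time outweighs its slightly lower foam index.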

20.
In patients with depression, the use of 5-HT reuptake inhibitors can improve the condition. Machine learning methods can be used in ligand-based activity prediction. In order to predict SERT inhibitors, SERT inhibitor data from the ChEMBL database were screened and pre-processed. Then 4 machine learning methods (LR, SVM, RF, and KNN) and 4 molecular fingerprints (CDK, Graph, MACCS, and PubChem) were used to build 16 prediction models. The five models with the highest accuracy (Q) in cross-validation on the training set were used to build three different ensemble learning models. On the test1 set, the VOT_CLF3 model had the largest SP (0.871), Q (0.869), AUC (0.919), and MCC (0.728). On the unbalanced test2 set, VOT_CLF3 had the largest SE (0.857), SP (0.867), Q (0.865), and MCC (0.639). VOT_CLF3 is recommended for the virtual screening of SERT inhibitors. In addition, 12 molecular structural alerts that frequently appear in SERT inhibitors were found (P < 0.05), which provide an important reference for the design of SERT inhibitors.
