Similar literature
Found 20 similar articles; search took 20 ms.
1.
Molecular latent representations, derived from autoencoders (AEs), have been widely used for drug or material discovery over the past couple of years. In particular, a variety of machine learning methods based on latent representations have shown excellent performance on quantitative structure–activity relationship (QSAR) modeling. However, the sequential nature of these representations has not been considered in most cases. In addition, data scarcity is still the main obstacle for deep learning strategies, especially for bioactivity datasets. In this study, we propose the convolutional recurrent neural network and transfer learning (CRNNTL) method, inspired by applications in polyphonic sound detection and electrocardiogram classification. Our model takes advantage of both convolutional and recurrent neural networks for feature extraction, as well as a data augmentation method. According to QSAR modeling on 27 datasets, CRNNTL can outperform or compete with state-of-the-art methods on both drug and material properties. In addition, the results on one isomer-based dataset indicate that its excellent performance stems from an improved ability to extract global features while the ability to extract local features is maintained. The transfer learning results then show that CRNNTL can overcome data scarcity when relevant source datasets are chosen. Finally, the high versatility of our model is shown by using latent representations from other types of AEs as inputs.

2.
Building a QSAR model of a new biological target for which few screening data are available is a statistical challenge. However, the new target may be part of a bigger family, for which we have more screening data. Collaborative filtering or, more generally, multi-task learning, is a machine learning approach that improves the generalization performance of an algorithm by using information from related tasks as an inductive bias. We use collaborative filtering techniques for building predictive models that link multiple targets to multiple examples. The more commonalities between the targets, the better the multi-target model that can be built. We show an example of a multi-target neural network that can use family information to produce a predictive model of an undersampled target. We also evaluate JRank, a kernel-based method designed for collaborative filtering. We show the performance of both methods on compound prioritization for an HTS campaign and the underlying shared representation between targets. JRank outperformed the neural network in both the single- and multi-target models.

3.
P-glycoprotein (P-gp), an ATP-binding cassette (ABC) transporter, functions as a biological barrier by extruding cytotoxic agents out of cells, which creates an obstacle in the chemotherapeutic treatment of cancer. To aid in the development of potential P-gp inhibitors, we constructed a quantitative structure–activity relationship (QSAR) model of flavonoids as P-gp inhibitors based on a Bayesian-regularized neural network (BRNN). A dataset of 57 flavonoids reported in the literature to bind to the C-terminal nucleotide-binding domain of mouse P-gp was compiled. The predictive ability of the model was assessed using a test set that was independent of the training set, which showed a standard error of prediction of 0.146 ± 0.006 (data scaled from 0 to 1). Two other mathematical tools, a back-propagation neural network (BPNN) and partial least squares (PLS), were also used to build QSAR models. The BRNN provided slightly better results for the test set than the BPNN, but the difference was not significant according to the F-statistic at p = 0.05. PLS failed to build a reliable model in the present study. Our study indicates that the BRNN-based in silico model has good potential for facilitating the prediction of flavonoid P-gp inhibitors and might be applied in further drug design.

4.
Neural networks and deep learning have been successfully applied to tackle problems in drug discovery with increasing accuracy over time. There are still many challenges and opportunities to improve the accuracy of molecular property predictions even further. Here, we propose a deep-learning architecture, namely the Bidirectional long short-term memory with Channel and Spatial Attention network (BCSA), whose training process is fully data-driven and end to end. It is based on data augmentation and SMILES tokenization without relying on auxiliary knowledge, such as complex spatial structure. In addition, our model takes advantage of the long short-term memory (LSTM) network in sequence processing. The embedded channel and spatial attention modules in turn specifically identify the prime factors in the SMILES sequence for predicting properties. The model was further improved by Bayesian optimization. In this work, we demonstrate that the trained BCSA model is capable of predicting aqueous solubility. Furthermore, our proposed method shows noticeable superiority and competitiveness in predicting the oil–water partition coefficient when compared with state-of-the-art graph-based models, including the graph convolutional network (GCN), message-passing neural network (MPNN), and AttentiveFP.

5.
Neural networks are rapidly gaining popularity in chemical modeling and Quantitative Structure–Activity Relationship (QSAR) studies thanks to their ability to handle multitask problems. However, the outcomes of neural networks depend on the tuning of several hyperparameters, small variations in which can strongly affect performance. Hence, optimization is a fundamental step in training neural networks, although in many cases it can be computationally very expensive. In this study, we compared four of the most widely used approaches for tuning hyperparameters, namely grid search, random search, the tree-structured Parzen estimator, and genetic algorithms, on three multitask QSAR datasets. We focused on parsimonious optimization, taking into account not only the performance of the neural networks but also the computational time. Furthermore, since the optimization approaches do not directly provide information about the influence of hyperparameters, we applied experimental design strategies to determine their effects on neural network performance. We found that genetic algorithms, the tree-structured Parzen estimator, and random search require on average 0.08% of the hours required by grid search; in addition, the tree-structured Parzen estimator and genetic algorithms provide better results than random search.
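
As a rough illustration of why grid search costs more than random search, the sketch below compares the two on a toy problem. It is not the study's code: `validation_error` is a hypothetical stand-in for training and scoring a network, and the hyperparameter names, grid values, and budget are assumptions chosen for the example. The tree-structured Parzen estimator and genetic algorithms are not shown.

```python
import itertools
import random


def validation_error(learning_rate, hidden_units):
    # Hypothetical stand-in for "train a network, return its validation error".
    # By construction the minimum is at learning_rate=0.01, hidden_units=64.
    return (learning_rate - 0.01) ** 2 * 1e4 + ((hidden_units - 64) / 64) ** 2


def grid_search(grid):
    # Evaluate every combination in the Cartesian product of the grid.
    names = list(grid)
    best = None
    evaluations = 0
    for values in itertools.product(*(grid[n] for n in names)):
        params = dict(zip(names, values))
        score = validation_error(**params)
        evaluations += 1
        if best is None or score < best[0]:
            best = (score, params)
    return best, evaluations


def random_search(grid, budget, seed=0):
    # Evaluate only `budget` randomly drawn combinations.
    rng = random.Random(seed)
    best = None
    for _ in range(budget):
        params = {n: rng.choice(v) for n, v in grid.items()}
        score = validation_error(**params)
        if best is None or score < best[0]:
            best = (score, params)
    return best, budget


# Assumed search space for the illustration.
GRID = {
    "learning_rate": [0.0001, 0.001, 0.01, 0.1],
    "hidden_units": [16, 32, 64, 128, 256],
}

grid_best, grid_evals = grid_search(GRID)
rand_best, rand_evals = random_search(GRID, budget=8)
print("grid search:  ", grid_evals, "evaluations, best", grid_best)
print("random search:", rand_evals, "evaluations, best", rand_best)
```

Grid search always finds the best point on the grid but its cost grows multiplicatively with every added hyperparameter, whereas random search has a fixed budget and may miss that point.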

6.
The identification of small potent compounds that selectively bind to the target under consideration with high affinities is a critical step toward successful drug discovery. However, there is still a lack of efficient and accurate computational methods to predict compound selectivity properties. In this paper, we propose a set of machine learning methods for compound selectivity prediction. In particular, we propose a novel cascaded learning method and a multitask learning method. The cascaded method decomposes the selectivity prediction into two steps, with one model for each step, so as to effectively filter out nonselective compounds. The multitask method incorporates both activity and selectivity models into one multitask model so as to better differentiate compound selectivity properties. We conducted a comprehensive set of experiments and compared the results with those of other conventional selectivity prediction methods, and our results demonstrated that the cascaded and multitask methods significantly improve the selectivity prediction performance.

7.
8.
The aim of this study was to propose a QSAR modelling approach based on the combination of simple competitive learning (SCL) networks with radial basis function (RBF) neural networks for predicting the biological activity of chemical compounds. The proposed QSAR method consisted of two phases. In the first phase, an SCL network was applied to determine the centres of an RBF neural network. In the second phase, the RBF neural network was used to predict the biological activity of various phenols and Rho kinase (ROCK) inhibitors. The predictive ability of the proposed QSAR models was evaluated and compared with other QSAR models using external validation. The results showed that the proposed QSAR modelling approach performs better than other models in predicting the biological activity of chemical compounds. This indicates the efficiency of simple competitive learning networks in determining the centres of RBF neural networks.
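
The two phases described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the data are a synthetic one-dimensional sine curve, the number of centres, learning rate, and Gaussian width are arbitrary, and the output weights are fitted by ridge-regularised least squares, which the abstract does not specify.

```python
import math
import random


def scl_centres(samples, k, epochs=50, lr=0.1, seed=0):
    """Phase 1: simple competitive learning; only the nearest centre moves."""
    rng = random.Random(seed)
    centres = [list(s) for s in rng.sample(samples, k)]  # start from data points
    for _ in range(epochs):
        for x in samples:
            winner = min(centres, key=lambda c: math.dist(c, x))
            for d in range(len(x)):
                winner[d] += lr * (x[d] - winner[d])
    return centres


def rbf_features(x, centres, width):
    # Gaussian basis functions around each centre, plus a constant bias term.
    return [math.exp(-math.dist(x, c) ** 2 / (2 * width ** 2)) for c in centres] + [1.0]


def solve(a, b):
    """Solve a w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    w = [0.0] * n
    for i in reversed(range(n)):
        w[i] = (m[i][n] - sum(m[i][j] * w[j] for j in range(i + 1, n))) / m[i][i]
    return w


def fit_rbf(samples, targets, centres, width, ridge=1e-6):
    """Phase 2: fit output weights via (phi^T phi + ridge I) w = phi^T y."""
    phi = [rbf_features(x, centres, width) for x in samples]
    n = len(phi[0])
    a = [[sum(p[i] * p[j] for p in phi) + (ridge if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    b = [sum(p[i] * t for p, t in zip(phi, targets)) for i in range(n)]
    return solve(a, b)


def predict(x, centres, width, weights):
    return sum(w * f for w, f in zip(weights, rbf_features(x, centres, width)))


# Synthetic data (an assumption for the illustration): y = sin(2 pi x) on [0, 1].
samples = [[i / 20] for i in range(21)]
targets = [math.sin(2 * math.pi * x[0]) for x in samples]
WIDTH = 0.2

centres = scl_centres(samples, k=5)
weights = fit_rbf(samples, targets, centres, WIDTH)
mse = sum((predict(x, centres, WIDTH, weights) - t) ** 2
          for x, t in zip(samples, targets)) / len(samples)
baseline = sum(t ** 2 for t in targets) / len(targets)  # error of predicting 0
print("centres:", sorted(c[0] for c in centres))
print("training MSE:", mse, "baseline:", baseline)
```

Because each update moves a centre part of the way toward a data point, the centres stay inside the range of the data and settle where samples are dense, which is what makes them usable as RBF centres.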

9.
Drug repurposing identifies new clinical indications for existing drugs. It can be used to overcome common problems associated with cancers, such as heterogeneity and resistance to established therapies, by rapidly adapting known drugs for new treatments. In this study, we used a recommendation system learning model to prioritize candidate cancer drugs. We designed a drug–drug pathway functional similarity by integrating multiple genetic and epigenetic alterations such as gene expression, copy number variation (CNV), and DNA methylation. Compared with other similarities, such as those based on SMILES chemical structures and on drug targets in the protein–protein interaction network, our approach provided more interpretable models that capture drug response mechanisms. Furthermore, our approach achieves accuracy comparable to that of other learning models when evaluated on large public datasets (CCLE and GDSC). A case study of erlotinib and OSI-906 (linsitinib) indicated that they act synergistically to reduce the growth rate of tumors, suggesting an alternative targeted therapy option for patients. Taken together, our computational method characterized drug response from the viewpoint of multi-omics pathways and systematically predicted candidate cancer drugs with similar therapeutic effects.

10.
11.
Histone-modifying proteins have been identified as promising targets to treat several diseases including cancer and parasitic ailments. In silico methods have been incorporated within a variety of drug discovery programs to facilitate the identification and development of novel lead compounds. In this study, we explore the binding modes of a series of benzhydroxamate derivatives developed as inhibitors of Schistosoma mansoni histone deacetylase (smHDAC) using molecular docking and binding free energy (BFE) calculations. The developed docking protocol was able to correctly reproduce the experimentally established binding modes of resolved smHDAC8–inhibitor complexes. However, as has been reported in previous studies, the obtained docking scores correlate only weakly with the experimentally determined activity of the studied inhibitors. Thus, the obtained docking poses were refined and rescored using the Amber software. From the computed protein–inhibitor BFE, different quantitative structure–activity relationship (QSAR) models could be developed and validated using several cross-validation techniques. Some of the generated QSAR models with good correlation could explain up to ~73% of the variance in activity within the studied training set molecules. The best performing models were subsequently tested on an external test set of newly designed and synthesized analogs. In vitro testing showed a good correlation between the predicted and experimentally observed IC50 values. Thus, the generated models can be considered useful tools for the identification of novel smHDAC8 inhibitors.

12.
13.
14.
15.
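万金玉 (Wan Jinyu), 刘怡飞 (Liu Yifei). 《化学通报》 (Huaxue Tongbao), 2019, 82(10): 926-936
With the widespread use of organophosphorus compounds (OPs), they are being detected in a growing number of environmental media. Most OPs are toxic, but fast and effective means of predicting their toxicity are lacking. In this paper, molecular descriptors calculated with the E-Dragon software are combined with different QSAR models to predict the toxicity of 36 OPs. Backward elimination was used for descriptor selection, with root-mean-square error (RMSE) as the evaluation criterion, and 14 descriptors that contribute strongly to a linear-kernel support vector machine (SVM) model were identified. In cross-validation of the final SVM model, the correlation coefficient between calculated and actual values was 0.913 and the RMSE was 0.388; in external test validation, the mean relative error was 9.10%. In addition, multiple linear regression (MLR), artificial neural network (ANN), and partial least squares regression (PLS) models were used to predict the toxicity of the OPs; in cross-validation, the correlation coefficients between calculated and actual values for the three models were 0.878, 0.686, and 0.620, respectively, all lower than that of the SVM model. A linear-kernel SVM model is therefore an effective method for predicting the toxicity of OPs.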
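
The descriptor selection step described above, backward elimination scored by RMSE, can be sketched as below. This is an illustration under assumptions rather than the paper's procedure: a leave-one-out 1-nearest-neighbour regressor is swapped in for the linear-kernel SVM to keep the example dependency-free, the data are synthetic, and the stopping rule (stop when every removal raises the RMSE) is one common choice that the abstract does not confirm.

```python
import math
import random


def loo_rmse(rows, targets, features):
    """Leave-one-out RMSE of a 1-nearest-neighbour regressor on `features`.

    The 1-NN model is a stand-in for the SVM named in the abstract.
    """
    sq_err = 0.0
    for i, row in enumerate(rows):
        nearest = min(
            (j for j in range(len(rows)) if j != i),
            key=lambda j: sum((row[f] - rows[j][f]) ** 2 for f in features),
        )
        sq_err += (targets[i] - targets[nearest]) ** 2
    return math.sqrt(sq_err / len(rows))


def backward_elimination(rows, targets, features):
    """Start from all descriptors and drop one at a time while RMSE does not rise."""
    selected = list(features)
    best = loo_rmse(rows, targets, selected)
    while len(selected) > 1:
        # Score every subset obtained by dropping exactly one descriptor.
        trials = [
            (loo_rmse(rows, targets, [f for f in selected if f != drop]), drop)
            for drop in selected
        ]
        trial_rmse, drop = min(trials)
        if trial_rmse > best:
            break  # every possible removal makes the model worse
        selected.remove(drop)
        best = trial_rmse
    return selected, best


# Synthetic data (an assumption): 4 descriptors, only the first two matter.
rng = random.Random(0)
rows = [[rng.random() for _ in range(4)] for _ in range(30)]
targets = [2.0 * r[0] - r[1] for r in rows]
all_features = [0, 1, 2, 3]

start_rmse = loo_rmse(rows, targets, all_features)
kept, final_rmse = backward_elimination(rows, targets, all_features)
print("kept descriptors:", kept)
print("RMSE before:", start_rmse, "after:", final_rmse)
```

By construction the final RMSE is never higher than the starting one; which descriptors survive depends on the data and on the model used for scoring.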

16.
Drug-likeness prediction is important for the virtual screening of drug candidates. It is challenging because drug-likeness is presumably associated with the whole set of properties necessary to pass through clinical trials, and thus no definite data for regression are available. Recently, binary classification models based on graph neural networks have been proposed, but their performance depends strongly on the choice of the negative set for training. Here we propose a novel unsupervised learning model that requires only known drugs for training. We adopted a language model based on a recurrent neural network for unsupervised learning. Unlike such classification models, it showed relatively consistent performance across different datasets. In addition, the unsupervised learning model provides drug-likeness scores whose distributions are well separated, with mean values increasing across datasets composed of molecules at progressively later stages of the drug development process, whereas the classification model predicted a polarized distribution with two extreme values for all datasets, presumably due to overconfident prediction for unseen data. Thus, this new concept offers a pragmatic tool for drug-likeness scoring and can further be applied to other biochemical applications.

A new method for quantifying drug-likeness based on unsupervised learning. The method uses only drug molecules as the training set, without any non-drug-like molecules.

17.
Antioxidants are important for maintaining the appropriate balance between oxidizing and reducing species in the body and thus preventing oxidative stress. Many natural compounds are being screened for possible antioxidant activity. The mushroom pigment norbadione A, a pulvinic acid derivative, was found to show antioxidant activity; the same was found for other pulvinic acid derivatives and structurally related coumarins. Based on the results of in vitro studies performed on these compounds as part of this study, quantitative structure–activity relationship (QSAR) predictive models were constructed using multiple linear regression, counter-propagation artificial neural networks, and support vector regression (SVR). The models were developed in accordance with current QSAR guidelines, including the assessment of the models' applicability domains. A new approach for the graphical evaluation of the applicability domain for SVR models is suggested. The developed models show sufficient predictive ability for screening virtual libraries for new potential antioxidants.

18.
19.
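ω-Conotoxins are bioactive marine peptides composed of 24-31 amino acid residues. They act specifically on voltage-sensitive calcium channels (VGCCs) and can be developed directly into drugs or used as lead compounds for new drug development. In this work, a new amino acid residue structural descriptor, cscales, and a genetic partial least squares algorithm were applied to a quantitative structure–activity relationship (QSAR) study of ω-conotoxins. Virtual combinatorial peptide libraries of N-type and P/Q-type VGCC antagonists containing 2244 compounds were designed and constructed, and the libraries were then virtually screened by QSAR model prediction and by similarity searching. The results show that the QSAR models built for N-type and P/Q-type VGCC antagonists both have good predictive ability, with cross-validated correlation coefficients (CV-r2) above 0.89. Principal component analysis and cluster analysis indicate that the compounds in the virtual combinatorial peptide library have good structural diversity and dissimilarity. Virtual screening yielded 6 N-type and 19 P/Q-type calcium channel antagonists with high predicted activity, laying a theoretical foundation for subsequent synthesis and activity evaluation. The peptide QSAR prediction models and virtual screening strategy established here also provide a reference for QSAR studies and virtual screening of other peptide compounds.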

20.
The estrogen receptor α (ERα) is an important biological target mediating 17β-estradiol-driven breast cancer (BC) development. With the aim of developing innovative drugs against BC, wild-type and mutated ligand-ERα complexes were used as source data to build structure-based 3-D pharmacophore and 3-D QSAR models, which were then used as tools for the virtual screening of National Cancer Institute datasets and for hit-to-lead optimization. The procedure identified Brefeldin A (BFA) as a hit, which was then structurally optimized into twelve new derivatives whose anticancer activity was confirmed both in vitro and in vivo. As SERMs, the compounds showed picomolar to low nanomolar potencies against ERα and were then investigated as antiproliferative agents against BC cell lines, as stimulators of p53 expression, and as BC cell cycle arrest agents. The most active leads were finally profiled upon administration to female Wistar rats with pre-induced BC, after which 3DPQ-12, 3DPQ-3, 3DPQ-9, 3DPQ-4, 3DPQ-2, and 3DPQ-1 emerged as potential candidates for BC therapy.
