Similar literature
1.
The use of some unconventional non-linear modeling techniques, namely classification and regression trees and methods based on multivariate adaptive regression splines, was explored to model the blood-brain barrier (BBB) passage of drugs and drug-like molecules. The data set contains BBB passage values for 299 structurally and pharmacologically diverse drugs, originating from a structured knowledge-based database. Models were built using boosted regression trees (BRT) and multivariate adaptive regression splines (MARS), as well as their respective combinations with stepwise multiple linear regression (MLR) and partial least squares (PLS) regression in two-step approaches. The best models were obtained using combinations of MARS with either stepwise MLR or PLS. It could be concluded that combining a linear with a non-linear modeling technique results in some improved properties compared to the individual linear and non-linear models and that, when such a combination is appropriate, combinations using MARS as the non-linear technique should be preferred over those with BRT, due to some serious drawbacks of the BRT approaches.

2.
Feature selection is commonly used as a preprocessing step in machine learning to improve learning performance, lower computational complexity and facilitate model interpretation. This paper proposes the application of boosting feature selection to improve the classification performance of standard feature selection algorithms evaluated for the prediction of P-gp inhibitors and substrates. Two well-known classification algorithms, decision trees and support vector machines, were used to classify the chemical compounds. The experimental results showed better performance for boosting feature selection than for the standard feature selection algorithms, while maintaining the capability for feature reduction.

3.
4.
5.
6.
7.
Three-dimensional quantitative structure–activity relationship (3D-QSAR) modelling was conducted on a series of leucine-rich repeat kinase 2 (LRRK2) antagonists using CoMFA and CoMSIA methods. The data set, which consisted of 37 molecules, was divided into training and test subsets by using a hierarchical clustering method. Both CoMFA and CoMSIA models were derived using a training set on the basis of the common substructure-based alignment. The optimum PLS models built by CoMFA and CoMSIA provided satisfactory statistical results (q² = 0.589 and r² = 0.927 for CoMFA; q² = 0.473 and r² = 0.802 for CoMSIA). The external predictive ability of the models was evaluated by using seven compounds. Moreover, an external evaluation set with known experimental data was used to evaluate the external predictive ability of the proposed models. The statistical parameters indicated that CoMFA (after region focusing) has high predictive ability in comparison with standard CoMFA and CoMSIA models. Molecular docking was also performed on the most active compound to investigate the interactions between the most active inhibitor and the LRRK2 receptor. Based on the obtained results and CoMFA contour maps, some features were introduced to provide useful insights for designing novel and potent LRRK2 inhibitors.
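As a rough illustration of the two statistics quoted in this abstract, the sketch below computes a fitted r² and a leave-one-out cross-validated q² = 1 - PRESS / SS_total. It is an assumption-laden toy: a one-descriptor least-squares line stands in for the PLS model, the function names (fit_line, r2, loo_q2) are hypothetical, and SS_total is taken around the mean of all observations, which is one common convention and may not be the one the authors used.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x (a stand-in for the PLS model)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def total_sum_of_squares(ys):
    """Sum of squared deviations of the observations from their mean."""
    mean_y = sum(ys) / len(ys)
    return sum((y - mean_y) ** 2 for y in ys)


def r2(xs, ys):
    """Fitted r2: the model is trained and evaluated on the same compounds."""
    a, b = fit_line(xs, ys)
    ss_res = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    return 1.0 - ss_res / total_sum_of_squares(ys)


def loo_q2(xs, ys):
    """Leave-one-out q2: each compound is predicted by a model that never saw it."""
    press = 0.0  # predictive residual sum of squares
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        a, b = fit_line(train_x, train_y)
        press += (ys[i] - (a + b * xs[i])) ** 2
    return 1.0 - press / total_sum_of_squares(ys)


if __name__ == "__main__":
    # Made-up descriptor values and activities, for illustration only.
    descriptor = [1.0, 2.0, 3.0, 4.0, 5.0]
    activity = [1.1, 1.9, 3.2, 3.8, 5.1]
    print("r2 =", r2(descriptor, activity))
    print("q2 =", loo_q2(descriptor, activity))
```

For a least-squares fit, q² can never exceed r², because each held-out residual is at least as large as the corresponding fitted residual. This is consistent with the gap between the q² and r² values reported above.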

8.
9.
Boosting is one of the most important strategies in ensemble learning because of its ability to improve the stability and performance of weak learners. It is nonparametric, multivariate, fast and interpretable but is not robust against outliers. To enhance its prediction accuracy and make it robust against outliers, a modified version of a boosting algorithm (AdaBoost R2) was developed and called AdaBoost R3. In the sampling step, extremum samples were added to the boosting set. In the robustness step, a modified Huber loss function was applied to overcome the outlier problem. In the output step, a deterministic threshold was used to guarantee that bad predictions do not participate in the final output. The performance of the modified algorithm was investigated with two anticancer data sets of tyrosine kinase inhibitors, and the mechanism of inhibition was studied using the relative weighted variable importance procedure. Investigating the effect of the base learner's strength reveals that boosting is only successful using the classification and regression tree method (a weak to moderate learner) and does not have a significant effect using the radial basis function partial least squares method (a strong base learner). Copyright © 2015 John Wiley & Sons, Ltd.
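The abstract does not say how the Huber loss was modified, so the sketch below shows only the standard Huber loss, to illustrate why a loss of this family limits the influence of outliers. The function name huber_loss and the threshold delta = 1.0 are assumptions for illustration, not details of AdaBoost R3.

```python
def huber_loss(residual, delta=1.0):
    """Standard Huber loss: quadratic near zero, linear in the tails.

    Small residuals (|r| <= delta) are penalised like squared error;
    large residuals grow only linearly, so a single outlier contributes
    far less than it would under a purely quadratic loss.
    """
    r = abs(residual)
    if r <= delta:
        return 0.5 * r * r
    return delta * (r - 0.5 * delta)


if __name__ == "__main__":
    for r in (0.5, 1.0, 3.0, 10.0):
        squared = 0.5 * r * r
        print(f"residual={r:5.1f}  squared={squared:6.2f}  huber={huber_loss(r):6.2f}")
```

The two branches meet at |r| = delta, so the loss is continuous, and the linear tail is what keeps extreme residuals from dominating the sample weights or the fit.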

10.
11.
12.
13.
14.
Three-dimensional QSAR models were developed for predicting kinetic Michaelis constant (Km) values for phenolic substrates of human catecholamine sulfating sulfotransferase (SULT1A3). The Km values were correlated to the steric and electronic molecular fields of the substrates utilizing Comparative Molecular Field Analysis (CoMFA). The evaluated SULT1A3 substrate data set consisted of 95 different substituted phenols, catechols, catecholamines, steroids, and related structures for which the Km values were available. The data set was divided into three different subgroups in the initial analysis: (1) for the first CoMFA model, substrates with only one reacting hydroxyl group were selected (n = 51); (2) the second model was built with structurally rigid substrates (n = 59); and (3) finally, all substrates of the data set were included in the analysis (n = 95). Substrate molecules were aligned using the aromatic ring and the reacting hydroxyl group as a template. After the initial analysis, different substrate alignment rules based on the existing knowledge of the SULT1A3 active site structure were evaluated. After this optimization, a final CoMFA model was built including all 95 substrates of the data set. Cross-validated q² values (leave-one-out and leave-n-out) and coefficient contour maps were calculated for all derived CoMFA models. All four CoMFA models were statistically significant, with q² values up to 0.624. These predictive QSAR models provide information about the factors that affect substrate binding at the active site of human catecholamine sulfotransferase SULT1A3.

15.
16.
17.
18.
Prediction of molecular properties plays a critical role in rational drug design. In this study, the Molecular Topographic Map (MTM) is proposed, which is a two-dimensional (2D) map that can be used to represent a molecule. An MTM is generated from the atomic feature set of a molecule using generative topographic mapping and is then used as input data for analyzing structure-property/activity relationships. In the visualization and classification of 20 amino acids, differences between the amino acids can be confirmed visually and are revealed by hierarchical clustering with a similarity matrix of their MTMs. The prediction of molecular properties was performed on the basis of convolutional neural networks using MTMs as input data. The performance of the predictive models using MTM was found to be equal to or better than that using Morgan fingerprints or MACCS keys. Furthermore, data augmentation of MTMs using mixup improved the prediction performance. Since molecules converted to MTMs can be treated like 2D images, they can be easily used with existing neural networks for image recognition and related technologies. MTM can be effectively utilized to predict molecular properties of small molecules to aid drug discovery research.
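The mixup augmentation mentioned in this abstract can be sketched as follows. Mixup forms a convex combination of two training examples and of their targets, with the mixing weight drawn from a Beta(alpha, alpha) distribution. Everything specific here is an assumption for illustration: the function name mixup, the plain nested-list representation of a 2D map, the numeric target, and alpha = 0.2 are not taken from the paper.

```python
import random


def mixup(map_a, target_a, map_b, target_b, alpha=0.2, rng=random):
    """Blend two equally sized 2D maps and their numeric targets.

    lam ~ Beta(alpha, alpha) lies in [0, 1]; the mixed example is
    lam * a + (1 - lam) * b, applied to both the map and the target.
    """
    lam = rng.betavariate(alpha, alpha)
    mixed_map = [
        [lam * a + (1.0 - lam) * b for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(map_a, map_b)
    ]
    mixed_target = lam * target_a + (1.0 - lam) * target_b
    return mixed_map, mixed_target, lam


if __name__ == "__main__":
    rng = random.Random(0)  # seeded only so the demo is repeatable
    # Two toy 2x2 "maps" with made-up property values.
    map_1, y_1 = [[0.0, 0.0], [0.0, 0.0]], 0.0
    map_2, y_2 = [[1.0, 1.0], [1.0, 1.0]], 1.0
    mixed, y, lam = mixup(map_1, y_1, map_2, y_2, rng=rng)
    print("lam =", lam)
    print("mixed map =", mixed)
    print("mixed target =", y)
```

Small alpha values concentrate lam near 0 or 1, so most mixed examples stay close to one of the two originals. For a classification task the targets would usually be one-hot vectors mixed element-wise in the same way.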

19.
20.