An improved ensemble learning machine for biological activity prediction of tyrosine kinase inhibitors
Authors: Hossein Tavakoli, Jahan B Ghasemi
Abstract: Boosting is one of the most important strategies in ensemble learning because of its ability to improve the stability and performance of weak learners. It is nonparametric, multivariate, fast and interpretable but is not robust against outliers. To enhance its prediction accuracy and to immunize it against outliers, a modified version of a boosting algorithm (AdaBoost R2) was developed and called AdaBoost R3. In the sampling step, extremum samples were added to the boosting set. In the robustness step, a modified Huber loss function was applied to overcome the outlier problem. In the output step, a deterministic threshold was used to guarantee that bad predictions do not participate in the final output. The performance of the modified algorithm was investigated with two anticancer data sets of tyrosine kinase inhibitors, and the mechanism of inhibition was studied using the relative weighted variable importance procedure. Investigating the effect of the base learner's strength reveals that boosting is only successful using the classification and regression tree method (a weak to moderate learner) and does not have a significant effect using the radial basis functions partial least square method (a strong base learner). Copyright © 2015 John Wiley & Sons, Ltd.
Keywords: tyrosine kinase inhibitors; classification and regression tree; radial basis functions partial least square; ensemble learning; adaptive boosting
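
For illustration only: the abstract does not give the AdaBoost R3 formulas, so the sketch below shows one reweighting round and the weighted-median output of the standard AdaBoost.R2 scheme, with a plain Huber loss standing in for the paper's modified Huber loss. The function names, the `delta` parameter and the normalization by the largest loss are assumptions made for this example and are not taken from the paper. The extremum-sample step and the deterministic output threshold are not reproduced, because the abstract does not specify them.

```python
import math


def huber(residual, delta):
    """Standard Huber loss: quadratic near zero, linear in the tails."""
    a = abs(residual)
    if a <= delta:
        return 0.5 * a * a
    return delta * (a - 0.5 * delta)


def r2_weight_update(weights, y_true, y_pred, delta=1.0):
    """One AdaBoost.R2-style reweighting round with a normalized Huber loss.

    Returns (new_weights, beta). Returns (None, None) when the weighted
    average loss is >= 0.5, the usual R2 rule for stopping the boosting.
    """
    total = sum(weights)
    probs = [w / total for w in weights]

    # Assumption: losses are scaled into [0, 1] by the largest Huber loss.
    raw = [huber(t - p, delta) for t, p in zip(y_true, y_pred)]
    worst = max(raw)
    if worst == 0.0:
        return probs, 0.0  # perfect fit, nothing to reweight
    losses = [r / worst for r in raw]

    avg_loss = sum(p * l for p, l in zip(probs, losses))
    if avg_loss >= 0.5:
        return None, None

    # beta < 1 measures confidence; well-predicted samples are down-weighted.
    beta = avg_loss / (1.0 - avg_loss)
    updated = [p * beta ** (1.0 - l) for p, l in zip(probs, losses)]
    norm = sum(updated)
    return [u / norm for u in updated], beta


def weighted_median(predictions, betas):
    """Combine the learners' predictions for one sample (R2 output rule).

    Each beta must lie in (0, 1]; a learner's vote is log(1 / beta).
    """
    votes = [math.log(1.0 / b) for b in betas]
    half = 0.5 * sum(votes)
    running = 0.0
    for value, vote in sorted(zip(predictions, votes)):
        running += vote
        if running >= half:
            return value
    return None
```

In this sketch the Huber loss grows only linearly for large residuals, so a single extreme sample raises the average loss less than a squared loss would. The sample with the largest loss still receives the largest weight in the next round, as in ordinary AdaBoost.R2.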