Identification of small molecule aggregators from large compound libraries by support vector machines
Authors: Hanbing Rao, Zerong Li, Xiangyuan Li, Xiaohua Ma, Choongyong Ung, Hu Li, Xianghui Liu, Yuzong Chen
Institution: 1. College of Chemistry, Sichuan University, Chengdu 610064, People's Republic of China; 2. College of Chemical Engineering, Sichuan University, Chengdu 610065, People's Republic of China; 3. Bioinformatics and Drug Design Group, Department of Pharmacy, Center for Computational Science and Engineering, National University of Singapore, 117543 Singapore
Abstract: Small molecule aggregators non-specifically inhibit multiple unrelated proteins, rendering them therapeutically useless. They frequently appear as false hits and thus need to be eliminated in high-throughput screening campaigns. Computational methods have been explored for identifying aggregators, but they have not been tested in screening large compound libraries. We used 1319 aggregators and 128,325 non-aggregators to develop a support vector machine (SVM) aggregator identification model, which was tested by four methods. The first is five-fold cross-validation, which showed comparable aggregator and significantly improved non-aggregator identification rates relative to earlier studies. The second is an independent test on 17 aggregators discovered independently of the training aggregators, 71% of which were correctly identified. The third is retrospective screening of 13M PUBCHEM and 168K MDDR compounds, of which 97.9% and 98.7%, respectively, were predicted as non-aggregators. The fourth is retrospective screening of 5527 MDDR compounds similar to the known aggregators, 1.14% of which were predicted as aggregators. SVM showed slightly better overall performance than two other machine learning methods in five-fold cross-validation studies under the same settings. Molecular features of aggregation, extracted by a feature selection method, are consistent with published profiles. SVM showed substantial capability in identifying aggregators from large libraries at low false-hit rates. © 2009 Wiley Periodicals, Inc. J Comput Chem, 2010
Keywords: active compound; aggregator; aggregation; drug discovery; high throughput screening; machine learning method; recursive feature elimination; support vector machine; virtual screening
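
The abstract reports aggregator and non-aggregator identification rates separately under five-fold cross-validation, which matters when one class outnumbers the other by roughly 100 to 1. The following is a minimal sketch of that evaluation scaffolding only, not the authors' implementation: it does not train an SVM, the function names `stratified_folds` and `identification_rates` are hypothetical, and the labels are toy data rather than the paper's compounds.

```python
import random


def stratified_folds(labels, k=5, seed=0):
    """Split sample indices into k folds, keeping class proportions similar.

    Hypothetical helper: each class is shuffled separately and dealt
    round-robin into the folds.
    """
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)
    return folds


def identification_rates(y_true, y_pred):
    """Return (aggregator rate, non-aggregator rate).

    Convention assumed here: 1 = aggregator, 0 = non-aggregator.
    Each rate is the fraction of that class predicted correctly.
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    pos = sum(1 for t in y_true if t == 1)
    neg = len(y_true) - pos
    return tp / pos, tn / neg


# Toy labels only: 20 "aggregators" and 200 "non-aggregators".
labels = [1] * 20 + [0] * 200
folds = stratified_folds(labels, k=5)
```

With these toy labels each fold holds 4 aggregators and 40 non-aggregators. In a full workflow, a classifier would be trained on four folds and `identification_rates` applied to its predictions on the held-out fold, repeated once per fold.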