MEMES: Machine learning framework for Enhanced MolEcular Screening
Authors: Sarvesh Mehta, Siddhartha Laghuvarapu, Yashaswi Pathak, Aaftaab Sethi, Mallika Alvala, U Deva Priyakumar
Institution: Center for Computational Natural Sciences and Bioinformatics, International Institute of Information Technology, Hyderabad 500 032 India, Fax: +91 40 6653 1413, +91 40 6653 1161 ; Department of Medicinal Chemistry, National Institute of Pharmaceutical Education and Research, Hyderabad 500 037 India ; School of Pharmacy and Technology Management, Narsee Monjee Institute of Management Sciences, Hyderabad India
Abstract: In drug discovery, high-throughput virtual screening is routinely performed to determine an initial set of candidate molecules, referred to as “hits”. In such an experiment, each molecule from a large small-molecule drug library is evaluated in terms of physical properties such as the docking score against a target receptor. In real-life drug discovery experiments, drug libraries are extremely large yet still represent only a small fraction of the essentially infinite chemical space, and evaluating physical properties for every molecule in the library is not computationally feasible. In the current study, a novel Machine learning framework for Enhanced MolEcular Screening (MEMES), based on Bayesian optimization, is proposed for efficient sampling of the chemical space. The proposed framework is demonstrated to identify 90% of the top-1000 molecules from a library of about 100 million molecules while calculating the docking score for only about 6% of the complete library. We believe that such a framework would greatly reduce the computational effort not only in drug discovery but also in other areas that require such high-throughput experiments.

A novel machine learning framework based on Bayesian optimization is proposed for efficient sampling of chemical space. The framework identifies 90% of the top-1000 hits while sampling only 6% of a complete dataset containing ∼100 million compounds.
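
The screening loop described in the abstract can be illustrated with a minimal sketch of batched Bayesian optimization over a fixed library. This is not the authors' implementation: the abstract does not specify the surrogate model, the molecular representation, or the acquisition function. Everything below is an assumption made for illustration, including the one-dimensional feature standing in for a molecular embedding, the synthetic `docking_score` oracle, the zero-mean Gaussian process with an RBF kernel, the lower-confidence-bound acquisition rule, and all function and parameter names.

```python
import math
import random


def docking_score(x):
    """Hypothetical stand-in for an expensive docking run (lower is better)."""
    return (-math.exp(-((x - 0.7) / 0.1) ** 2)
            - 0.5 * math.exp(-((x - 0.2) / 0.15) ** 2))


def rbf(a, b, length=0.15):
    """RBF kernel on the assumed 1-D molecular feature."""
    return math.exp(-0.5 * ((a - b) / length) ** 2)


def cholesky(A):
    """Lower-triangular L with L L^T = A, for a positive-definite A."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L


def solve_lower(L, b):
    """Solve L y = b by forward substitution."""
    y = []
    for i in range(len(b)):
        y.append((b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i])
    return y


def solve_upper_t(L, y):
    """Solve L^T x = y by backward substitution."""
    n = len(y)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x


def gp_predict(xs, ys, candidates, jitter=1e-3):
    """Posterior (mean, std) of a zero-mean GP at each candidate point."""
    K = [[rbf(a, b) + (jitter if i == j else 0.0) for j, b in enumerate(xs)]
         for i, a in enumerate(xs)]
    L = cholesky(K)
    alpha = solve_upper_t(L, solve_lower(L, ys))
    out = []
    for x in candidates:
        k = [rbf(x, a) for a in xs]
        mean = sum(ki * ai for ki, ai in zip(k, alpha))
        v = solve_lower(L, k)
        var = max(1.0 - sum(vi * vi for vi in v), 0.0)  # rbf(x, x) == 1
        out.append((mean, math.sqrt(var)))
    return out


def screen(library, n_init=10, batch=10, rounds=5, beta=2.0, seed=0):
    """Dock a random seed set, then repeatedly dock the most promising batch."""
    rng = random.Random(seed)
    scored = {}  # library index -> docking score
    for i in rng.sample(range(len(library)), n_init):
        scored[i] = docking_score(library[i])
    for _ in range(rounds):
        idx = list(scored)
        xs = [library[i] for i in idx]
        ys = [scored[i] for i in idx]
        pool = [i for i in range(len(library)) if i not in scored]
        post = gp_predict(xs, ys, [library[i] for i in pool])
        # Lower confidence bound: prefer low predicted score or high uncertainty.
        ranked = sorted(zip(pool, post), key=lambda t: t[1][0] - beta * t[1][1])
        for i, _ in ranked[:batch]:
            scored[i] = docking_score(library[i])
    return scored


def top_k_recall(library, scored, k=10):
    """Fraction of the true top-k molecules that were actually docked."""
    truth = sorted(range(len(library)),
                   key=lambda i: docking_score(library[i]))[:k]
    return sum(i in scored for i in truth) / k


if __name__ == "__main__":
    library = [i / 199 for i in range(200)]  # toy library of 200 "molecules"
    scored = screen(library)
    print(len(scored) / len(library), top_k_recall(library, scored))
```

In this toy setting the loop docks 60 of 200 library entries and reports what fraction of the true top 10 it recovered. The figures in the abstract correspond to the same kind of metric at a much larger scale: docking roughly 6% of about 100 million molecules (on the order of 6 million docking calculations) while recovering 90% of the top 1000. A realistic pipeline would replace the scalar feature with a learned or fingerprint-based molecular representation and the synthetic oracle with an actual docking program.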