

Predictive models for tyrosinase inhibitors: Challenges from heterogeneous activity data determined by different experimental protocols
Institution: 1. Key Laboratory of Synthetic Rubber, Changchun Institute of Applied Chemistry (CIAC), Chinese Academy of Sciences, Changchun 130022, PR China; 2. School of Life Science, Jilin University, Changchun 130012, PR China; 3. University of Chinese Academy of Sciences, Beijing 100049, PR China; 1. Department of Bioscience and Technology, School of Agriculture and Bio science, Karunya Institute of Technology and Science, Coimbatore 641114; 2. Department of Bio Technology, Mepco Schlenk Engineering College, Sivakasi, Tamilnadu; 1. Laboratory of Systems Genetics, National Heart Lung and Blood Institute, National Institutes of Health, Bethesda, MD 20892, United States; 2. Department of Statistics, Purdue University, West Lafayette, IN 47907, United States; 1. MRDL, PG and Research Department of Physics, Pachaiyappa’s College, Chennai, 600 030, Tamil Nadu, India; 2. PG and Research Department of Physics, Arignar Anna Government Arts College, Cheyyar, Thiruvannamalai District, 604 407, Tamil Nadu, India; 3. Research and Development Centre, Bharathiar University, Coimbatore, 641 046, Tamil Nadu, India; 4. Department of Physics, The New College, Chennai, 600 014, Tamil Nadu, India; 5. Department of Chemistry, Annamalai University, Annamalai Nagar, Chidambaram, 608 002, Tamil Nadu, India; 1. Institute of Petroleum Engineering, Heriot-Watt University, Edinburgh, United Kingdom; 2. Center for Subsurface Modeling (CSM), Institute for Computational Engineering and Sciences (ICES), University of Texas at Austin, TX, USA
Abstract: Quantitative Structure-Activity Relationship (QSAR) models of tyrosinase inhibitors were built using the Random Forest (RF) algorithm and evaluated by out-of-bag estimation (R²_OOB) and 10-fold cross-validation (Q²_CV). We found that the performance of the QSAR models was closely correlated with the systematic errors in the inhibitory activities of tyrosinase inhibitors, which arise from the different measuring protocols. By defining ERR_sys, outliers with larger errors can be efficiently identified and removed from heterogeneous activity data. A reasonable QSAR model (R²_OOB of 0.74 and Q²_CV of 0.80) was obtained by excluding 13 outliers with larger systematic errors. This is a clear example of the challenge that heterogeneous data from different experimental protocols pose for QSAR modeling.
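The abstract reports a 10-fold cross-validated Q² but does not state the formula used. As a rough sketch only, the snippet below assumes the conventional definition Q² = 1 - PRESS/TSS, with the total sum of squares taken about the mean of all observed activities, and a simple interleaved fold assignment. The names `q2_score`, `kfold_indices`, `cross_validated_q2`, and the `fit_predict` callback are hypothetical and do not come from the paper. Any regressor, such as a Random Forest, could be plugged in through `fit_predict`.

```python
from statistics import fmean


def q2_score(observed, predicted):
    """Return 1 - PRESS/TSS for observed values vs. held-out predictions.

    Assumption: TSS is computed about the mean of all observed values.
    """
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_obs = fmean(observed)
    press = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    tss = sum((o - mean_obs) ** 2 for o in observed)
    if tss == 0:
        raise ValueError("observed values have zero variance")
    return 1.0 - press / tss


def kfold_indices(n_samples, n_folds=10):
    """Yield (train, test) index lists; sample i is held out in fold i % n_folds."""
    for fold in range(n_folds):
        test = [i for i in range(n_samples) if i % n_folds == fold]
        train = [i for i in range(n_samples) if i % n_folds != fold]
        yield train, test


def cross_validated_q2(features, activities, fit_predict, n_folds=10):
    """Collect held-out predictions fold by fold, then score them with q2_score.

    fit_predict(train_x, train_y, test_x) is a hypothetical callback that
    trains a model on the training fold and returns predictions for test_x.
    """
    held_out = [None] * len(activities)
    for train, test in kfold_indices(len(activities), n_folds):
        predictions = fit_predict(
            [features[i] for i in train],
            [activities[i] for i in train],
            [features[i] for i in test],
        )
        for index, value in zip(test, predictions):
            held_out[index] = value
    return q2_score(activities, held_out)
```

With this definition, a model that always predicts the overall mean activity scores 0 and a perfect model scores 1. Shuffling the samples before assigning folds would be a reasonable refinement when the data are ordered.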
Keywords:
This article is indexed in ScienceDirect and other databases.