A Consensus Compound/Bioactivity Dataset for Data-Driven Drug Design and Chemogenomics
Authors: Laura Isigkeit, Apirat Chaikuad, Daniel Merk
Affiliations: 1. Institute of Pharmaceutical Chemistry, Goethe University Frankfurt, 60438 Frankfurt, Germany (L.I.; A.C.); 2. Structural Genomics Consortium, BMLS, Goethe University Frankfurt, 60438 Frankfurt, Germany; 3. Department of Pharmacy, Ludwig Maximilian University of Munich, 81377 Munich, Germany
Abstract: Publicly available compound and bioactivity databases provide an essential basis for data-driven applications in life-science research and drug design. By analyzing several bioactivity repositories, we discovered differences in compound and target coverage that argue for the combined use of data from multiple sources. Using data from ChEMBL, PubChem, IUPHAR/BPS, BindingDB, and Probes & Drugs, we assembled a consensus dataset focusing on small molecules with bioactivity on human macromolecular targets. This allowed improved coverage of compound space and targets, as well as automated comparison and curation of structural and bioactivity data to reveal potentially erroneous entries and increase confidence. The consensus dataset comprises more than 1.1 million compounds with over 10.9 million bioactivity data points, annotated with assay type and bioactivity confidence, providing a useful ensemble for computational applications in drug design and chemogenomics.
Keywords: big data; data curation; medicinal chemistry; machine learning; de novo design