Substructure mining using elaborate chemical representation
Authors: Kazius Jeroen, Nijssen Siegfried, Kok Joost, Bäck Thomas, Ijzerman Adriaan P
Institution: Division of Medicinal Chemistry, Leiden-Amsterdam Center for Drug Research, Leiden University, P.O. Box 9502, Einsteinweg 55, 2300 RA Leiden, The Netherlands. j.kazius@lacdr.leidenuniv.nl
Abstract: Substructure mining algorithms are important drug discovery tools since they can find substructures that affect physicochemical and biological properties. Current methods, however, only consider a part of all chemical information that is present within a data set of compounds. Therefore, the overall aim of our study was to enable more exhaustive data mining by designing methods that detect all substructures of any size, shape, and level of chemical detail. A means of chemical representation was developed that uses atomic hierarchies, thus enabling substructure mining to consider general and/or highly specific features. As a proof-of-concept, the efficient, multipurpose graph mining system Gaston learned substructures of any size and shape from a mutagenicity data set that was represented in this manner. From these substructures, we extracted a set of only six nonredundant, discriminative substructures that represent relevant biochemical knowledge. Our results demonstrate the individual and synergistic importance of elaborate chemical representation and mining for nonlinear substructures. We conclude that the combination of elaborate chemical representation and Gaston provides an excellent method for 2D substructure mining as this recipe systematically explores all substructures in different levels of chemical detail.
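The idea of an atomic hierarchy can be illustrated with a minimal sketch. It assumes a toy hierarchy in which each atom label has one more general parent, so that a pattern atom may be written either specifically (e.g. Cl) or generally (e.g. any halogen). The labels, the `PARENT` table, and both function names are hypothetical and chosen only for illustration; they are not the representation or the Gaston interface used in the study.

```python
# Hypothetical atom-label hierarchy: each label maps to its more general parent.
# These labels are illustrative assumptions, not the hierarchy from the paper.
PARENT = {
    "Cl": "halogen",
    "Br": "halogen",
    "halogen": "heteroatom",
    "N": "heteroatom",
    "O": "heteroatom",
    "heteroatom": "any",
    "C": "any",
}


def generalizations(label):
    """Return the label followed by all of its ancestors, most specific first."""
    chain = [label]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def label_matches(pattern_label, atom_label):
    """A pattern label matches an atom if it is the atom's own label or an ancestor of it."""
    return pattern_label in generalizations(atom_label)


if __name__ == "__main__":
    print(generalizations("Cl"))          # chain from specific to general
    print(label_matches("halogen", "Br"))  # general pattern, specific atom
    print(label_matches("halogen", "N"))   # nitrogen is not a halogen here
```

Under this kind of scheme, a miner that compares labels with `label_matches` rather than strict equality could report a substructure at whichever level of detail discriminates best, which is the general-versus-specific trade-off the abstract describes.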
This article is indexed in PubMed and other databases.