首页 | 本学科首页   官方微博 | 高级检索  
     检索      


Database clustering with a combination of fingerprint and maximum common substructure methods
Authors:Stahl Martin  Mauser Harald
Institution:F. Hoffmann-La Roche AG, Pharmaceutical Research, CH-4070 Basel, Switzerland.
Abstract:We present an efficient method to cluster large chemical databases in a stepwise manner. Databases are first clustered with an extended exclusion sphere algorithm based on Tanimoto coefficients calculated from Daylight fingerprints. Substructures are then extracted from clusters by iterative application of a maximum common substructure algorithm. Clusters with common substructures are merged through a second application of an exclusion sphere algorithm. In a separate step, singletons are compared to cluster substructures and added to a cluster if similarity is sufficiently high. The method identifies tight clusters with conserved substructures and generates singletons only if structures are truly distinct from all other library members. The method has successfully been applied to identify the most frequently occurring scaffolds in databases, for the selection of analogues of screening hits and in the prioritization of chemical libraries offered by commercial vendors.
Keywords:
本文献已被 PubMed 等数据库收录!
设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号