Analysis of data fusion methods in virtual screening: theoretical model |
| |
Authors: | Martin Whittle, Valerie J. Gillet, Peter Willett, Jens Loesel |
| |
Affiliation: | Department of Information Studies, University of Sheffield, 211 Portobello Street, Sheffield S1 4DP, UK. m.whittle@sheffield.ac.uk |
| |
Abstract: | This paper presents a theoretical model of how data fusion can be used to combine the results of multiple similarity searches of chemical databases. The model is based on frequency distributions of similarity values that are fused using a multiple integration over regions defined by the particular fusion rule that is being applied. For pairwise fusion, the resulting double integrals are straightforward to evaluate for simple model distributions. Similarity values for recovered-active and recovered-nonactive frequency distributions are independently modeled using a constant background, linearly biased terms, and a first-order correlated term. The model shows that two standard fusion rules can give performance enhancements in some cases but that the results of fusion are dependent on many factors that, taken together, can lead to seemingly inconsistent levels of enhancement. |
| |
Keywords: | |
This article is indexed in PubMed and other databases. |
|
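The pairwise fusion described in the abstract, a double integral of a similarity frequency distribution over the region selected by a fusion rule, can be illustrated with a minimal numerical sketch. Everything named here is an assumption made for illustration: the abstract does not say which two fusion rules were studied, so the MAX and mean (SUM-style) rules below are stand-ins, and the `constant` and `biased` densities are toy versions of the "constant background" and "linearly biased" terms, not the authors' fitted parameters. The correlated term is omitted.

```python
def fused_fraction(density, rule, threshold, n=400):
    """Midpoint-rule estimate of the double integral of density(x, y)
    over the part of the unit square where rule(x, y) > threshold.

    x and y are the similarity values from two searches, scaled to [0, 1].
    """
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        for j in range(n):
            y = (j + 0.5) * h
            if rule(x, y) > threshold:
                total += density(x, y) * h * h
    return total


# Hypothetical model distributions (each integrates to 1 on the unit square).
def constant(x, y):
    # Flat background: no preference for high or low similarity.
    return 1.0


def biased(x, y):
    # Linear bias toward high similarity in both searches.
    return x + y


# Hypothetical fusion rules (assumed; not named in the abstract).
def max_rule(x, y):
    return max(x, y)


def mean_rule(x, y):
    return 0.5 * (x + y)


if __name__ == "__main__":
    t = 0.8  # illustrative similarity cutoff
    for name, rule in (("MAX", max_rule), ("mean", mean_rule)):
        actives = fused_fraction(biased, rule, t)
        nonactives = fused_fraction(constant, rule, t)
        print(f"{name}: actives={actives:.3f} nonactives={nonactives:.3f} "
              f"ratio={actives / nonactives:.2f}")
```

Under these assumptions, the ratio of the two integrals gives a crude measure of how strongly a rule favors the biased (active-like) distribution over the flat background at a given cutoff. Changing the cutoff or the densities changes that ratio, which is in keeping with the abstract's point that the benefit of fusion depends on many interacting factors.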