
1.

Database searching for compounds with similar biological activity using short binary bit string representations of molecules.

L Xue J W Godden J Bajorath 《Journal of chemical information and computer sciences》1999,39(5):881-886


2.

Anatomy of fingerprint search calculations on structurally diverse sets of active compounds

Godden JW Stahura FL Bajorath J 《Journal of chemical information and modeling》2005,45(6):1812-1819

Similarity searching using molecular fingerprints is a widely used approach for the identification of novel hits. A fingerprint search involves many pairwise comparisons of bit string representations of known active molecules with those precomputed for database compounds. Bit string overlap, as evaluated by various similarity metrics, is used as a measure of molecular similarity. Results of a number of studies focusing on fingerprints suggest that it is difficult, if not impossible, to develop generally applicable search parameters and strategies, irrespective of the compound classes under investigation. Rather, more or less, each individual search problem requires an adjustment of calculation conditions. Thus, there is a need for diagnostic tools to analyze fingerprint-based similarity searching. We report an analysis of fingerprint search calculations on different sets of structurally diverse active compounds. Calculations on five biological activity classes were carried out with two fingerprints in two compound source databases, and the results were analyzed in histograms. Tanimoto coefficient (Tc) value ranges where active compounds were detected were compared to the distribution of Tc values in the database. The analysis revealed that compound class-specific effects strongly influenced the outcome of these fingerprint calculations. Among the five diverse compound sets studied, very different search results were obtained. The analysis described here can be applied to determine Tc intervals where scaffold hopping occurs. It can also be used to benchmark fingerprint calculations or estimate their probability of success.
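The bit string comparison and the Tc histograms mentioned in this abstract can be illustrated with a minimal sketch. It assumes fingerprints are stored as Python integers used as bit sets and uses the standard Tanimoto definition (bits in common divided by bits set in either fingerprint). The function names and the bin count are hypothetical and are not taken from the paper.

```python
def tanimoto(fp_a: int, fp_b: int) -> float:
    """Tanimoto coefficient (Tc) of two fingerprints stored as ints."""
    common = bin(fp_a & fp_b).count("1")  # bits set in both fingerprints
    union = bin(fp_a | fp_b).count("1")   # bits set in at least one
    # Convention chosen for this sketch: two empty fingerprints give Tc = 0.
    return common / union if union else 0.0


def tc_histogram(tc_values, n_bins=10):
    """Count Tc values in equal-width bins over [0, 1]."""
    counts = [0] * n_bins
    for tc in tc_values:
        # Tc = 1.0 is placed in the last bin instead of overflowing.
        index = min(int(tc * n_bins), n_bins - 1)
        counts[index] += 1
    return counts
```

Building one histogram from the Tc values of known actives and another from the Tc values of all database compounds, both measured against the same reference molecule, allows the two distributions to be compared bin by bin.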

3.

Introduction of the conditional correlated Bernoulli model of similarity value distributions and its application to the prospective prediction of fingerprint search performance

Vogt M Bajorath J 《Journal of chemical information and modeling》2011,51(10):2496-2506

A statistical approach named the conditional correlated Bernoulli model is introduced for modeling of similarity scores and predicting the potential of fingerprint search calculations to identify active compounds. Fingerprint features are rationalized as dependent Bernoulli variables and conditional distributions of Tanimoto similarity values of database compounds given a reference molecule are assessed. The conditional correlated Bernoulli model is utilized in the context of virtual screening to estimate the position of a compound obtaining a certain similarity value in a database ranking. Through the generation of receiver operating characteristic curves from cumulative distribution functions of conditional similarity values for known active and random database compounds, one can predict how successful a fingerprint search might be. The comparison of curves for different fingerprints makes it possible to identify fingerprints that are most likely to identify new active molecules in a database search given a set of known reference molecules.
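The step from cumulative distribution functions to a receiver operating characteristic curve can be sketched as follows. This is only an illustration of that one step: it uses empirical distributions of observed similarity values, not the conditional correlated Bernoulli model itself, and all names are hypothetical.

```python
from bisect import bisect_left


def fraction_at_or_above(sorted_values, threshold):
    """Fraction of values >= threshold (one minus the empirical CDF below it)."""
    n = len(sorted_values)
    return (n - bisect_left(sorted_values, threshold)) / n


def roc_points(active_scores, database_scores, thresholds):
    """Return (false positive rate, true positive rate) for each threshold."""
    actives = sorted(active_scores)
    database = sorted(database_scores)
    return [
        (fraction_at_or_above(database, t), fraction_at_or_above(actives, t))
        for t in thresholds
    ]
```

Each threshold yields one point on the curve. A curve that rises quickly at low false positive rates indicates that active compounds tend to obtain higher similarity values than random database compounds.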

4.

How do 2D fingerprints detect structurally diverse active compounds? Revealing compound subset-specific fingerprint features through systematic selection

Heikamp K Bajorath J 《Journal of chemical information and modeling》2011,51(9):2254-2265


5.

Support-vector-machine-based ranking significantly improves the effectiveness of similarity searching using 2D fingerprints and multiple reference compounds

Geppert H Horváth T Gärtner T Wrobel S Bajorath J 《Journal of chemical information and modeling》2008,48(4):742-746


6.

Xue L Godden JW Stahura FL Bajorath J 《Journal of chemical information and computer sciences》2004,44(4):1275-1281

An analysis method termed similarity search profiling has been developed to evaluate fingerprint-based virtual screening calculations. The analysis is based on systematic similarity search calculations using multiple template compounds over the entire value range of a similarity coefficient. In graphical representations, numbers of correctly identified hits and other detected database compounds are separately monitored. The resulting profiles make it possible to determine whether a virtual screening trial can in principle succeed for a given compound class, search tool, similarity metric, and selection criterion. As a test case, we have analyzed virtual screening calculations using a recently designed fingerprint on 23 different biological activity classes in a compound source database containing approximately 1.3 million molecules. Based on our predefined selection criteria, we found that virtual screening analysis was successful for 19 of 23 compound classes. Profile analysis also makes it possible to determine compound class-specific similarity threshold values for similarity searching.
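A similarity search profile of the kind described in this abstract can be sketched as a table of counts per similarity threshold. The sketch assumes each database compound already has one similarity score (for example, its highest Tc against any template, which is an assumption here and not a detail stated in the abstract) and a flag marking known actives. The function name is hypothetical.

```python
def search_profile(scores, is_active, thresholds):
    """For each threshold, count detected actives and other detected compounds.

    scores     -- one similarity value per database compound
    is_active  -- parallel list of booleans, True for known active compounds
    thresholds -- similarity cutoffs to scan, e.g. 0.0, 0.1, ..., 1.0
    """
    profile = []
    for t in thresholds:
        hits = 0    # correctly identified active compounds
        others = 0  # other database compounds detected at this threshold
        for score, active in zip(scores, is_active):
            if score >= t:
                if active:
                    hits += 1
                else:
                    others += 1
        profile.append((t, hits, others))
    return profile
```

Plotting the two counts against the threshold shows whether a threshold range exists in which actives are retained while few other database compounds are selected.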

7.

Profile scaling increases the similarity search performance of molecular fingerprints containing numerical descriptors and structural keys

Xue L Godden JW Stahura FL Bajorath J 《Journal of chemical information and computer sciences》2003,43(4):1218-1225


8.

Xue L Stahura FL Bajorath J 《Journal of chemical information and computer sciences》2004,44(6):2032-2039

Fingerprint scaling is a method to increase the performance of similarity search calculations. It is based on the detection of bit patterns in keyed fingerprints that are signatures of specific compound classes. Application of scaling factors to consensus bits that are mostly set on emphasizes signature bit patterns during similarity searching and has been shown to improve search results for different fingerprints. Similarity search profiling has recently been introduced as a method to analyze similarity search calculations. Profiles separately monitor correctly identified hits and other detected database compounds as a function of similarity threshold values and make it possible to estimate whether virtual screening calculations can be successful or to evaluate why they fail. This similarity search profile technique has been applied here to study fingerprint scaling in detail and better understand effects that are responsible for its performance. In particular, we have focused on the qualitative and quantitative analysis of similarity search profiles under scaling conditions. Therefore, we have carried out systematic similarity search calculations for 23 biological activity classes under scaling conditions over a wide range of scaling factors in a compound database containing approximately 1.3 million molecules and monitored these calculations in similarity search profiles. Analysis of these profiles confirmed increases in hit rates as a consequence of scaling and revealed that scaling influences similarity search calculations in different ways. Based on scaled similarity search profiles, compound sets could be divided into different categories. In a number of cases, increases in search performance under scaling conditions were due to a more significant relative increase in correctly identified hits than detected false-positives. This was also consistent with the finding that preferred similarity threshold values increased due to fingerprint scaling, which was well illustrated by similarity search profiling.
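The idea of emphasizing consensus bits can be sketched with a weighted Tanimoto coefficient. The abstract does not give the scaling formula, so the weighting below (a consensus bit counts `factor` times in both numerator and denominator) and the 90% consensus cutoff are assumptions made for illustration. All names are hypothetical.

```python
def consensus_bits(fingerprints, n_bits, min_fraction=0.9):
    """Bit positions set on in at least min_fraction of the given fingerprints."""
    needed = min_fraction * len(fingerprints)
    return {
        i for i in range(n_bits)
        if sum((fp >> i) & 1 for fp in fingerprints) >= needed
    }


def scaled_tanimoto(fp_a, fp_b, n_bits, scaled_positions, factor):
    """Tanimoto coefficient in which selected bit positions carry extra weight."""
    shared = 0.0
    either = 0.0
    for i in range(n_bits):
        a = (fp_a >> i) & 1
        b = (fp_b >> i) & 1
        weight = factor if i in scaled_positions else 1.0
        shared += weight * (a & b)  # weighted count of bits set in both
        either += weight * (a | b)  # weighted count of bits set in either
    return shared / either if either else 0.0
```

With `factor` set to 1.0 the function reduces to the ordinary Tanimoto coefficient. Larger factors raise the similarity of compounds that share the consensus pattern relative to compounds that match only on other bits.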

9.

Bayesian interpretation of a distance function for navigating high-dimensional descriptor spaces

Vogt M Godden JW Bajorath J 《Journal of chemical information and modeling》2007,47(1):39-46


10.

Design and evaluation of a novel class-directed 2D fingerprint to search for structurally diverse active compounds

Eckert H Bajorath J 《Journal of chemical information and modeling》2006,46(6):2515-2526


11.

Chemical database mining through entropy-based molecular similarity assessment of randomly generated structural fragment populations

Batista J Bajorath J 《Journal of chemical information and modeling》2007,47(1):59-68


12.

Mini-fingerprints for virtual screening: design principles and generation of novel prototypes based on information theory

Xue L Godden JW Bajorath J 《SAR and QSAR in environmental research》2003,14(1):27-40


13.

Mini-fingerprints for virtual screening: Design principles and generation of novel prototypes based on information theory

L. Xue J.W. Godden J. Bajorath 《SAR and QSAR in environmental research》2013,24(1):27-40


14.

Heikamp K Bajorath J 《Journal of chemical information and modeling》2011,51(8):1831-1839

A large-scale similarity search investigation has been carried out on 266 well-defined compound activity classes extracted from the ChEMBL database. The analysis was performed using two widely applied two-dimensional (2D) fingerprints that mark opposite ends of the current performance spectrum of these types of fingerprints, i.e., MACCS structural keys and the extended connectivity fingerprint with bond diameter four (ECFP4). For each fingerprint, three nearest neighbor search strategies were applied. On the basis of these search calculations, a similarity search profile of the ChEMBL database was generated. Overall, the fingerprint search campaign was surprisingly successful. In 203 of 266 test cases (~76%), a compound recovery rate of at least 50% was observed with at least the better performing fingerprint and one search strategy. The similarity search profile also revealed several general trends. For example, fingerprint searching was often characterized by an early enrichment of active compounds in database selection sets. In addition, compound activity classes have been categorized according to different similarity search performance levels, which helps to put the results of benchmark calculations into perspective. Therefore, a compendium of activity classes falling into different search performance categories is provided. On the basis of our large-scale investigation, the performance range of state-of-the-art 2D fingerprinting has been delineated for compound data sets directed against a wide spectrum of pharmaceutical targets.

15.

Speeding up chemical database searches using a proximity filter based on the logical exclusive or

Baldi P Hirschberg DS Nasr RJ 《Journal of chemical information and modeling》2008,48(7):1367-1378

In many large chemoinformatics database systems, molecules are represented by long binary fingerprint vectors whose components record the presence or absence in the molecular graphs of particular functional groups or combinatorial features, such as labeled paths or labeled trees. To speed up database searches, we propose to store with each fingerprint a small header vector containing primarily the result of applying the logical exclusive OR (XOR) operator to the fingerprint vector after modulo wrapping to a smaller number of bits, such as 128 bits. From the XOR headers of two molecules, tight bounds on the intersection and union of their fingerprint vectors can be rapidly obtained, yielding tight bounds on derived similarity measures, such as the Tanimoto measure. During a database search, every time these bounds are unfavorable, the corresponding molecule can be rapidly discarded with no need for further inspection. We derive probabilistic models that allow us to estimate precisely the behavior of the XOR headers and the level of pruning under different conditions in terms of similarity threshold and fingerprint density. These theoretical results are corroborated by experimental results on a large set of molecules. For a Tanimoto threshold of 0.5 (respectively 0.9), this approach requires searching less than 50% (respectively 10%) of the database, leading to typical search speedups of 2 to 3 times over the previous state-of-the-art.
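One way such a header can yield a bound is sketched below. It rests on two facts: XOR folding commutes with XOR, and folding never increases the number of set bits. The Hamming distance h between two folded headers is therefore a lower bound on the Hamming distance between the full fingerprints. With A and B denoting the bit counts of the two fingerprints, this gives Tanimoto <= (A + B - h) / (A + B + h). The derivation and all function names are a reconstruction for illustration, assuming the bit counts are stored alongside the header. They are not the paper's exact formulation.

```python
def popcount(x: int) -> int:
    """Number of set bits in a non-negative integer."""
    return bin(x).count("1")


def xor_fold(fp: int, header_bits: int) -> int:
    """Wrap a fingerprint modulo header_bits, combining the chunks with XOR."""
    mask = (1 << header_bits) - 1
    header = 0
    while fp:
        header ^= fp & mask
        fp >>= header_bits
    return header


def tanimoto_upper_bound(count_a, count_b, header_a, header_b):
    """Upper bound on Tanimoto from bit counts and XOR headers only."""
    h = popcount(header_a ^ header_b)  # lower bound on the true Hamming distance
    total = count_a + count_b
    # Both fingerprints empty: no information, so return the trivial bound.
    return (total - h) / (total + h) if total + h else 1.0


def tanimoto(fp_a: int, fp_b: int) -> float:
    """Exact Tanimoto coefficient, computed only for unpruned candidates."""
    union = popcount(fp_a | fp_b)
    return popcount(fp_a & fp_b) / union if union else 0.0
```

During a threshold search, a database fingerprint whose bound falls below the threshold can be skipped without computing the exact coefficient on the full-length vectors.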

16.

Evaluation of descriptors and mini-fingerprints for the identification of molecules with similar activity

Xue L Godden JW Bajorath J 《Journal of chemical information and computer sciences》2000,40(5):1227-1234


17.

Design and evaluation of a molecular fingerprint involving the transformation of property descriptor values into a binary classification scheme

Xue L Godden JW Stahura FL Bajorath J 《Journal of chemical information and computer sciences》2003,43(4):1151-1157


18.

Comparison of correlation vector methods for ligand-based similarity searching (total citations: 1; self-citations: 0; citations by others: 1)

Fechner U Franke L Renner S Schneider P Schneider G 《Journal of computer-aided molecular design》2003,17(10):687-698


19.

Bender A Glen RC 《Organic & biomolecular chemistry》2004,2(22):3204-3218


20.

Fast drug-receptor mapping by site-directed distances: a novel method of predicting new pharmacological leads (total citations: 1; self-citations: 0; citations by others: 1)

A S Smellie G M Crippen W G Richards 《Journal of chemical information and computer sciences》1991,31(3):386-392

The searching and characterization of large chemical databases has recently provoked much interest, particularly with respect to the question of whether any of the compounds in the database could serve as new leads to a compound of pharmacological interest. This paper introduces a fast and novel method of determining whether any of a given series of compounds are able, on geometrical grounds, to interact with an active site of interest. The C program written to implement the method is able to make a qualitative prediction for a given compound in about 1 s per structure (for drug-sized molecules), while still permitting the compound complete conformational freedom. However, the algorithm is sufficiently flexible to permit distance constraints to be placed on the molecules while docking. The test system studied was a family of Baker's triazines docking into the active site of dihydrofolate reductase (DHFR), as defined by a methotrexate/NADPH complex.