
1.

From activity cliffs to activity ridges: informative data structures for SAR analysis

Vogt M Huang Y Bajorath J 《Journal of chemical information and modeling》2011,51(8):1848-1856

The extraction of SAR information from structurally diverse compound data sets is a challenging task. One of the focal points of systematic SAR analysis is the search for activity cliffs, that is, structurally similar compounds having large potency differences, from which SAR determinants can be deduced. The assessment of SAR information is usually based on pairwise similarity and potency comparisons of data set compounds. As a consequence, activity cliffs are mostly evaluated at a compound pair level. Here, we present an extension of the activity cliff concept by introducing "activity ridges" that are formed by overlapping "combinatorial" activity cliffs between participating compounds, giving rise to ridge-like structures in activity landscapes. Activity ridges are rich in SAR information. In a systematic analysis of 242 compound data sets, we have identified well-defined activity ridges in 71 different sets. In addition, an information-theoretic approach has been devised to characterize the structural composition of activity ridges. Taken together, our results show that activity ridges frequently occur in sets of active compounds and that different categories of ridges can be distinguished on the basis of their structural content. The computational identification of activity ridges provides access to compound subsets having high priority for SAR analysis.

2.

MMP-Cliffs: systematic identification of activity cliffs on the basis of matched molecular pairs

Hu X Hu Y Vogt M Stumpfe D Bajorath J 《Journal of chemical information and modeling》2012,52(5):1138-1145


3.

Exploration of 3D activity cliffs on the basis of compound binding modes and comparison of 2D and 3D cliffs

Hu Y Bajorath J 《Journal of chemical information and modeling》2012,52(3):670-677

Activity cliffs are formed by pairs or groups of structurally similar compounds having large differences in potency and are focal points of structure-activity relationship (SAR) analysis. The choice of molecular representations is a critically important aspect of activity cliffs analysis. Thus far, activity cliffs have predominantly been defined on the basis of molecular graph or fingerprint representations. Herein we introduce 3D activity cliffs derived from comparisons of experimentally determined compound binding modes. The analysis of 3D activity cliffs is generally applicable to target proteins for which structures of multiple ligand complexes are available. For two popular targets, β-secretase 1 (BACE1) and factor Xa (FXa), public domain X-ray structures with bound inhibitors were collected. Crystallographic binding modes of inhibitors were systematically compared using a 3D similarity method taking conformational, positional, and atomic property differences into account. In addition, standard 2D similarity relationships were also determined. SAR information associated with individual compounds substantially changed when either bioactive conformations or 2D molecular graphs were used for similarity evaluation. 3D activity cliffs were identified for BACE1 and FXa inhibitor sets and systematically compared to 2D cliffs. It was found that less than 40% of 3D activity cliffs were conserved when 2D similarity was applied. The limited conservation of 3D and 2D cliffs provides further evidence for the strong molecule representation dependence of activity cliffs. Moreover, 3D cliffs represent a new class of activity cliffs that convey SAR information in ways that differ from graph-based similarity measures. In cases where sufficient structural information is available, the comparison of 3D and 2D cliffs is expected to aid in SAR analysis and mapping of critical binding determinants.

4.

Multitarget structure-activity relationships characterized by activity-difference maps and consensus similarity measure

Medina-Franco JL Yongye AB Pérez-Villanueva J Houghten RA Martínez-Mayorga K 《Journal of chemical information and modeling》2011,51(9):2427-2439

Dual and triple activity-difference (DAD/TAD) maps are tools for the systematic characterization of structure-activity relationships (SAR) of compound data sets screened against two or three targets. DAD and TAD maps are two- and three-dimensional representations of the pairwise activity differences of compound data sets, respectively. Adding pairwise structural similarity information into these maps readily reveals activity cliff regions in the SAR for one, two, or three targets. In addition, pairs of compounds in the smooth regions of the SAR and scaffold hops are also easily identified in these maps. Herein, DAD and TAD maps are employed for the systematic characterization of the SAR of a benchmark set of 299 compounds screened against dopamine, norepinephrine, and serotonin transporters. To reduce the well-known dependence of the activity landscape on the structural representation, five selected 2D and 3D structure representations were used to characterize the SAR. Systematic analysis of the DAD and TAD maps reveals regions in the landscape with similar SAR for two or the three targets as well as regions with inverse SAR, i.e., changes in structure that increase activity for one target, but decrease activity for the other target. Focusing the analysis on pairs of compounds with high structure similarity revealed the presence of single-, dual-, and triple-target activity cliffs, i.e., small changes in structure with high changes in potency for one, two, or the three targets, respectively. Triple-target scaffold hops are also discussed. Activity cliffs and scaffold hops were also quantified and represented using two recently proposed approaches, namely, mean Structure Activity Landscape Index (mean SALI) and Consensus Structure-Activity Similarity (SAS) maps.
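The pairwise activity differences behind a DAD map can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function names `dad_points` and `classify`, the use of negative-log potency values, and the 1 log unit threshold are all assumptions made for the example.

```python
from itertools import combinations


def dad_points(activities):
    """Return one (a, b, d1, d2) point per compound pair.

    `activities` maps a compound name to (potency_T1, potency_T2), assumed
    here to be negative-log values such as pKi. d1 and d2 are the signed
    activity differences act(a) - act(b) for targets 1 and 2.
    """
    points = []
    for a, b in combinations(sorted(activities), 2):
        d1 = activities[a][0] - activities[b][0]
        d2 = activities[a][1] - activities[b][1]
        points.append((a, b, d1, d2))
    return points


def classify(d1, d2, threshold=1.0):
    """Assign a pair to a region of the map (threshold is an assumed cutoff)."""
    big1 = abs(d1) >= threshold
    big2 = abs(d2) >= threshold
    if big1 and big2:
        # Same sign: the structural change moves both activities the same way.
        return "similar SAR" if d1 * d2 > 0 else "inverse SAR"
    if big1 or big2:
        return "single-target change"
    return "little change"
```

Combining each point with the structural similarity of the pair would then separate cliff-like pairs (high similarity, large difference) from scaffold hops (low similarity, small difference), as the abstract describes.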

5.

Prediction of activity cliffs on the basis of images using convolutional neural networks

Iqbal Javed Vogt Martin Bajorath Jürgen 《Journal of computer-aided molecular design》2021,35(12):1157-1164

An activity cliff (AC) is formed by a pair of structurally similar compounds with a large difference in potency. Accordingly, ACs reveal structure–activity relationship (SAR) discontinuity and provide SAR information for compound optimization. Herein, we have investigated the question if ACs could be predicted from image data. Therefore, pairs of structural analogs were extracted from different compound activity classes that formed or did not form ACs. From these compound pairs, consistently formatted images were generated. Image sets were used to train and test convolutional neural network (CNN) models to systematically distinguish between ACs and non-ACs. The CNN models were found to predict ACs with overall high accuracy, as assessed using alternative performance measures, hence establishing proof-of-principle. Moreover, gradient weights from convolutional layers were mapped to test compounds and identified characteristic structural features that contributed to successful predictions. Weight-based feature visualization revealed the ability of CNN models to learn chemistry from images at a high level of resolution and aided in the interpretation of model decisions with intrinsic black box character.


6.

Design of multitarget activity landscapes that capture hierarchical activity cliff distributions

Dimova D Wawer M Wassermann AM Bajorath J 《Journal of chemical information and modeling》2011,51(2):258-266

An activity landscape model of a compound data set can be rationalized as a graphical representation that integrates molecular similarity and potency relationships. Activity landscape representations of different design are utilized to aid in the analysis of structure-activity relationships and the selection of informative compounds. Activity landscape models reported thus far focus on a single target (i.e., a single biological activity) or at most two targets, giving rise to selectivity landscapes. For compounds active against more than two targets, landscapes representing multitarget activities are difficult to conceptualize and have not yet been reported. Herein, we present a first activity landscape design that integrates compound potency relationships across multiple targets in a formally consistent manner. These multitarget activity landscapes are based on a general activity cliff classification scheme and are visualized in graph representations, where activity cliffs are represented as edges. Furthermore, the contributions of individual compounds to structure-activity relationship discontinuity across multiple targets are monitored. The methodology has been applied to derive multitarget activity landscapes for compound data sets active against different target families. The resulting landscapes identify single-, dual-, and triple-target activity cliffs and reveal the presence of hierarchical cliff distributions. From these multitarget activity landscapes, compounds forming complex activity cliffs can be readily selected.

7.

Systematic identification and classification of three-dimensional activity cliffs

Hu Y Furtmann N Gütschow M Bajorath J 《Journal of chemical information and modeling》2012,52(6):1490-1498

Activity cliffs were systematically extracted from public domain X-ray structures of targets for which complexes with multiple ligands were available, following the concept of three-dimensional (3D) cliffs. Binding modes of ligands with well-defined potency measurements were compared in a pairwise manner, and their 3D similarity was calculated using a previously reported property density function-based method taking conformational, positional, and chemical differences into account. Requiring the presence of at least 80% 3D similarity and a potency difference of at least 2 orders of magnitude as cliff criteria, a total of 216 well-defined 3D activity cliffs were detected in the Protein Data Bank (PDB). These 3D-cliffs involved a total of 269 ligands active against 38 different targets belonging to 17 protein families. For 255 of these compounds, binding modes were available at high crystallographic resolution. All 3D-cliffs were analyzed in detail and assigned to different categories on the basis of crystallographic interaction patterns. In many instances, differences in ligand-target interactions suggested plausible causes for origins of 3D-cliffs. In other cases, short-range interactions seen in X-ray structures were insufficient to deduce possible reasons for cliff formation. The 3D-cliffs described herein further advance the rationalization of activity cliffs at the level of ligand-target interactions and should also be useful for other applications such as the calibration of energy functions for structure-based design. The pool of identified activity cliffs is provided to enable subsequent structure-based analyses of cliffs.
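The two cliff criteria stated in this abstract (at least 80% 3D similarity and a potency difference of at least 2 orders of magnitude) translate directly into a pairwise filter. The sketch below assumes that potencies are supplied as negative-log values, so that 2 orders of magnitude equals 2.0 units, and that 3D similarities have already been computed by some external method. The function name `find_3d_cliffs` and the data layout are hypothetical.

```python
from itertools import combinations


def find_3d_cliffs(potencies, similarities, min_similarity=0.8, min_log_diff=2.0):
    """Return ligand pairs that satisfy both cliff criteria.

    potencies:    dict ligand -> negative-log potency (e.g. pKi); an assumption.
    similarities: dict frozenset({a, b}) -> precomputed 3D similarity in [0, 1].
    """
    cliffs = []
    for a, b in combinations(sorted(potencies), 2):
        sim = similarities.get(frozenset((a, b)))
        if sim is None:
            continue  # no binding-mode comparison available for this pair
        log_diff = abs(potencies[a] - potencies[b])
        if sim >= min_similarity and log_diff >= min_log_diff:
            cliffs.append((a, b))
    return cliffs
```

The similarity calculation itself (the property density function-based method mentioned above) is outside the scope of this sketch.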

8.

Towards the understanding of the activity of G9a inhibitors: an activity landscape and molecular modeling approach

López-López Edgar Rabal Obdulia Oyarzabal Julen Medina-Franco José L. 《Journal of computer-aided molecular design》2020,34(6):659-669

In this work, we analyze the structure–activity relationships (SAR) of epigenetic inhibitors (lysine mimetics) against lysine methyltransferase (G9a or EHMT2) using a combined activity landscape, molecular docking and molecular dynamics approach. The study was based on a set of 251 G9a inhibitors with reported experimental activity. The activity landscape analysis rapidly led to the identification of activity cliffs, scaffold hops and other active and inactive molecules with distinct SAR. Structure-based analysis of activity cliffs, scaffold hops and other selected active and inactive G9a inhibitors by means of docking followed by molecular dynamics simulations led to the identification of interactions with key residues involved in activity against G9a, for instance with ASP 1083, LEU 1086, ASP 1088, TYR 1154 and PHE 1158. The outcome of this work is expected to further advance the development of G9a inhibitors.


9.

Consensus models of activity landscapes with multiple chemical, conformer, and property representations

Yongye AB Byler K Santos R Martínez-Mayorga K Maggiora GM Medina-Franco JL 《Journal of chemical information and modeling》2011,51(6):1259-1270


10.

Interpretation of Ligand-Based Activity Cliff Prediction Models Using the Matched Molecular Pair Kernel

Shunsuke Tamura Swarit Jasial Tomoyuki Miyao Kimito Funatsu 《Molecules (Basel, Switzerland)》2021,26(16)

Activity cliffs (ACs) are formed by two structurally similar compounds with a large difference in potency. Accurate AC prediction is expected to help researchers’ decisions in the early stages of drug discovery. Previously, predictive models based on matched molecular pair (MMP) cliffs have been proposed. However, the proposed methods face a challenge of interpretability due to the black-box character of the predictive models. In this study, we developed interpretable MMP fingerprints and modified a model-specific interpretation approach for models based on a support vector machine (SVM) and MMP kernel. We compared important features highlighted by this SVM-based interpretation approach and the SHapley Additive exPlanations (SHAP) as a major model-independent approach. The model-specific approach could capture the difference between AC and non-AC, while SHAP assigned high weights to the features not present in the test instances. For specific MMPs, the feature weights mapped by the SVM-based interpretation method were in agreement with the previously confirmed binding knowledge from X-ray co-crystal structures, indicating that this method is able to interpret the AC prediction model in a chemically intuitive manner.

11.

Rationalizing the role of SAR tolerance for ligand-based virtual screening

Ripphausen P Nisius B Wawer M Bajorath J 《Journal of chemical information and modeling》2011,51(4):837-842

It is well appreciated that the results of ligand-based virtual screening (LBVS) are much influenced by methodological details, given the generally strong compound class dependence of LBVS methods. It is less well understood to what extent structure-activity relationship (SAR) characteristics might influence the outcome of LBVS. We have assessed the hypothesis that the success of prospective LBVS depends on the SAR tolerance of screening targets, in addition to methodological aspects. In this context, SAR tolerance is rationalized as the ability of a target protein to specifically interact with series of structurally diverse active compounds. In compound data sets, SAR tolerance articulates itself as SAR continuity, i.e., the presence of structurally diverse compounds having similar potency. In order to analyze the role of SAR tolerance for LBVS, activity landscape representations of compounds active against 16 different target proteins were generated for which successful LBVS applications were reported. In all instances, the activity landscapes of known active compounds contained multiple regions of local SAR continuity. When analyzing the location of newly identified LBVS hits and their SAR environments, we found that these hits almost exclusively mapped to regions of distinct local SAR continuity. Taken together, these findings indicate the presence of a close link between SAR tolerance at the target level, SAR continuity at the ligand level, and the probability of LBVS success.

12.

Assessing the confidence level of public domain compound activity data and the impact of alternative potency measurements on SAR analysis

Stumpfe D Bajorath J 《Journal of chemical information and modeling》2011,51(12):3131-3137

Publicly available compound activity data have been analyzed to distinguish between compounds for which single or multiple potency measurements were available and gain insight into data confidence levels. Different potency measurements with defined end points and alternative ways to represent multiple potency values for active compounds have been evaluated in the context of SAR analysis. Approximately 78% of all compounds with multiple potency measurements were found to represent high-confidence data, which corresponded to ~10% of all activity data. The use of different types of potency measurements and alternative representations of multiple potency values changed the SAR information content of compound data sets and resulted in different activity cliff distributions. Thus, the types of activity measurements that were available and how they were used substantially impacted SAR analysis. Compounds with multiple Ki measurements provided the most reliable basis for SAR exploration.

13.

Exploring uncharted territories: predicting activity cliffs in structure-activity landscapes

R Guha 《Journal of chemical information and modeling》2012,52(8):2181-2191


14.

Structure-activity landscape index: identifying and quantifying activity cliffs

Guha R Van Drie JH 《Journal of chemical information and modeling》2008,48(3):646-658

A new method for analyzing a structure-activity relationship is proposed. By use of a simple quantitative index, one can readily identify "structure-activity cliffs": pairs of molecules which are most similar but have the largest change in activity. We show how this provides a graphical representation of the entire SAR, in a way that allows the salient features of the SAR to be quickly grasped. In addition, the approach allows us to view the SARs in a data set at different levels of detail. The method is tested on two data sets that highlight its ability to easily extract SAR information. Finally, we demonstrate that this method is robust using a variety of computational control experiments and discuss possible applications of this technique to QSAR model evaluation.
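The abstract does not spell out the index. As commonly stated, the structure-activity landscape index for a pair of molecules i and j is SALI = |A_i - A_j| / (1 - sim(i, j)), where A is the activity and sim is a similarity in [0, 1]. The sketch below assumes that form. The `eps` guard for identical molecules (sim = 1) and the helper `rank_pairs` are additions made for this example and are not taken from the paper.

```python
def sali(act_i, act_j, sim_ij, eps=1e-9):
    """SALI for one pair: activity difference scaled by structural dissimilarity.

    eps avoids division by zero when sim_ij == 1.0 (an assumed guard).
    """
    return abs(act_i - act_j) / max(1.0 - sim_ij, eps)


def rank_pairs(activities, similarities):
    """Rank pairs by SALI, highest first.

    activities:   dict molecule -> activity (e.g. pIC50).
    similarities: dict (mol_a, mol_b) -> similarity in [0, 1].
    """
    scored = [
        (pair, sali(activities[pair[0]], activities[pair[1]], sim))
        for pair, sim in similarities.items()
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)
```

Pairs at the top of the ranking are the most cliff-like: highly similar molecules with a large change in activity.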

15.

Design of an activity landscape view taking compound-based feature probabilities into account

Bijun Zhang Martin Vogt Jürgen Bajorath 《Journal of computer-aided molecular design》2014,28(9):919-926

Activity landscapes (ALs) of compound data sets are rationalized as graphical representations that integrate similarity and potency relationships between active compounds. ALs enable the visualization of structure–activity relationship (SAR) information and are thus computational tools of interest for medicinal chemistry. For AL generation, similarity and potency relationships are typically evaluated in a pairwise manner and major AL features are assessed at the level of compound pairs. In this study, we add a conditional probability formalism to AL design that makes it possible to quantify the probability of individual compounds to contribute to characteristic AL features. Making this information graphically accessible in a molecular network-based AL representation is shown to further increase AL information content and helps to quickly focus on SAR-informative compound subsets. This feature probability-based AL variant extends the current spectrum of AL representations for medicinal chemistry applications.

16.

Neighborhood behavior of in silico structural spaces with respect to in vitro activity spaces-a novel understanding of the molecular similarity principle in the context of multiple receptor binding profiles

Horvath D Jeandenans C 《Journal of chemical information and computer sciences》2003,43(2):680-690

As a consequence of recent advances in the field of High Throughput Screening, the systematic testing ("in vitro profiling") of compounds against a panel of targets covering different therapeutic areas is nowadays used to generate relevant information with respect to the in vivo behavior of drug candidates. However, the development of chemoinformatics tools required for the exploitation of such data is still in an incipient phase. In this paper, a formalism for the analysis of activity profile vectors (describing the experimental responses of compounds in each of the considered activity tests) is introduced and applied to the study of Neighborhood Behavior (NB; the hypothesis that structurally similar compounds display similar biological properties) of molecular similarity metrics. The experimental activity profiles define an Activity Space in which more than 500 drugs and reference compounds are positioned, their coordinates being inhibitory propensities in the included tests and unambiguously characterizing a molecule in terms of its receptor binding properties. While previous studies of Neighborhood Behavior had to rely on a loose classification of compounds in terms of the therapeutic areas they were designed for, here the NB of a calculated "in silico" similarity metric has been redefined as a relationship between intermolecular dissimilarity scores in the "structural" and "activity" spaces, respectively, and expressed in terms of two quantitative criteria: "consistency" (the propensity of the metric to selectively rank activity-related compound pairs among the structurally most similar pairs) and "completeness" (monitoring the retrieval rate of activity-related compound pairs among the best ranked pairs of structural neighbors). These criteria were used to calibrate and validate a similarity metric based on Fuzzy Bipolar Pharmacophore Fingerprints.

17.

SAR monitoring of evolving compound data sets using activity landscapes

Iyer P Hu Y Bajorath J 《Journal of chemical information and modeling》2011,51(3):532-540

In pharmaceutical research, collections of active compounds directed against specific therapeutic targets usually evolve over time. Small molecule discovery is an iterative process. New compounds are discovered, alternative compound series explored, some series discontinued, and others prioritized. The design of new compounds usually takes into consideration prior chemical and structure-activity relationship (SAR) knowledge. Hence, historically grown compound collections represent a viable source of chemical and SAR information that might be utilized to retrospectively analyze roadblocks in compound optimization and further guide discovery projects. However, SAR analysis of large and heterogeneous sets of active compounds is complicated in principle. We have subjected evolving compound data sets to SAR monitoring using activity landscape models in order to evaluate how composition and SAR characteristics might change over time. Chemotype and potency distributions in evolving data sets directed against different therapeutic targets were analyzed and alternative activity landscape representations generated at different points in time to monitor the progression of global and local SAR features. Our results show that the evolving data sets studied here have predominantly grown around seed clusters of active compounds that often emerged early on, while other SAR islands remained largely unexplored. Moreover, increasing scaffold diversity in evolving data sets did not necessarily yield new SAR patterns, indicating a rather significant influence of "me-too-ism" (i.e., introducing new chemotypes that are similar to already known ones) on the composition and SAR information content of the data sets.

18.

Capturing structure-activity relationships from chemogenomic spaces

Wendt B Uhrig U Bös F 《Journal of chemical information and modeling》2011,51(4):843-851

Modeling off-target effects is one major goal of chemical biology, particularly in its applications to drug discovery. Here, we describe a new approach that allows the extraction of structure-activity relationships from large chemogenomic spaces starting from a single chemical structure. Several public source databases, offering a vast amount of data on structure and activity for a large number of different targets, have been investigated for their usefulness in automated structure-activity relationships (SAR) extraction. SAR tables were constructed by assembling similar structures around each query structure that have an activity record for a particular target. Quantitative series enrichment analysis (QSEA) was applied to these SAR tables to identify trends and to transform these trends into topomer CoMFA models. Overall more than 1700 SAR tables with topomer CoMFA models have been obtained from the ChEMBL, PubChem, and ChemBank databases. These models were able to highlight the structural trends associated with various off-target effects of marketed drugs, including cases where other structural similarity metrics would not have detected an off-target effect. These results indicate the usefulness of the QSEA approach, particularly whenever applicable with public databases, in providing a new means, beyond a simple similarity between ligand structures, to capture SAR trends and thereby contribute to success in drug discovery.

19.

Adapting the DeepSARM approach for dual-target ligand design

Yoshimori Atsushi Hu Huabin Bajorath Jürgen 《Journal of computer-aided molecular design》2021,35(5):587-600

The structure–activity relationship (SAR) matrix (SARM) methodology and data structure was originally developed to extract structurally related compound series from data sets of any composition, organize these series in matrices reminiscent of R-group tables, and visualize SAR patterns. The SARM approach combines the identification of structural relationships between series of active compounds with analog design, which is facilitated by systematically exploring combinations of core structures and substituents that have not been synthesized. The SARM methodology was extended through the introduction of DeepSARM, which added deep learning and generative modeling to target-based analog design by taking compound information from related targets into account to further increase structural novelty. Herein, we present the foundations of the SARM methodology and discuss how DeepSARM modeling can be adapted for the design of compounds with dual-target activity. Generating dual-target compounds represents an equally attractive and challenging task for polypharmacology-oriented drug discovery. The DeepSARM-based approach is illustrated using a computational proof-of-concept application focusing on the design of candidate inhibitors for two prominent anti-cancer targets.


20.

Models of steroid binding based on the minimum deviation of structurally assigned 13C NMR spectra analysis (MiDSASA)

Beger RD Harris S Xie Q 《Journal of chemical information and computer sciences》2004,44(4):1489-1496

This paper develops a quantitative k-nearest neighbors modeling technique. The technique is used to demonstrate that a compound's biological binding activity to a receptor can be calculated from the minimum of the square root of the sum of squared deviations (SSSD) of a structurally assigned chemical shift on a template between the unknown compound to be predicted and a set of known compounds with known activities. When building models of biological activity, nonlinear relationships are built into the input training data. If a model is developed by selecting only compounds with minimum structurally assigned chemical shift deviations from the unknown compound, some of the nonlinear relationships can be removed. The smaller the total chemical shift deviation between a compound with known activity and another compound with unknown activity, the more likely it will have similar biological, chemical, and physical properties. This means that a model can be produced without rigorous statistics or neural networks. This technique is similar to structure-activity relationship (SAR) modeling, but instead of relying on substructure fragments to produce a model, this new model is based on minimum chemical shift differences on those substructure fragments. We refer to this method as minimum deviation of structurally assigned spectra analysis (MiDSASA) modeling. Modeling by the minimum deviation concept can be applied to other chemoinformatic data analyses such as metabolite concentrations in metabolic pathways for metabolomics research. A MiDSASA template model for 30 steroids binding the corticosterone binding globulin based on the activity factors of the two nearest compounds had a correlation of 0.88. A MiDSASA template model for 50 steroids binding the aromatase enzyme based on the average activity of the four nearest compounds had a correlation of 0.71.
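The nearest-neighbor idea in this abstract can be illustrated with a short sketch. It assumes that each compound is described by a fixed-length list of 13C chemical shifts assigned to the same template positions, and that the prediction is the plain average activity of the k nearest training compounds. The averaging rule matches the abstract's description of the aromatase model, but the "activity factors" used in the other model are not specified and are not reproduced here. The names `sssd` and `predict_activity` are hypothetical.

```python
import math


def sssd(shifts_a, shifts_b):
    """Square root of the sum of squared chemical shift deviations.

    Both inputs must list shifts for the same template positions, in order.
    """
    if len(shifts_a) != len(shifts_b):
        raise ValueError("shift vectors must cover the same template positions")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(shifts_a, shifts_b)))


def predict_activity(query_shifts, training, k=2):
    """Average the activities of the k training compounds closest to the query.

    training: list of (shifts, activity) tuples for compounds of known activity.
    """
    nearest = sorted(training, key=lambda item: sssd(query_shifts, item[0]))[:k]
    return sum(activity for _, activity in nearest) / len(nearest)
```

With k=2 or k=4 this mirrors the neighborhood sizes mentioned in the abstract. The shift values used in any test of this sketch would be illustrative only.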