Similar literature
1.
The extraction of SAR information from structurally diverse compound data sets is a challenging task. One of the focal points of systematic SAR analysis is the search for activity cliffs, that is, structurally similar compounds having large potency differences, from which SAR determinants can be deduced. The assessment of SAR information is usually based on pairwise similarity and potency comparisons of data set compounds. As a consequence, activity cliffs are mostly evaluated at a compound pair level. Here, we present an extension of the activity cliff concept by introducing "activity ridges" that are formed by overlapping "combinatorial" activity cliffs between participating compounds, giving rise to ridge-like structures in activity landscapes. Activity ridges are rich in SAR information. In a systematic analysis of 242 compound data sets, we have identified well-defined activity ridges in 71 different sets. In addition, an information-theoretic approach has been devised to characterize the structural composition of activity ridges. Taken together, our results show that activity ridges frequently occur in sets of active compounds and that different categories of ridges can be distinguished on the basis of their structural content. The computational identification of activity ridges provides access to compound subsets having high priority for SAR analysis.

2.
Increasingly, chemical libraries are being produced which are focused on a biological target or group of related targets, rather than simply being constructed in a combinatorial fashion. A screening collection compiled from such libraries will contain multiple analogues of a number of discrete series of compounds. The question arises as to how many analogues are necessary to represent each series in order to ensure that an active series will be identified. Based on a simple probabilistic argument and supported by in-house screening data, guidelines are given for the number of compounds necessary to achieve a "hit", or series of hits, at various levels of certainty. Obtaining more than one hit from the same series is useful since this gives early acquisition of SAR (structure-activity relationship) and confirms a hit is not a singleton. We show that screening collections composed of only small numbers of analogues of each series are sub-optimal for SAR acquisition. Based on these studies, we recommend a minimum series size of about 200 compounds. This gives a high probability of confirmatory SAR (i.e. at least two hits from the same series). More substantial early SAR (at least 5 hits from the same series) can be gained by using series of about 650 compounds each. With this level of information being generated, more accurate assessment of the likely success of the series in hit-to-lead and later stage development becomes possible.
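
The abstract does not spell out its probabilistic argument, so the following is only a minimal sketch of one plausible reading: each analogue in an active series is treated as an independent trial with the same hit probability, so the number of hits follows a binomial distribution. The independence assumption, the function names, and any hit rate passed in are illustrative assumptions, not values from the paper.

```python
from math import comb


def prob_at_least(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p): chance of at least k hits in a series of n compounds.

    p is a hypothetical per-compound hit probability (not given in the abstract).
    """
    # Sum the probabilities of 0 .. k-1 hits and take the complement.
    below = sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k))
    return 1.0 - below


def min_series_size(k, p, certainty, n_max=5000):
    """Smallest series size n (up to n_max) giving at least k hits with the requested certainty."""
    for n in range(k, n_max + 1):
        if prob_at_least(k, n, p) >= certainty:
            return n
    return None  # not reachable within n_max under this model


if __name__ == "__main__":
    # Hypothetical 1% hit rate within an active series.
    for k in (1, 2, 5):
        print(k, min_series_size(k, p=0.01, certainty=0.9))
```

Under this model the required series size grows quickly with the number of hits demanded, which is consistent in direction with the abstract's recommendation of larger series for more substantial early SAR.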

3.
Ligand-based virtual screening (LBVS) and structure-based virtual screening (SBVS) approaches were used to identify new inhibitors of the ATAD2 bromodomain. The LBVS approach was used to search 23,129,083 clean compounds for compounds similar to an active compound with a reported pIC50 of 7.2. Based on the LBVS results, 19 compounds were selected. For SBVS, nine filters were applied to the 23,129,083 clean compounds, leaving 1,057,060 compounds. After SBVS was performed on these compounds with the idock software, the 16 compounds with the lowest binding energies were selected. A more accurate molecular docking analysis was then performed on the 35 selected compounds using the iGEMDOCK software, and the six with the lowest binding energies were selected as hit compounds. These compounds were zinc36647229, zinc77969074, zinc13637358, zinc77971540, zinc12991296 and zinc19374204.

4.
Integration of flexible data-analysis tools with cheminformatics methods is a prerequisite for successful identification and validation of “hits” in high-throughput screening (HTS) campaigns. We have designed, developed, and implemented a suite of robust yet flexible cheminformatics tools to support HTS activities at the Broad Institute, three of which are described herein. The “hit-calling” tool allows a researcher to set a hit threshold that can be varied during downstream analysis. The results from the hit-calling exercise are reported to a database for record keeping and further data analysis. The “cherry-picking” tool enables creation of an optimized list of hits for confirmatory and follow-up assays from an HTS hit list. This tool allows filtering by computed chemical property and by substructure. In addition, similarity searches can be performed on hits of interest and sets of related compounds can be selected. The third tool, an “S/SAR viewer,” has been designed specifically for the Broad Institute’s diversity-oriented synthesis (DOS) collection. The compounds in this collection are rich in chiral centers, and the full complement of all possible stereoisomers of a given compound is present in the collection. The S/SAR viewer allows rapid identification of both structure/activity relationships and stereo-structure/activity relationships present in HTS data from the DOS collection. Together, these tools enable the prioritization and analysis of hits from diverse compound collections, and enable informed decisions for follow-up biology and chemistry efforts.

5.
The analysis of structure–activity relationships (SARs) becomes rather challenging when large and heterogeneous compound data sets are studied. In such cases, many different compounds and their activities need to be compared, which quickly goes beyond the capacity of subjective assessments. For a comprehensive large-scale exploration of SARs, computational analysis and visualization methods are required. Herein, we introduce a two-layered SAR visualization scheme specifically designed for increasingly large compound data sets. The approach combines a new compound pair-based variant of generative topographic mapping (GTM), a machine learning approach for nonlinear mapping, with chemical space networks (CSNs). The GTM component provides a global view of the activity landscapes of large compound data sets, in which informative local SAR environments are identified, augmented by a numerical SAR scoring scheme. Prioritized local SAR regions are then projected into CSNs that resolve these regions at the level of individual compounds and their relationships. Analysis of CSNs makes it possible to distinguish between regions having different SAR characteristics and select compound subsets that are rich in SAR information.

6.
Similarity searching using molecular fingerprints is a widely used approach for the identification of novel hits. A fingerprint search involves many pairwise comparisons of bit string representations of known active molecules with those precomputed for database compounds. Bit string overlap, as evaluated by various similarity metrics, is used as a measure of molecular similarity. Results of a number of studies focusing on fingerprints suggest that it is difficult, if not impossible, to develop generally applicable search parameters and strategies, irrespective of the compound classes under investigation. Rather, more or less, each individual search problem requires an adjustment of calculation conditions. Thus, there is a need for diagnostic tools to analyze fingerprint-based similarity searching. We report an analysis of fingerprint search calculations on different sets of structurally diverse active compounds. Calculations on five biological activity classes were carried out with two fingerprints in two compound source databases, and the results were analyzed in histograms. Tanimoto coefficient (Tc) value ranges where active compounds were detected were compared to the distribution of Tc values in the database. The analysis revealed that compound class-specific effects strongly influenced the outcome of these fingerprint calculations. Among the five diverse compound sets studied, very different search results were obtained. The analysis described here can be applied to determine Tc intervals where scaffold hopping occurs. It can also be used to benchmark fingerprint calculations or estimate their probability of success.
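
As a small illustration of the bit string comparison described above, the sketch below computes the Tanimoto coefficient for fingerprints stored as sets of "on" bit positions and bins the resulting Tc values into a histogram. The set-based representation, the function names, and the bin count are assumptions made for this example; the abstract does not say which fingerprints or binning the authors used.

```python
def tanimoto(a, b):
    """Tanimoto coefficient of two fingerprints given as sets of 'on' bit positions."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    shared = len(a & b)
    # Tc = c / (a + b - c), with c the number of bits set in both fingerprints.
    return shared / (len(a) + len(b) - shared)


def tc_histogram(query, database, bins=10):
    """Count database fingerprints per Tc interval relative to one query fingerprint."""
    counts = [0] * bins
    for fp in database:
        tc = tanimoto(query, fp)
        # Tc = 1.0 falls into the last bin rather than overflowing.
        idx = min(int(tc * bins), bins - 1)
        counts[idx] += 1
    return counts


if __name__ == "__main__":
    query = {1, 2, 3}
    database = [{2, 3, 4}, {1, 2, 3}, {7, 8}]
    print(tanimoto(query, database[0]))
    print(tc_histogram(query, database, bins=5))
```

Comparing such a histogram for known actives with the one for the whole database is one simple way to see which Tc intervals actually contain active compounds.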

7.
From a medicinal chemistry point of view, one of the primary goals of high throughput screening (HTS) hit list assessment is the identification of chemotypes with an informative structure-activity relationship (SAR). Such chemotypes may enable optimization of the primary potency, as well as selectivity and pharmacokinetic properties. A common way to prioritize them is molecular clustering of the hits. Typical clustering techniques, however, rely on a general notion of chemical similarity or standard rules of scaffold decomposition and are thus insensitive to molecular features that are enriched in biologically active compounds. This hinders SAR analysis, because compounds sharing the same pharmacophore might not end up in the same cluster and thus are not directly compared to each other by the medicinal chemist. Similarly, common chemotypes that are not related to activity may contaminate clusters, distracting from important chemical motifs. We combined molecular similarity and Bayesian models and introduce (I) a robust, activity-aware clustering approach and (II) a feature mapping method for the elucidation of distinct SAR determinants in polypharmacologic compounds. We evaluated the method on 462 dose-response assays from the PubChem BioAssay repository. Activity-aware clustering grouped compounds sharing molecular cores that were specific for the target or pathway at hand, rather than grouping inactive scaffolds commonly found in compound series. We also found many of these core structures in literature that discussed SARs of the respective targets. A numerical comparison of cores allowed for identification of the structural prerequisites for polypharmacology, i.e., distinct bioactive regions within a single compound, and pointed toward selectivity-conferring medchem strategies. The method presented here is generally applicable to any type of activity data and may help bridge the gap between hit list assessment and designing a medchem strategy.

8.
9.
The main goal of high-throughput screening (HTS) is to identify active chemical series rather than just individual active compounds. In light of this goal, a new method (called compound set enrichment) to identify active chemical series from primary screening data is proposed. The method employs the scaffold tree compound classification in conjunction with the Kolmogorov-Smirnov statistic to assess the overall activity of a compound scaffold. The application of this method to seven PubChem data sets (containing between 9389 and 263679 molecules) is presented, and the ability of this method to identify compound classes with only weakly active compounds (potentially latent hits) is demonstrated. The analysis presented here shows how methods based on an activity cutoff can distort activity information, leading to the incorrect activity assignment of compound series. These results suggest that this method might have utility in the rational selection of active classes of compounds (and not just individual active compounds) for follow-up and validation.
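
To make the role of the Kolmogorov-Smirnov statistic concrete, here is a minimal sketch of the two-sample KS statistic applied to the activity values of one scaffold's compounds versus a background set. Whether the authors use a two-sample, one-sided, or otherwise modified variant is not stated in the abstract, so the two-sided two-sample form, the function name, and the example values are all assumptions.

```python
def ks_statistic(sample, background):
    """Two-sample Kolmogorov-Smirnov statistic: the largest gap between two empirical CDFs."""
    if not sample or not background:
        raise ValueError("both inputs must be non-empty")
    xs, ys = sorted(sample), sorted(background)
    n, m = len(xs), len(ys)
    i = j = 0
    d = 0.0
    while i < n and j < m:
        v = min(xs[i], ys[j])
        # Advance both pointers past every value <= v so ties are handled together.
        while i < n and xs[i] <= v:
            i += 1
        while j < m and ys[j] <= v:
            j += 1
        d = max(d, abs(i / n - j / m))
    return d


if __name__ == "__main__":
    # Hypothetical activity readouts: one scaffold's compounds versus the rest of the screen.
    scaffold = [55.0, 60.0, 62.0, 70.0]
    background = [5.0, 10.0, 12.0, 20.0, 58.0, 15.0]
    print(ks_statistic(scaffold, background))
```

Because the statistic compares whole distributions rather than counting compounds above a cutoff, a scaffold whose members are all only moderately active can still stand out, which matches the abstract's point about weakly active classes.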

10.
Herein, we report the discovery of the first potent and selective inhibitor of TRPV6, a calcium channel overexpressed in breast and prostate cancer, and its use to test the effect of blocking TRPV6‐mediated Ca2+‐influx on cell growth. The inhibitor was discovered through a computational method, xLOS, a 3D‐shape and pharmacophore similarity algorithm, a type of ligand‐based virtual screening (LBVS) method described briefly here. Starting with a single weakly active seed molecule, two successive rounds of LBVS followed by optimization by chemical synthesis led to a selective molecule with 0.3 μM inhibition of TRPV6. The ability of xLOS to identify different scaffolds early in LBVS was essential to success. The xLOS method may be generally useful to develop tool compounds for poorly characterized targets.

11.
Dual and triple activity-difference (DAD/TAD) maps are tools for the systematic characterization of structure-activity relationships (SAR) of compound data sets screened against two or three targets. DAD and TAD maps are two- and three-dimensional representations of the pairwise activity differences of compound data sets, respectively. Adding pairwise structural similarity information into these maps readily reveals activity cliff regions in the SAR for one, two, or three targets. In addition, pairs of compounds in the smooth regions of the SAR and scaffold hops are also easily identified in these maps. Herein, DAD and TAD maps are employed for the systematic characterization of the SAR of a benchmark set of 299 compounds screened against dopamine, norepinephrine, and serotonin transporters. To reduce the well-known dependence of the activity landscape on the structural representation, five selected 2D and 3D structure representations were used to characterize the SAR. Systematic analysis of the DAD and TAD maps reveals regions in the landscape with similar SAR for two or all three targets as well as regions with inverse SAR, i.e., changes in structure that increase activity for one target but decrease activity for the other target. Focusing the analysis on pairs of compounds with high structure similarity revealed the presence of single-, dual-, and triple-target activity cliffs, i.e., small changes in structure with large changes in potency for one, two, or all three targets, respectively. Triple-target scaffold hops are also discussed. Activity cliffs and scaffold hops were also quantified and represented using two recently proposed approaches, namely, mean Structure Activity Landscape Index (mean SALI) and Consensus Structure-Activity Similarity (SAS) maps.
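
The pairwise quantities behind these maps can be sketched briefly. The example below computes signed activity differences for every compound pair against two targets (the coordinates of a point in a DAD-style map) and the Structure Activity Landscape Index, SALI = |A_i - A_j| / (1 - sim_ij), for a single pair. The data layout, the function names, the sign convention for the differences, and the example values are assumptions for illustration; the abstract's "mean SALI" over several structure representations is not reproduced here.

```python
from itertools import combinations


def sali(act_i, act_j, sim):
    """Structure Activity Landscape Index for one compound pair.

    act_i, act_j: potencies on a log scale (e.g. pKi); sim: structural similarity in [0, 1].
    """
    if sim >= 1.0:
        return float("inf")  # identical representations: the index is unbounded
    return abs(act_i - act_j) / (1.0 - sim)


def dad_points(activities):
    """Signed pairwise activity differences for two targets.

    activities maps a compound name to (potency_target1, potency_target2).
    Returns tuples (name_i, name_j, delta_target1, delta_target2).
    """
    points = []
    for (ni, ai), (nj, aj) in combinations(activities.items(), 2):
        points.append((ni, nj, ai[0] - aj[0], ai[1] - aj[1]))
    return points


if __name__ == "__main__":
    # Hypothetical potencies against two targets.
    data = {"cpd_a": (7.0, 5.0), "cpd_b": (6.0, 6.5), "cpd_c": (8.5, 5.5)}
    for point in dad_points(data):
        print(point)
    print(sali(8.0, 6.0, 0.5))
```

A pair whose two differences have opposite signs lies in an inverse-SAR region of the map, while a pair with high similarity and a large SALI value is a candidate activity cliff.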

12.
13.
Canonical transient receptor potential-5 (TRPC5), which belongs to the canonical subfamily of transient receptor potential (TRP) channels, is a non-selective cation channel that is mainly expressed in the central nervous system and shows more restricted expression in the periphery. TRPC5 plays a crucial role in human physiology and pathology, for instance, in anxiety, depression, epilepsy, pain, memory and chronic kidney disease (CKD). However, owing to the lack of effective and selective inhibitors, its physiological and pathological mechanisms remain unknown. It is therefore pivotal to identify potential TRPC5 inhibitors. We have applied ligand-based virtual screening (LBVS) and structure-based virtual screening (SBVS) methods. Pharmacophore models of TRPC5 antagonists, generated using the HypoGen and HipHop algorithms, were used as query models to screen the Specs database for potential inhibitors. The resultant hits from LBVS were further screened by SBVS. SBVS was based on homology modeling of human TRPC5, binding site identification, molecular dynamics optimization and molecular docking studies. With these systematic screening approaches, we identified 7 hit compounds with comparable dock scores after applying Lipinski and Veber rules, ADMET, PAINS analysis, cluster analysis, and similarity analysis. In conclusion, the current research provides novel backbones for a new generation of TRPC5 inhibitors.

14.
15.
Activity cliffs are formed by pairs or groups of structurally similar compounds with significant differences in potency. They represent a prominent feature of activity landscapes of compound data sets and a primary source of structure–activity relationship (SAR) information. Thus far, activity cliffs have only been considered for active compounds, consistent with the principles of the activity landscape concept. However, from an SAR perspective, pairs formed by structurally similar active and inactive compounds should often also be informative. Therefore, we have extended the activity cliff concept to also take inactive compounds into consideration. As a source of both confirmed active and inactive compounds, we have exclusively focused on PubChem confirmatory bioassays. Activity cliffs formed between pairs of active compounds (homogeneous pairs) and pairs of active and inactive compounds (heterogeneous pairs) were systematically analyzed on a per-assay basis, hence ensuring the currently highest possible degree of experimental consistency in activity measurement. Only very small numbers of large-magnitude activity cliffs formed between active compounds were detected in PubChem bioassays. However, when taking confirmed inactive compounds from confirmatory assays into account, the activity cliff frequency in assay data significantly increased, involving 11–15 % of all qualifying pairs of similar compounds, depending on the molecular representations that were used. Hence, these non-conventional activity cliffs provide an additional source of SAR information.

16.
Identification of novel compound classes for a drug target is a challenging task for cheminformatics and drug design when considerable research has already been undertaken and many potent lead structures have been identified, which leaves limited unclaimed chemical space for innovation. We validated and successfully applied different state-of-the-art techniques for virtual screening (Bayesian machine learning, automated molecular docking, pharmacophore search, pharmacophore QSAR and shape analysis) of 4.6 million unique and readily available chemical structures to identify promising new and competitive antagonists of the strychnine-insensitive Glycine binding site (GlycineB site) of the NMDA receptor. The novelty of the identified virtual hits was assessed by scaffold analysis, putting a strong emphasis on novelty detection. The resulting hits were tested in vitro and several novel, active compounds were identified. While the majority of the computational methods tested were able to partially discriminate actives from structurally similar decoy molecules, the methods differed substantially in their prospective applicability in terms of novelty detection. The results demonstrate that although there is no single best computational method, it is most worthwhile to follow this concept of focused compound library design and screening, as new bioactive compounds can still be found that possess hitherto unexplored scaffolds and interesting variations of known chemotypes.

17.
Influenza virus endonuclease is an attractive target for antiviral therapy in the treatment of influenza infection. The purpose of this study is to design a novel antiviral agent with improved biological activities against the influenza virus endonuclease. In this study, chemical feature‐based 3D pharmacophore models were developed from 41 known influenza virus endonuclease inhibitors. The best quantitative pharmacophore model (Hypo 1), which consists of two hydrogen‐bond acceptors and two hydrophobic features, yields the highest correlation coefficient (R = 0.886). Hypo 1 was further validated by the cross validation method and the test set compounds. The application of this model for predicting the activities of 11 known influenza virus endonuclease inhibitors in the test set shows great success. The correlation coefficient of 0.942 and a cross validation at the 95% confidence level indicate that this model is reliable in identifying structurally diverse compounds for influenza virus endonuclease inhibition. The most active compound (compound 1) from the training set was docked into the active site of the influenza virus endonuclease as an additional verification that the pharmacophore model is accurate. The docked conformation showed important hydrogen bond interactions between the compound and two amino acids, Lys 134 and Lys 137. After validation, this model was used to screen the NCI chemical database to identify new influenza virus endonuclease inhibitors. Our study shows that the top-ranking compound out of the 10 newly identified compounds using fit value ranking has an estimated activity of 0.049 μM. These newly identified lead compounds can be further experimentally validated using in vitro techniques.

18.
Chemical libraries contain thousands of compounds that need screening, which increases the need for computational methods that can rank or prioritize compounds. Virtual screening tools are widely used to improve the cost effectiveness of lead discovery programs by ranking chemical compound databases in decreasing probability of biological activity, based on the probability ranking principle (PRP). In this paper, we developed a novel ranking approach for molecular compounds inspired by quantum mechanics, called the quantum probability ranking principle (QPRP). The QPRP ranking criteria draw an analogy between the physical experiment and the molecular structure ranking process for 2D fingerprints in ligand-based virtual screening (LBVS). The development of the QPRP criteria in LBVS employs quantum concepts at three levels. First, at the representation level, the model develops a new framework for molecular representation by connecting molecular compounds with a mathematical quantum space. Second, the similarity between chemical libraries and reference compounds is estimated with a quantum-based similarity searching method. Finally, the molecules are ranked using the QPRP approach. Simulated virtual screening experiments with MDL Drug Data Report (MDDR) data sets showed that QPRP outperformed the classical ranking principle (PRP) for molecular chemical compounds.

19.
In pharmaceutical research, collections of active compounds directed against specific therapeutic targets usually evolve over time. Small molecule discovery is an iterative process. New compounds are discovered, alternative compound series explored, some series discontinued, and others prioritized. The design of new compounds usually takes into consideration prior chemical and structure-activity relationship (SAR) knowledge. Hence, historically grown compound collections represent a viable source of chemical and SAR information that might be utilized to retrospectively analyze roadblocks in compound optimization and further guide discovery projects. However, SAR analysis of large and heterogeneous sets of active compounds is also complicated in principle. We have subjected evolving compound data sets to SAR monitoring using activity landscape models in order to evaluate how composition and SAR characteristics might change over time. Chemotype and potency distributions in evolving data sets directed against different therapeutic targets were analyzed and alternative activity landscape representations generated at different points in time to monitor the progression of global and local SAR features. Our results show that the evolving data sets studied here have predominantly grown around seed clusters of active compounds that often emerged early on, while other SAR islands remained largely unexplored. Moreover, increasing scaffold diversity in evolving data sets did not necessarily yield new SAR patterns, indicating a rather significant influence of "me-too-ism" (i.e., introducing new chemotypes that are similar to already known ones) on the composition and SAR information content of the data sets.

20.
Ideally, a team of biologists, medicinal chemists and information specialists will evaluate the hits from high throughput screening. In practice, it often falls to nonmedicinal chemists to make the initial evaluation of HTS hits. Chemical genetics and high content screening both rely on screening in cells or animals where the biological target may not be known. There is a need to place active compounds into a context to suggest potential biological mechanisms. Our idea is to build an operating environment to help the biologist make the initial evaluation of HTS data. To this end the operating environment provides viewing of compound structure files, computation of basic biologically relevant chemical properties and searching against biologically annotated chemical structure databases. The benefit is to help the nonmedicinal chemist, biologist and statistician put compounds into a potentially informative biological context. Although there are several similar public and private programs used in the pharmaceutical industry to help evaluate hits, these programs are often built for computational chemists. Our program is designed for use by biologists and statisticians.
