Similar literature
 20 similar documents found (search time: 15 ms)
1.
String comparison techniques were developed and applied for measuring the molecular similarity of chemical structures. The molecular structures were encoded as a sequence of numbers representing counts of paths of different lengths. The similarity index between two compounds was calculated as the difference between the gains of information derived through comparison of the corresponding molecular path sequences. Ranks between the structures of the studied database obtained according to this similarity were used as basic data for deriving correspondences between the elements of the set of compounds. The method was applied to a group of 41 barbiturates. Correlation equations were calculated for different groups of compounds grouped according to the displayed similarity. The correlation equations and the corresponding statistics were obtained using standard computer programs. A special algorithm for computing the similarity index and the correlation matrix (outlined very briefly) was developed and implemented on a VAX 11/750.
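The path-count encoding mentioned in this abstract can be illustrated with a minimal sketch. It assumes a hydrogen-suppressed molecular graph given as an adjacency dictionary and counts simple paths (no atom visited twice) of each length; the function name `path_counts` and the input format are hypothetical, and the information-theoretic similarity index itself is not reproduced here because the abstract does not specify it.

```python
def path_counts(adjacency, max_length):
    """Count simple paths of length 1..max_length (in bonds) in an undirected graph.

    adjacency: dict mapping each atom index to a list of bonded atom indices
    (hypothetical input format, hydrogen-suppressed graph assumed).
    """
    directed = [0] * (max_length + 1)

    def extend(atom, visited, length):
        # Stop once the path has reached the longest length of interest.
        if length == max_length:
            return
        for neighbour in adjacency[atom]:
            if neighbour not in visited:
                directed[length + 1] += 1
                extend(neighbour, visited | {neighbour}, length + 1)

    for start in adjacency:
        extend(start, {start}, 0)

    # Every undirected path was walked once from each end, so halve the totals.
    return [count // 2 for count in directed[1:]]


# Isobutane skeleton: central carbon 0 bonded to carbons 1, 2 and 3.
isobutane = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}
sequence = path_counts(isobutane, 3)
```

The resulting list is the kind of numeric sequence that two compounds could then be compared on.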

2.
A new method, using a combination of 4D-molecular similarity measures and cluster analysis to construct optimum QSAR models, is applied to a data set of 150 chemically diverse compounds to build optimum blood-brain barrier (BBB) penetration models. The complete data set is divided into subsets based on 4D-molecular similarity measures using cluster analysis. The compounds in each cluster subset are further divided into a training set and a test set. Predictive QSAR models are constructed for each cluster subset using the corresponding training sets. These QSAR models best predict test set compounds which are assigned to the same cluster subset, based on the 4D-molecular similarity measures, from which the models are derived. The results suggest that the specific properties governing blood-brain barrier permeability may vary across chemically diverse compounds. Partitioning compounds into chemically similar classes is essential to constructing predictive blood-brain barrier penetration models embedding the corresponding key physicochemical properties of a given chemical class.

3.
The development of new antibacterial drugs has become one of the most important tasks of the century in order to overcome the threat posed by drug resistance in pathogenic bacteria. Many antibiotics originate from natural products produced by various microorganisms. Over the last decades, bioinformatic approaches have facilitated the discovery and characterization of these small compounds using genome mining methodologies. A key part of this process is the identification of the most promising biosynthetic gene clusters (BGCs), which encode novel natural products. In 2017, the Antibiotic Resistant Target Seeker (ARTS) was developed in order to enable an automated target-directed genome mining approach. ARTS identifies possible resistant target genes within antibiotic gene clusters, in order to detect promising BGCs encoding antibiotics with novel modes of action. Although ARTS can predict promising targets based on multiple criteria, it provides little information about the cluster structures of possible resistant genes. Here, we present SYN-view. Based on a phylogenetic approach, SYN-view allows for easy comparison of gene clusters of interest and for distinguishing genes with regular housekeeping functions from genes functioning as antibiotic resistant targets. Our aim is to implement our proposed method into the ARTS web-server, further improving the target-directed genome mining strategy of the ARTS pipeline.

4.
5.
In this study we evaluate how far the scope of similarity searching can be extended to identify not only ligands binding to the same target as the reference ligand(s) but also ligands of other homologous targets without initially known ligands. This "homology-based similarity searching" requires molecular representations reflecting the ability of a molecule to interact with target proteins. The Similog keys, which are introduced here as a new molecular representation, were designed to fulfill such requirements. They are based only on the molecular constitution and are counts of atom triplets. Each triplet is characterized by the graph distances and the types of its atoms. The atom-typing scheme classifies each atom by its function as H-bond donor or acceptor and by its electronegativity and bulkiness. In this study the Similog keys are investigated in retrospective in silico screening experiments and compared with other conformation independent molecular representations. Studied were molecules of the MDDR database for which the activity data was augmented by standardized target classification information from public protein classification databases. The MDDR molecule set was split randomly into two halves. The first half formed the candidate set. Ligands of four targets (dopamine D2 receptor, opioid delta-receptor, factor Xa serine protease, and progesterone receptor) were taken from the second half to form the respective reference sets. Different similarity calculation methods are used to rank the molecules of the candidate set by their similarity to each of the four reference sets. The accumulated counts of molecules binding to the reference target and groups of targets with decreasing homology to it were examined as a function of the similarity rank for each reference set and similarity method. 
In summary, similarity searching based on Unity 2D-fingerprints or Similog keys is found to be equally effective in the identification of molecules binding to the same target as the reference set. However, the application of the Similog keys is more effective in comparison with the other investigated methods in the identification of ligands binding to any target belonging to the same family as the reference target. We attribute this superiority to the fact that the Similog keys provide a generalization of the chemical elements and that the keys are counted instead of merely noting their presence or absence in a binary form. The second most effective molecular representation is the occurrence counts of the public ISIS key fragments, which, like the Similog method, incorporate key counting as well as a generalization of the chemical elements. The results obtained suggest that ligands for a new target can be identified by the following three-step procedure: 1. Select at least one target with known ligands which is homologous to the new target. 2. Combine the known ligands of the selected target(s) to a reference set. 3. Search candidate ligands for the new targets by their similarity to the reference set using the Similog method. This clearly enlarges the scope of similarity searching from the classical application for a single target to the identification of candidate ligands for whole target families and is expected to be of key utility for further systematic chemogenomics exploration of previously well explored target families.
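The difference between counted and binary keys, and step 3 of the procedure, can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say which similarity coefficient is used, so the min/max form of the Tanimoto coefficient is assumed; ranking by the best match to any reference ligand is likewise an assumption; and the key names are placeholders, not real Similog keys.

```python
from collections import Counter


def tanimoto_counts(a: Counter, b: Counter) -> float:
    """Tanimoto coefficient on key counts (min/max form, assumed here)."""
    keys = set(a) | set(b)
    shared = sum(min(a[k], b[k]) for k in keys)
    total = sum(max(a[k], b[k]) for k in keys)
    return shared / total if total else 1.0


def tanimoto_binary(a: Counter, b: Counter) -> float:
    """Tanimoto coefficient that only notes presence or absence of each key."""
    on_a = {k for k, n in a.items() if n > 0}
    on_b = {k for k, n in b.items() if n > 0}
    union = on_a | on_b
    return len(on_a & on_b) / len(union) if union else 1.0


def rank_candidates(reference_set, candidates):
    """Rank candidates by their best count-based similarity to any reference ligand."""
    scored = {
        name: max(tanimoto_counts(keys, ref) for ref in reference_set)
        for name, keys in candidates.items()
    }
    return sorted(scored.items(), key=lambda item: item[1], reverse=True)


# Placeholder key counts for one reference ligand and two candidates.
reference = [Counter({"k1": 3, "k2": 1})]
candidates = {
    "c1": Counter({"k1": 1, "k2": 1}),
    "c2": Counter({"k1": 3, "k2": 1}),
}
ranking = rank_candidates(reference, candidates)
```

Candidate `c1` contains the same keys as the reference, so the binary form cannot separate it from `c2`, while the counted form can.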

6.
3D-QSAR uses statistical techniques to correlate calculated structural properties with target properties like biological activity. The comparison of calculated structural properties is dependent upon the relative orientations of molecules in a given data set. Typically molecules are aligned by performing an overlap of common structural units. This “alignment rule” is adequate for a data set that is closely related structurally, but is far more difficult to apply to either a diverse data set or on the basis of some structural property other than shape, even for sterically similar molecules. In this work we describe a new algorithm for molecular alignment based upon optimization of molecular similarity indices. We show that this Monte Carlo based algorithm is more effective and robust than other optimizers applied previously to the similarity based alignment problem. We show that QSARs derived using the alignments generated by our algorithm are superior to QSARs derived using the more common alignment by fitting of common structural units. © 1997 by John Wiley & Sons, Inc. J Comput Chem 18 : 1344–1353, 1997

7.
The similarity of drug targets is typically measured using sequence or structural information. Here, we consider chemo-centric approaches that measure target similarity on the basis of their ligands, asking how chemoinformatics similarities differ from those derived bioinformatically, how stable the ligand networks are to changes in chemoinformatics metrics, and which network is the most reliable for prediction of pharmacology. We calculated the similarities between hundreds of drug targets and their ligands and mapped the relationship between them in a formal network. Bioinformatics networks were based on the BLAST similarity between sequences, while chemoinformatics networks were based on the ligand-set similarities calculated with either the Similarity Ensemble Approach (SEA) or a method derived from Bayesian statistics. By multiple criteria, bioinformatics and chemoinformatics networks differed substantially, and only occasionally did a high sequence similarity correspond to a high ligand-set similarity. In contrast, the chemoinformatics networks were stable to the method used to calculate the ligand-set similarities and to the chemical representation of the ligands. Also, the chemoinformatics networks were more natural and more organized, by network theory, than their bioinformatics counterparts: ligand-based networks were found to be small-world and broad-scale.

8.
Molecular fingerprints are widely used for similarity-based virtual screening in drug discovery projects. In this paper we discuss the performance and the complementarity of nine two-dimensional fingerprints (Daylight, Unity, AlFi, Hologram, CATS, TRUST, Molprint 2D, ChemGPS, and ALOGP) in retrieving active molecules by similarity searching against a set of query compounds. For this purpose, we used biological data from HTS screening campaigns of four protein families (GPCRs, kinases, ion channels, and proteases). We have established threshold values for the similarity index (Tanimoto index) to be used as starting points for similarity searches. Based on the complementarities between the selections made by using different fingerprints we propose a multifingerprint approach as an efficient tool to balance the strengths and weaknesses of various fingerprints.
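A threshold-based similarity search with the Tanimoto index can be sketched as below. This is a minimal illustration, assuming binary fingerprints stored as sets of on-bit positions; the function names, the toy fingerprints, and the threshold of 0.5 are all hypothetical and are not the threshold values established in the paper.

```python
def tanimoto(fp_a: set, fp_b: set) -> float:
    """Tanimoto index between two binary fingerprints given as sets of on-bit positions."""
    if not fp_a and not fp_b:
        return 1.0  # convention assumed here for two empty fingerprints
    shared = len(fp_a & fp_b)
    return shared / (len(fp_a) + len(fp_b) - shared)


def similarity_search(query: set, library: dict, threshold: float):
    """Return (name, score) pairs at or above the threshold, best match first."""
    scored = ((name, tanimoto(query, fp)) for name, fp in library.items())
    hits = [hit for hit in scored if hit[1] >= threshold]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Toy library; bit positions are placeholders, not real fingerprint bits.
library = {"a": {1, 2, 3}, "b": {2, 3, 4}, "c": {9}}
hits = similarity_search({1, 2, 3}, library, threshold=0.5)
```

Running the same search with several fingerprint types and merging the hit lists would be one simple way to realise a multifingerprint selection.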

9.
In this work, we calculated the pairwise chemical similarity for a subset of small molecules screened against the NCI60 cancer cell line panel. Four different compound similarity calculation methods were used: Brutus, GRIND, Daylight and UNITY. The chemical similarity scores of each method were related to the biological similarity data set. The same was also done for combinations of methods. In the end, we had an estimate of biological similarity for a given chemical similarity score or combinations thereof. These data were used to identify chemical similarity ranges where combining two or more methods (data fusion) led to synergy. The results were also applied in ligand-based virtual screening using the DUD data set. With respect to their ability to enrich biologically similar compound pairs, the ranking of the four methods in descending performance is UNITY, Daylight, Brutus and GRIND. Combining methods always resulted in positive synergy within a restricted range of chemical similarity scores. We observed no negative synergy. We also noted that combining three or four methods had only limited added advantage compared to combining just two. In the virtual screening, using the estimated biological similarity for ranking compounds produced more consistent results than using the methods in isolation.

10.
11.
In Part I of this work, we developed a method for the detection of drugs of abuse in biological samples based on fast gradient elution liquid chromatography coupled with diode array spectroscopic detection (LC-DAD). In this part of the work, we apply the chemometric method of target factor analysis (TFA) to the chromatograms. This algorithm identifies the target compounds present in chromatograms based on a spectral library, resolves nearly co-eluting components, and differentiates between drugs with similar spectra. The ability to resolve highly overlapped peaks using the spectral data afforded by the DAD is what distinguishes the present method from conventional library searching methods. Our library has a mean list length (MLL) of 1.255 and a discriminating power of 0.997 when both retention index and spectral factors are considered. The algorithm compares a library of 47 different compounds of toxicological relevance to unknown samples and identifies which compounds are present based on spectral and retention index matching. The application of a corrected retention index for identification rather than raw retention times compensates for long-term and column-to-column retention time shifts and allows for the use of a single library of spectral and retention data. Training data sets were used to establish the search and identification parameters of the method. A validation data set of 70 chromatograms was used to calculate the sensitivity (correct identification of positives) and specificity (correct identification of negatives) of the method, which were found to be 92% and 94%, respectively.
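Sensitivity and specificity as defined in this abstract follow directly from the counts of correct and incorrect identifications. The sketch below shows the arithmetic only; the counts used are hypothetical, chosen to reproduce the reported percentages, and are not the paper's validation data.

```python
def sensitivity(true_positives: int, false_negatives: int) -> float:
    """Fraction of compounds actually present that were correctly identified."""
    return true_positives / (true_positives + false_negatives)


def specificity(true_negatives: int, false_positives: int) -> float:
    """Fraction of compounds actually absent that were correctly reported as absent."""
    return true_negatives / (true_negatives + false_positives)


# Hypothetical counts for illustration only.
sens = sensitivity(true_positives=46, false_negatives=4)
spec = specificity(true_negatives=94, false_positives=6)
```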

12.
13.
14.
15.
16.
17.
The activity of a biological compound is dependent both on specific binding to a target receptor and on its ADME (Absorption, Distribution, Metabolism, Excretion) properties. A challenge in predicting biological activity is to consider both contributions simultaneously in deriving quantitative models. We present a novel approach to derive QSAR models combining similarity analysis of molecular interaction fields (MIFs) with prediction of logP and/or logD. This new classification method is applied to a set of about 100 compounds related to the auxin plant hormone. The classification based on similarity of their interaction fields is more successful for the indole than the phenoxy compounds. The classification of the phenoxy compounds is however improved by taking into account the influence of the logP and/or the logD values on biological activity. With the new combined method, the majority (8 out of 10) of the previously misclassified derivatives of phenoxy acetic acid are classified in accord with their bioassays. The recently determined crystal structure of the auxin-binding protein 1 (ABP1) enabled validation of our approach. The results of docking a few auxin related compounds with different biological activity to ABP1 correlate well with the classification based on similarity of MIFs only. Biological activity is, however, better predicted by a combined similarity of MIFs + logP/logD approach.

18.
Molecular similarity and prediction of pKa values of substituted phenols   Total citations: 1, self-citations: 0, citations by others: 1

19.
20.
A wide variety of computational algorithms have been developed that strive to capture the chemical similarity between two compounds for use in virtual screening and lead discovery. One limitation of such approaches is that, while a returned similarity value reflects the perceived degree of relatedness between any two compounds, there is no direct correlation between this value and the expectation or confidence that any two molecules will in fact be equally active. The lack of a common framework for interpretation of similarity measures also confounds the reliable fusion of information from different algorithms. Here, we present a probabilistic framework for interpreting similarity measures that directly correlates the similarity value to a quantitative expectation that two molecules will in fact be equipotent. The approach is based on extensive benchmarking of 10 different similarity methods (MACCS keys, Daylight fingerprints, maximum common subgraphs, rapid overlay of chemical structures (ROCS) shape similarity, and six connectivity-based fingerprints) against a database of more than 150,000 compounds with activity data against 23 protein targets. Given this unified and probabilistic framework for interpreting chemical similarity, principles derived from decision theory can then be applied to combine the evidence from different similarity measures in a way that both capitalizes on the strengths of the individual approaches and maintains a quantitative estimate of the likelihood that any two molecules will exhibit similar biological activity.
