Similar literature
20 similar documents found
1.
It is a difficult task to recognize the trends in molecular physical properties relevant to a specific chemical class and find a way to optimize potential compounds. We present here a novel hierarchical data visualization technique, named "HeiankyoView", to visualize large-scale multidimensional chemical information. HeiankyoView represents hierarchically organized data objects by mapping leaf nodes as colored square icons and nonleaf nodes as rectangular borders. In this way, data objects can be expressed as equishaped icons without overlapping one another in the two-dimensional display space. HeiankyoView has been applied to visualize aqueous solubility data for 908 compounds collected from the published literature. When the results of a recursive partitioning analysis and hierarchical clustering analysis were visualized, the trends hidden in the solubility data could be effectively displayed as intuitively understandable visual images. Most interestingly, the data visualization technique, without any statistical computations, was able to assist us in extracting from such large-scale data meaningful information establishing that ClogP and the molecular weight are critical factors in determining aqueous solubility. Thus, HeiankyoView is a powerful tool to help us understand structure-activity relationships intuitively from a large-scale data set.

2.
Information on CYP-chemical interactions was comprehensively explored by a text-mining technique, to confirm our previous structure-activity relationship model for CYP substrates (Yamashita et al. J. Chem. Inf. Model. 2008, 48, 364-369). The text-mining technique is based on natural language processing and can extract chemical names and their interaction patterns according to sentence context. After chemicals were automatically extracted and classified into CYP substrates, inhibitors, and inducers, 709 substrates were retrieved from the PubChem database and categorized as 216, 145, 136, 217, 156, and 379 substrates for CYP1A2, CYP2C9, CYP2C19, CYP2D6, CYP2E1, and CYP3A4, respectively. Although the previous classification model was developed using data from only 161 compounds, the model classified the substrates found by text-mining analysis with reasonable accuracy. This confirmed the validity of both the multi-objective classification model for CYP substrates and the text-mining procedure.

3.
Multidimensional compound optimization is a new paradigm in the drug discovery process, yielding efficiencies during early stages and reducing attrition in the later stages of drug development. The success of this strategy relies heavily on understanding this multidimensional data and extracting useful information from it. This paper demonstrates how principled visualization algorithms can be used to understand and explore a large data set created in the early stages of drug discovery. The experiments presented are performed on a real-world data set comprising biological activity data and some whole-molecular physicochemical properties. Data visualization is a popular way of presenting complex data in a simpler form. We have applied powerful principled visualization methods, such as generative topographic mapping (GTM) and hierarchical GTM (HGTM), to help the domain experts (screening scientists, chemists, biologists, etc.) understand the data and make meaningful decisions. We also benchmark these principled methods against better-known visualization approaches, namely principal component analysis (PCA), Sammon's mapping, and self-organizing maps (SOMs), to demonstrate their enhanced power to help the user visualize the large multidimensional data sets one has to deal with during the early stages of the drug discovery process. The results reported clearly show that the GTM and HGTM algorithms allow the user to cluster active compounds for different targets and understand them better than the benchmarks. An interactive software tool supporting these visualization algorithms was provided to the domain experts. The tool helps the domain experts explore the projections obtained from the visualization algorithms by providing facilities such as parallel coordinate plots, magnification factors, directional curvatures, and integration with industry-standard software.

4.
We report the QSAR modeling of cytochrome P450 3A4 (CYP3A4) enzyme inhibition using four large data sets of in vitro data. These data sets consist of marketed drugs and drug-like compounds all tested in four assays measuring the inhibition of the metabolism of four different substrates by the CYP3A4 enzyme. The four probe substrates are benzyloxycoumarin, testosterone, benzyloxyresorufin, and midazolam. We first show that using state-of-the-art QSAR modeling approaches applied to only one of these four data sets does not lead to predictive models that would be useful for in silico filtering of chemical libraries. We then present the development and the testing of a multiple pharmacophore hypothesis (MPH) that is formulated as a conceptual extension of the traditional QSAR approach to modeling the promiscuous binding of a large variety of drugs to CYP3A4. In the simplest form, the MPH approach takes advantage of the multiple substrate data sets and identifies the binding of test compounds as either proximal or distal relative to that of a given substrate. Application of the approach to the in silico filtering of test compounds for potential inhibitors of CYP3A4 is also presented. In addition to an improvement in the QSAR modeling for the inhibition of CYP3A4, the results from this modeling approach provide structural insights into the drug-enzyme interactions. The existence of multiple inhibition data sets in the BioPrint database motivates the original development of the concept of a multiple pharmacophore hypothesis and provides a unique opportunity for formulating alternative strategies of QSAR modeling of the inhibition of the in vitro metabolism of CYP3A4.

5.
Statistical learning methods have been used in developing filters for predicting inhibitors of two P450 isoenzymes, CYP3A4 and CYP2D6. This work explores the use of different statistical learning methods for predicting inhibitors of these enzymes and an additional P450 enzyme, CYP2C9, and the substrates of the three P450 isoenzymes. Two consensus support vector machine (CSVM) methods, "positive majority" (PM-CSVM) and "positive probability" (PP-CSVM), were used in this work. These methods were first tested for the prediction of inhibitors of CYP3A4 and CYP2D6 by using a significantly higher number of inhibitors and noninhibitors than that used in earlier studies. They were then applied to the prediction of inhibitors of CYP2C9 and substrates of the three enzymes. Both methods predict inhibitors of CYP3A4 and CYP2D6 with accuracy similar to that of earlier studies. For classification of inhibitors of CYP2C9, the best CSVM method gives an accuracy of 88.9% for inhibitors and 96.3% for noninhibitors. The accuracies for classification of substrates and nonsubstrates of CYP3A4, CYP2D6, and CYP2C9 are 98.2 and 90.9%, 96.6 and 94.4%, and 85.7 and 98.8%, respectively. Both CSVM methods are potentially useful as filters for predicting inhibitors and substrates of P450 isoenzymes. These methods generally give better accuracies than single SVM classification systems, and the performance of the PP-CSVM method is slightly better than that of the PM-CSVM method.
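
As a rough illustration of the consensus idea in the abstract above, the sketch below combines the votes of several individual classifiers by simple majority. It assumes that "positive majority" means a compound is labeled positive when more than half of the individual models vote positive; the abstract does not spell out the rule, so this is an assumption, and the function name and vote data are hypothetical. The "positive probability" variant is not sketched because the abstract gives no detail on it.

```python
from typing import Sequence


def positive_majority(votes: Sequence[int]) -> int:
    """Return 1 if more than half of the votes are positive (1), else 0.

    Assumed reading of "positive majority"; ties count as negative.
    """
    if not votes:
        raise ValueError("need at least one vote")
    positives = sum(1 for v in votes if v == 1)
    return 1 if positives * 2 > len(votes) else 0


# Hypothetical votes from five individual SVM models for three compounds
# (1 = predicted inhibitor, 0 = predicted noninhibitor).
votes_per_compound = [
    [1, 1, 0, 1, 0],
    [0, 0, 1, 0, 0],
    [1, 0, 1, 0, 1],
]

consensus = [positive_majority(v) for v in votes_per_compound]
print(consensus)
```

Treating ties as negative is a design choice made here for definiteness; a real filter might prefer the opposite if missing an inhibitor is the costlier error.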

6.
7.
8.
A generic method employing ultrafast liquid chromatography with tandem mass spectrometry (LC/MS/MS) was developed and employed for routine screening of drug candidates for inhibition of five major human cytochrome P450 (CYP) isozymes, CYP3A4, CYP2D6, CYP2C9, CYP2C19, and CYP1A2. The method utilized a monolithic silica rod column to allow fast flow rates to significantly reduce chromatographic run time. The major metabolites of six CYP-specific probe substrates for the five P450 isoforms were monitored and quantified to determine IC50 values of five drug compounds against each P450 isozyme. Human liver microsomal incubation samples at each test compound concentration were combined and analyzed simultaneously by the LC/MS/MS method. Each pooled sample containing six substrates and an internal standard was separated and detected in only 24 seconds. The combination of ultrafast chromatography and sample pooling techniques has significantly increased sample throughput and shortened assay turnaround time, allowing a large number of compounds to be screened rapidly for potential P450 inhibitory activity, to aid in compound selection and optimization in drug discovery.

9.
The tremendous increase in chemical structure and biological activity data brought about through combinatorial chemistry and high-throughput screening technologies has created the need for sophisticated graphical tools for visualizing and exploring structure-activity data. Visualization plays an important role in exploring and understanding relationships within such multidimensional data sets. Many chemoinformatics software applications apply standard clustering techniques to organize structure-activity data, but they differ significantly in their approaches to visualizing clustered data. Molecular Property eXplorer (MPX) is unique in its presentation of clustered data in the form of heatmaps and tree-maps. MPX employs agglomerative hierarchical clustering to organize data on the basis of the similarity between 2D chemical structures or similarity across a predefined profile of biological assay values. Visualization of hierarchical clusters as tree-maps and heatmaps provides simultaneous representation of cluster members along with their associated assay values. Tree-maps convey both the spatial relationship among cluster members and the value of a single property (activity) associated with each member. Heatmaps provide visualization of the cluster members across an activity profile. Unlike a tree-map, however, a heatmap does not convey the spatial relationship between cluster members. MPX seamlessly integrates tree-maps and heatmaps to represent multidimensional structure-activity data in a visually intuitive manner. In addition, MPX provides tools for clustering data on the basis of chemical structure or activity profile, displaying 2D chemical structures, and querying the data over a specified activity range or a set of chemical structure criteria (e.g., Tanimoto similarity, substructure match, and "R-group" analysis).
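
The Tanimoto similarity mentioned in the abstract above is, for binary fingerprints, the number of shared "on" bits divided by the number of bits set in either fingerprint. The sketch below is a minimal illustration that represents each fingerprint as a set of on-bit indices; the bit indices are made up, and nothing here reflects how MPX itself computes similarity.

```python
def tanimoto(a: set, b: set) -> float:
    """Tanimoto (Jaccard) coefficient of two fingerprints given as sets of on-bit indices."""
    union = len(a | b)
    if union == 0:
        # Convention chosen for this sketch: two empty fingerprints score 0.0.
        return 0.0
    return len(a & b) / union


# Hypothetical on-bit indices for two compounds.
fp_a = {1, 4, 7, 9}
fp_b = {1, 4, 8}

# 2 shared bits out of 5 distinct bits.
similarity = tanimoto(fp_a, fp_b)
print(similarity)
```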

10.
11.
Quantitative structure-activity relationship (QSAR) is a term describing a variety of approaches that are of substantial interest for chemistry. This method can be defined as indirect molecular design by the iterative sampling of the chemical compound space to optimize a certain property and thus indirectly design the molecular structure having this property. However, modeling the interactions of chemical molecules in biological systems yields highly noisy data, which makes predictions as risky as roulette. In this paper we briefly review the origins of this noise, particularly in multidimensional QSAR. We classify it as data, superimposition, molecular similarity, conformational, and molecular recognition noise. We also indicate possible robust answers that can improve the modeling and predictive ability of QSAR, especially the self-organizing mapping of molecular objects, in particular molecular surfaces, a method that was brought into chemistry by Gasteiger and Zupan.

12.
13.
14.
Cytochrome P450 (CYP) 3A4, 2D6, 2C9, 2C19, and 1A2 are the most important drug-metabolizing enzymes in the human liver. Knowledge of which parts of a drug molecule are subject to metabolic reactions catalyzed by these enzymes is crucial for rational drug design to mitigate ADME/toxicity issues. SMARTCyp, a recently developed 2D ligand structure-based method, is able to predict site-specific metabolic reactivity of CYP3A4 and CYP2D6 substrates with an accuracy that rivals the best and more computationally demanding 3D structure-based methods. In this article, the SMARTCyp approach was extended to predict the metabolic hotspots for CYP2C9, CYP2C19, and CYP1A2 substrates. This was accomplished by taking into account the impact of a key substrate-receptor recognition feature of each enzyme as a correction term to the SMARTCyp reactivity. The corrected reactivity was then used to rank order the likely sites of CYP-mediated metabolic reactions. For 60 CYP1A2 substrates, the observed major sites of CYP1A2 catalyzed metabolic reactions were among the top-ranked 1, 2, and 3 positions in 67%, 80%, and 83% of the cases, respectively. The results were similar to those obtained by MetaSite and the reactivity + docking approach. For 70 CYP2C9 substrates, the observed sites of CYP2C9 metabolism were among the top-ranked 1, 2, and 3 positions in 66%, 86%, and 87% of the cases, respectively. These results were better than the corresponding results of StarDrop version 5.0, which were 61%, 73%, and 77%, respectively. For 36 compounds metabolized by CYP2C19, the observed sites of metabolism were found to be among the top-ranked 1, 2, and 3 sites in 78%, 89%, and 94% of the cases, respectively. The computational procedure was implemented as an extension to the program SMARTCyp 2.0. With the extension, the program can now predict the site of metabolism for all five major drug-metabolizing enzymes with an accuracy similar to or better than that achieved by the best 3D structure-based methods. Both the Java source code and the binary executable of the program are freely available to interested users.
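
The "top-ranked 1, 2, and 3" percentages quoted in the abstract above can be read as top-k hit rates: a substrate counts as a hit if any experimentally observed site appears among the k highest-ranked predicted sites. The sketch below illustrates that metric only; the atom labels, rankings, and function name are hypothetical and are not taken from SMARTCyp or its data sets.

```python
from typing import List, Sequence, Set


def top_k_hit_rate(ranked_sites: Sequence[List[str]],
                   observed_sites: Sequence[Set[str]],
                   k: int) -> float:
    """Fraction of substrates with at least one observed site among the top k predictions."""
    if len(ranked_sites) != len(observed_sites) or not ranked_sites:
        raise ValueError("need equally sized, non-empty inputs")
    hits = sum(
        1
        for ranked, observed in zip(ranked_sites, observed_sites)
        if set(ranked[:k]) & observed
    )
    return hits / len(ranked_sites)


# Hypothetical predictions (best first) and observed sites for four substrates.
ranked = [
    ["C3", "N1", "C7"],
    ["C2", "C5", "O1"],
    ["N4", "C1", "C8"],
    ["C6", "C2", "C9"],
]
observed = [{"C3"}, {"C5"}, {"C8"}, {"C1"}]

rates = [top_k_hit_rate(ranked, observed, k) for k in (1, 2, 3)]
print(rates)
```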

15.
Problems of pattern recognition in chemistry and other subjects can be divided conveniently into four different types, depending on the scope of the problem. (1) Classification into one of a number of defined classes. As an example, blood samples taken from persons known to be either controls or welders are considered. The problem is whether trace element concentrations in these samples contain information on whether or not a person is a welder. (2) Level 1 plus the possibility that an object is an outlier, i.e., does not belong to any of the defined classes. As an example, the use of 13C NMR data to decide whether 2-substituted norbornanes have the exo or endo structure is discussed. (2A) Level 2, asymmetric. This situation occurs when one class does not have a systematic structure, but another class is homogeneous and can be described by a level 2 model. This occurs in the classification of materials or compounds as good or bad, active or inactive, and in binary classifications. As an example, the use of trace element data to classify steel samples as having good or poor properties of strength is discussed. (3) Level 2 plus the ability to relate the variables measured to external properties of continuous character. As an example, the classification of a series of chemical compounds as β-receptor blockers, β-receptor stimulants, or neither, on the basis of their structural variables is discussed. In addition, relations between these structural variables and the measured biological activity are sought within each of the two classes. (4) Level 3 with the difference that several external property variables in the objects are measured. It may be desirable to use variables of the objects both for classification and for relations to several property variables; such examples are numerous in analytical chemistry.

16.
A kinetic, reactivity-binding model has been proposed to predict the regioselectivity of substrate metabolism mediated by the CYP1A2 enzyme, which is responsible for the metabolism of planar-conjugated compounds such as caffeine. This model consists of a docking simulation for binding energy and a semiempirical molecular orbital calculation for activation energy. Possible binding modes of CYP1A2 substrates were first examined using automated docking based on the crystal structure of CYP1A2, and binding energy was calculated. Then, activation energies for CYP1A2-mediated metabolism reactions were calculated using the semiempirical molecular orbital method AM1. Finally, the metabolic probability obtained from the two energy terms, binding and activation energies, was used for predicting the most probable metabolic site. For 8 of 12 substrates, the model ranked the observed site of metabolism as the primary preferred site among all possible metabolic sites; for the other four substrates, it ranked the observed site as the secondary preferred site. This method can be applied for qualitative prediction of drug metabolism mediated by CYP1A2 and other CYP450 family enzymes, helping to develop drugs efficiently.

17.
Metabolite identification studies play an important role in determining the sites of metabolic liability of new chemical entities (NCEs) in drug discovery for lead optimization. Here we compare two predictive software packages, MetaSite and StarDrop, available for this purpose. They work very differently, but both are used to predict the site of oxidation by major human cytochrome P450 (CYP) isoforms. Neither package can predict non-CYP-catalyzed metabolism or the rates of metabolism. To compare the two packages, we tested known probe substrates for these enzymes: 12 substrates of CYP3A4 and 18 substrates of CYP2C9 and CYP2D6 were analyzed by each package, and the results were compared. These known substrates may have been part of the training sets, but we do not know whether they were. To assess the performance of each package, we used a point system that awards points for each correct prediction. For each CYP isoform, the points obtained experimentally were expressed as a percentage of the total points theoretically attainable for the first-choice prediction across all substrates of that isoform. Our results show that MetaSite and StarDrop are similar in predicting the correct site of metabolism by CYP3A4 (78% vs 83%, respectively). StarDrop appears to do slightly better in predicting the correct site of metabolism by CYP2C9 and CYP2D6 (89% and 93%, respectively) compared to MetaSite (63% and 70%, respectively). The sites of metabolism (SOM) of 34 in-house NCEs incubated in human liver microsomes or human hepatocytes were also evaluated using the two packages, and the results showed comparable SOM predictions. What makes this comparison challenging is that the contribution of each isoform to the intrinsic clearance (Clint) is not known. Overall, the two packages were comparable, except that MetaSite performed better for CYP2D6 and that MetaSite has a liver model, absent in StarDrop, which predicted with 82% accuracy.

18.
A new method has been developed to design a focused library based on available active compounds using protein-compound docking simulations. This method was applied to the design of a focused library for cytochrome P450 (CYP) ligands, not only to distinguish CYP ligands from other compounds but also to identify the putative ligands for a particular CYP. Principal component analysis (PCA) was applied to the protein-compound affinity matrix, which was obtained by thorough docking calculations between a large set of protein pockets and chemical compounds. Each compound was depicted as a point in the PCA space. Compounds that were close to the known active compounds were selected as candidate hit compounds. A machine-learning technique optimized the docking scores of the protein-compound affinity matrix to maximize the database enrichment of the known active compounds, providing an optimized focused library.

19.
20.