Similar Literature
1.
The pharmacophore concept is of central importance in computer-aided drug design (CADD), mainly because of its successful application in medicinal chemistry and, in particular, high-throughput virtual screening (HTVS). The simplicity of the pharmacophore definition enables the complexity of molecular interactions between ligand and receptor to be reduced to a handful of features. With many pharmacophore screening software packages available, it is of the utmost interest to explore the behavior of these tools when applied to different biological systems. In this work, we present a comparative analysis of eight pharmacophore screening algorithms (Catalyst, Unity, LigandScout, Phase, Pharao, MOE, Pharmer, and POT) for their use in typical HTVS campaigns against four different biological targets, using default settings. The results presented here show how the performance of each pharmacophore screening tool can be specifically related to factors such as the characteristics of the binding pocket, the use of specific pharmacophore features, and the use of these techniques in specific steps/contexts of the drug discovery pipeline. Algorithms with rmsd-based scoring functions are able to predict more compound poses correctly than algorithms with overlay-based scoring functions. However, the ratio of correctly to incorrectly predicted compound poses is better for overlay-based scoring functions, which also deliver better performance in compound library enrichment. While the ensemble of these observations can be used to choose the most appropriate class of algorithm for a specific virtual screening project, we observed that pharmacophore algorithms are often equally good, and in this respect we also analyzed how pharmacophore algorithms can be combined in order to increase the success of hit compound identification. This study provides a valuable benchmark set for further developments in the field of pharmacophore search algorithms, e.g., by using pose prediction and compound library enrichment criteria.
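Where several tools perform comparably, one simple way to combine them is rank fusion of their hit lists. The sketch below shows minimum-rank fusion in Python; the tool names and compound rankings are hypothetical placeholders, not output from the programs listed above.

```python
# Minimal sketch: combine hit lists from several pharmacophore screening tools
# by minimum-rank fusion (a compound's score is its best rank across tools).
def fuse_by_min_rank(ranked_lists):
    """ranked_lists: dict mapping tool name -> list of compound IDs, best first.
    Returns compounds sorted by their best (lowest) rank across all tools."""
    best_rank = {}
    for ranking in ranked_lists.values():
        for rank, compound in enumerate(ranking):
            if compound not in best_rank or rank < best_rank[compound]:
                best_rank[compound] = rank
    return sorted(best_rank, key=best_rank.get)

hits = fuse_by_min_rank({
    "tool_A": ["cpd7", "cpd3", "cpd9"],   # hypothetical rankings
    "tool_B": ["cpd3", "cpd1", "cpd7"],
})
print(hits)  # cpd3 and cpd7 rise to the top: two tools rank them highly
```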

2.
In this paper we report the first application of multivariate data analysis techniques to force spectrometry measurement sets to enable the physicochemical assignment of spatially ordered multi-component systems. Principal component analysis (PCA) and hierarchical clustering techniques were used to reveal hidden chemical information within force-distance curves generated by high spatial resolution force microscopy. Two experimental samples were analyzed: (i) a two-component system of cytochrome c proteins on a mica surface, and (ii) a three-component system of avidin protein islands positioned on a gold and glass surface. Both techniques discriminated the components of the two-component system, whereas hierarchical clustering was found to be superior for the three-component system. Results were in good agreement with the topography and prior knowledge of the surface patterns. This research represents a formative step towards the combination of force spectrometry with chemometric tools for the high resolution physicochemical investigation of complex biochemical systems.
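As a rough illustration of this kind of workflow (not the authors' code), the following sketch compresses a stack of synthetic force-distance curves with PCA and then clusters the scores hierarchically; the curve shapes and cluster count are invented for the example.

```python
# Minimal sketch: PCA compression of force-distance curves followed by
# Ward hierarchical clustering of the PCA scores. Data are synthetic.
import numpy as np
from sklearn.decomposition import PCA
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(0)
# 100 curves x 256 distance points: two curve families with different depths
curves = np.concatenate([
    -np.exp(-np.linspace(0, 5, 256)) * 1.0 + rng.normal(0, 0.05, (50, 256)),
    -np.exp(-np.linspace(0, 5, 256)) * 2.5 + rng.normal(0, 0.05, (50, 256)),
])

scores = PCA(n_components=3).fit_transform(curves)   # compress each curve
tree = linkage(scores, method="ward")                # cluster in PCA space
labels = fcluster(tree, t=2, criterion="maxclust")   # cut into two clusters
print(labels[:5], labels[-5:])  # the two curve families separate cleanly
```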

3.
With the emergence of combinatorial chemistry, whether based on parallel, mixture, solution, or solid-phase approaches, it is now possible to generate large numbers of diverse or focused compound libraries. In this paper we aim to demonstrate that it is possible to design targeted libraries by applying nonparametric statistical methods, recursive partitioning in particular, to large data sets containing thousands of compounds and their associated biological data. Moreover, when applied to an experimental high-throughput screening (HTS) data set, our data strongly suggest that this method can improve the hit rate of our primary screens about 4- to 5-fold while increasing screening efficiency: less than one-fifth of the complete selection needs to be screened in order to identify about 75% of all actives present.
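A minimal sketch of the idea, assuming a decision tree as the recursive-partitioning engine and simulated binary descriptors in place of real screening data:

```python
# Minimal sketch: train a tree on part of the deck, then screen only the
# compounds the tree ranks highest. Descriptors and activity rule are simulated.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(1)
X = rng.integers(0, 2, size=(10000, 50))           # binary substructure keys
y = (X[:, 3] & X[:, 17] & X[:, 28]).astype(int)    # toy activity rule (~12% actives)

tree = DecisionTreeClassifier(max_depth=5, random_state=0).fit(X[:5000], y[:5000])
probs = tree.predict_proba(X[5000:])[:, 1]
top = np.argsort(probs)[::-1][:1000]               # screen only the top fifth
print(f"hit rate {y[5000:].mean():.3f} -> {y[5000:][top].mean():.3f}")
```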

7.
Application of chemometric methods to mass spectrometry imaging (MSI) data faces a bottleneck in the vast size of the experimental data sets. This drawback is critical for high-resolution mass spectrometry data, which provide several thousand points for each pixel considered. In this work, different approaches were tested to reduce the size of the analyzed data so that typical chemometric methods for image analysis can subsequently be applied. The standard approach for MSI data compression consists in binning the mass spectra of each pixel to reduce the number of m/z values. Here, a method is proposed to handle the huge size of MSI data based on the adaptation of a liquid chromatography-mass spectrometry data compression method that detects regions of interest. Results showed that both approaches achieved high compression rates, although the proposed regions-of-interest method attains this reduction with lower computational requirements and retains the full spectral information. For instance, typical compression rates reached values higher than 90% without loss of information in images and spectra.
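The two compression routes can be contrasted in a few lines. This is a toy sketch on synthetic data, not the published implementation; the noise threshold and bin width are arbitrary choices.

```python
# Minimal sketch: (a) fixed m/z binning vs (b) keeping only regions of interest,
# i.e. m/z channels whose intensity anywhere in the image exceeds a noise level.
import numpy as np

rng = np.random.default_rng(2)
n_pixels, n_mz = 1000, 20000
data = rng.exponential(1.0, (n_pixels, n_mz))                    # noise baseline
data[:, [500, 8200, 15000]] += rng.gamma(5, 20, (n_pixels, 3))   # three real peaks

# (a) binning: collapse every 100 adjacent m/z channels into one
binned = data.reshape(n_pixels, n_mz // 100, 100).sum(axis=2)

# (b) ROI selection: retain channels with signal well above the noise level
roi = data.max(axis=0) > 30.0
compressed = data[:, roi]
print(binned.shape, compressed.shape)  # both shrink the m/z axis drastically
```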

8.
An overview of the application of chemometric data analysis methods to complex chemical mixtures in various environmental media is presented. Reviews of selected research are given as examples of the application of principal components analysis and other statistical methods to identify contributions from multiple sources of contamination in air, water, sediments, and biota. Other examples are cited that illustrate how scientists have used classification and regression methods to model the distribution of anthropogenic contaminants and predict their environmental effects or fate.

9.
High-throughput screening (HTS) campaigns in pharmaceutical companies have accumulated a large amount of data for several million compounds over a couple of hundred assays. Despite the general awareness that rich information is hidden inside this vast amount of data, little has been reported on systematic data mining methods that can reliably extract relevant knowledge of interest to chemists and biologists. We developed a data mining approach based on an algorithm called ontology-based pattern identification (OPI) and applied it to our in-house HTS database. We identified nearly 1500 scaffold families with statistically significant structure-HTS activity profile relationships. Among them, dozens of scaffolds were characterized as leading to artifactual results stemming from the screening technology employed, such as the assay format and/or readout. Four types of compound scaffold can be characterized based on this data mining effort: tumor cytotoxic, general toxic, potential reporter gene assay artifact, and target family specific. The OPI-based data mining approach can reliably identify compounds that are not only structurally similar but also share statistically significant biological activity profiles. Statistical tests such as the Kruskal-Wallis test and analysis of variance (ANOVA) can then be applied to the discovered scaffolds for effective assignment of relevant biological information. The scaffolds identified by our HTS data mining efforts are an invaluable resource for designing SAR-robust diversity libraries, generating in silico biological annotations of compounds on a scaffold basis, and providing novel target-family-specific scaffolds for focused compound library design.
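The statistical step can be illustrated as follows; this is a hedged sketch with simulated activities and scaffold labels, not the OPI algorithm itself.

```python
# Minimal sketch: test whether compounds sharing a scaffold have activity
# distributions that differ significantly across scaffold families.
import numpy as np
from scipy.stats import kruskal

rng = np.random.default_rng(3)
activity = rng.normal(0, 1, 500)            # simulated assay readouts
scaffold = rng.integers(0, 5, 500)          # five scaffold families
activity[scaffold == 2] += 1.5              # one family is genuinely active

groups = [activity[scaffold == s] for s in range(5)]
stat, p = kruskal(*groups)
print(f"Kruskal-Wallis H = {stat:.1f}, p = {p:.2e}")  # small p flags family 2
```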

10.
The interpretation of complexity in isothermal calorimetric data is demanding. The observed power signal is a composite of the powers arising from each of the individual events occurring (which can involve physical, as well as chemical, change). The challenge, therefore, lies in deconvoluting the observed data into their component parts. Here, we discuss the potential use of chemometric analysis, because it offers the significant advantage of being model-free, using principal component analysis to deconvolute the data. Using model data, we discovered that the software required at least a trivariate data matrix to be constructed. Two variables, power and time, were available from the raw data. Selection of a third variable was more problematic, but it was found that, by running multiple experiments, the small variation in the number of moles of compound in each experiment was sufficient to allow a successful analysis. In general, we noted that a minimum of 2n + 2 repeat experiments was required to allow analysis (where n is the number of reaction processes). The data output by the chemometric software were of the form intensity (arbitrary units) versus time, reflecting the fact that the software was written for the analysis of spectroscopic data. We provide a mathematical treatment of the data that allows recovery of both reaction enthalpy and rate constants. The study demonstrates that chemometric analysis is a promising approach for the interpretation of complex calorimetric data.
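Once a single-process power-time curve has been isolated, a first-order model P(t) = ΔH·k·n0·exp(−kt) recovers both quantities. The sketch below fits synthetic data; the parameter values are invented, and the assumption of simple first-order kinetics is mine, not the paper's general treatment.

```python
# Minimal sketch: recover reaction enthalpy and rate constant from a
# deconvoluted single-process power curve by least-squares fitting.
import numpy as np
from scipy.optimize import curve_fit

def power(t, dH, k, n0=1e-4):                # n0: moles of compound, assumed known
    return dH * k * n0 * np.exp(-k * t)      # first-order power-time model (W)

t = np.linspace(0, 20000, 500)               # time / s
true_dH, true_k = 50e3, 2e-4                 # J/mol, 1/s (invented values)
p_obs = power(t, true_dH, true_k) + np.random.default_rng(4).normal(0, 1e-8, t.size)

(dH_fit, k_fit), _ = curve_fit(power, t, p_obs, p0=[1e4, 5e-4])
print(f"dH = {dH_fit/1e3:.1f} kJ/mol, k = {k_fit:.2e} s^-1")
```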

11.
It is demonstrated that bipolar electrochemistry can be used for high-throughput corrosion testing covering a wide potential range in a single experiment and that this, combined with rapid image analysis, constitutes a simple and convenient way to screen the corrosion behaviour of conducting materials and corrosion-protective coatings. Stainless steel samples (SS304), acting as bipolar electrodes, were immersed in sulphuric and hydrochloric acid and exposed to an electric field to establish a potential gradient along the surface. In this way, the same steel sample was exposed to a wide range of cathodic and anodic conditions, ranging from potentials yielding hydrogen evolution to potentials well into the transpassive region. This wireless approach enables rapid simultaneous comparison of numerous samples, and also provides the opportunity to perform experiments on samples that have a complex shape or that are otherwise difficult to employ in standard electrochemical corrosion tests.
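The mapping from position on the sample to local driving potential follows from simple arithmetic, sketched below with purely illustrative numbers (the field strength and cell and sample dimensions are not taken from the study).

```python
# Minimal sketch: a linear field across the cell maps each position on a
# bipolar sample to a different local driving potential. Numbers are invented.
E_applied = 10.0       # V, potential difference applied across the cell
cell_length = 0.10     # m, distance between the feeder electrodes
sample_length = 0.03   # m, bipolar stainless-steel sample centred in the cell

gradient = E_applied / cell_length        # V/m along the solution
delta_E = gradient * sample_length        # potential window spanned by the sample
print(f"{delta_E:.1f} V spanned in one experiment")  # 3.0 V, cathodic to anodic
```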

12.
Integration of flexible data-analysis tools with cheminformatics methods is a prerequisite for successful identification and validation of “hits” in high-throughput screening (HTS) campaigns. We have designed, developed, and implemented a suite of robust yet flexible cheminformatics tools to support HTS activities at the Broad Institute, three of which are described herein. The “hit-calling” tool allows a researcher to set a hit threshold that can be varied during downstream analysis. The results from the hit-calling exercise are reported to a database for record keeping and further data analysis. The “cherry-picking” tool enables creation of an optimized list of hits for confirmatory and follow-up assays from an HTS hit list. This tool allows filtering by computed chemical property and by substructure. In addition, similarity searches can be performed on hits of interest and sets of related compounds can be selected. The third tool, an “S/SAR viewer,” has been designed specifically for the Broad Institute’s diversity-oriented synthesis (DOS) collection. The compounds in this collection are rich in chiral centers, and the full complement of all possible stereoisomers of a given compound is present in the collection. The S/SAR viewer allows rapid identification of both structure/activity relationships and stereo-structure/activity relationships present in HTS data from the DOS collection. Together, these tools enable the prioritization and analysis of hits from diverse compound collections, and enable informed decisions for follow-up biology and chemistry efforts.
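A minimal sketch of the first two operations (threshold-based hit calling and a substructure filter), here using RDKit; the compound data and SMARTS pattern are toy placeholders, and this is not the Broad Institute's code.

```python
# Minimal sketch: call hits above an adjustable activity threshold, then
# filter the hit list by substructure with RDKit.
from rdkit import Chem

results = {"cpd1": 92.0, "cpd2": 15.0, "cpd3": 78.0}       # % inhibition (toy data)
smiles = {"cpd1": "c1ccccc1O", "cpd2": "CCO", "cpd3": "c1ccccc1N"}

threshold = 50.0                                           # tunable downstream
hits = [cid for cid, act in results.items() if act >= threshold]

pattern = Chem.MolFromSmarts("c1ccccc1")                   # keep only aromatics
filtered = [cid for cid in hits
            if Chem.MolFromSmiles(smiles[cid]).HasSubstructMatch(pattern)]
print(filtered)  # ['cpd1', 'cpd3']
```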

13.
Two chemometric methods are compared for the rapid screening of comprehensive two-dimensional liquid chromatographic (LC × LC) analyses of wine. The similarity index and Fisher ratio methods were both found to be able to distinguish geographical variability and to determine potentially significant peaks for further quantitative and qualitative study. An experimental data set consisting of five different wine samples and multiple simulated data sets were analyzed in the investigation of the screening methods. Several statistical analyses were employed to understand and verify the results from the similarity index and Fisher ratio methods. The sum of ranking differences (SRD) method was used to compare the rankings of the two methods as applied to the different data sets and to determine the amount of variability associated with the ranking of the peak differences. The major advantage of the similarity index method is that it is unsupervised; no a priori knowledge of the samples (i.e., class identification) is required, whereas the Fisher ratio method is supervised. Both methods are rapid and require little user intervention other than the determination of a threshold for inclusion/exclusion of compounds from further analysis.
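The Fisher ratio itself is the one-way ANOVA variance ratio computed peak by peak; here is a minimal sketch on simulated peak areas.

```python
# Minimal sketch: for each aligned peak, divide the between-class variance of
# its area by the within-class variance; large ratios flag discriminating peaks.
import numpy as np

def fisher_ratio(areas, classes):
    """areas: (samples, peaks) matrix; classes: class label per sample."""
    grand = areas.mean(axis=0)
    between = np.zeros(areas.shape[1])
    within = np.zeros(areas.shape[1])
    for c in np.unique(classes):
        sub = areas[classes == c]
        between += len(sub) * (sub.mean(axis=0) - grand) ** 2
        within += ((sub - sub.mean(axis=0)) ** 2).sum(axis=0)
    k, n = len(np.unique(classes)), len(areas)
    return (between / (k - 1)) / (within / (n - k))

rng = np.random.default_rng(5)
areas = rng.normal(10, 1, (30, 8))          # 30 samples x 8 peaks (synthetic)
classes = np.repeat([0, 1, 2], 10)          # three geographic classes
areas[classes == 0, 2] += 5                 # peak 2 distinguishes one region
print(fisher_ratio(areas, classes).round(1))  # peak 2 stands out
```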

14.
Molecular similarity methods for ligand-based virtual screening (VS) generally do not take compound potency into account as a variable or search parameter. We have incorporated a logarithmic potency scaling function into two conceptually distinct VS algorithms to account for relative compound potency during search calculations. A high-throughput screening (HTS) data set containing cathepsin B inhibitors was analyzed to evaluate the effects of potency scaling. Sets of template compounds were randomly selected from the HTS data and used to search for hits of varying potency levels in the presence or absence of potency scaling. Enrichment of potent compounds in small subsets of the HTS data set was observed as a consequence of potency scaling. In part, the observed enrichments could be rationalized as a result of recentering chemical reference space on a subspace populated by potent compounds. Our findings suggest that VS calculations using multiple reference compounds can be directed toward the preferential detection of potent database hits by scaling compound contributions according to potency differences.
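One plausible reading of potency scaling, sketched below on random fingerprints: each reference compound's Tanimoto contribution is weighted by a logarithmic potency term. The weighting formula here is an assumption for illustration, not the authors' exact function.

```python
# Minimal sketch: potency-weighted multi-reference similarity scoring.
import numpy as np

def tanimoto(a, b):
    both = np.logical_and(a, b).sum()
    either = np.logical_or(a, b).sum()
    return both / either if either else 0.0

def scaled_score(candidate, refs, ic50_nm):
    # assumed log weighting: 1 nM -> weight 6, 1 uM -> 3, 1 mM -> 0
    weights = np.log10(1e6 / np.asarray(ic50_nm, dtype=float))
    sims = np.array([tanimoto(candidate, r) for r in refs])
    return np.average(sims, weights=weights)

rng = np.random.default_rng(6)
refs = rng.integers(0, 2, (3, 64)).astype(bool)     # toy reference fingerprints
candidate = refs[0].copy()                          # matches the most potent reference
print(f"{scaled_score(candidate, refs, [5, 2000, 50000]):.2f}")
```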

15.
A rapid retention time alignment algorithm was developed as a preprocessing utility to be used prior to chemometric analysis of large datasets of diesel fuel profiles obtained using gas chromatography (GC). Retention time variation from chromatogram to chromatogram has been a significant impediment to the use of chemometric techniques in the analysis of chromatographic data, owing to the inability of current chemometric techniques to correctly model information that shifts from variable to variable within a dataset. The alignment algorithm developed is shown to increase the efficacy of pattern recognition methods applied to diesel fuel chromatograms by retaining chemical selectivity while reducing chromatogram-to-chromatogram retention time variations, and to do so on a time scale that makes analysis of large sets of chromatographic data practical. Two sets of diesel fuel gas chromatograms were studied using the novel alignment algorithm followed by principal component analysis (PCA). In the first study, retention times for corresponding chromatographic peaks in 60 chromatograms varied by as much as 300 ms between chromatograms before alignment. In the second study, of 42 chromatograms, the retention time shifting was on the order of 10 s between corresponding chromatographic peaks and required a coarse retention time correction prior to alignment with the algorithm. In both cases, the increase in retention time precision afforded by the algorithm was clearly visible in plots of overlaid chromatograms before and after applying the retention time alignment algorithm. Using the alignment algorithm, the standard deviation for corresponding peak retention times following alignment was 17 ms throughout a given chromatogram, corresponding to a relative standard deviation of 0.003% at an average retention time of 8 min. This level of retention time precision is a 5-fold improvement over that initially provided by a state-of-the-art GC instrument equipped with electronic pressure control and was critical to the performance of the chemometric analysis. This increase in retention time precision does not come at the expense of chemical selectivity, since the PCA results suggest that essentially all of the chemical selectivity is preserved. Cluster resolution between dissimilar groups of diesel fuel chromatograms in a two-dimensional scores space generated with PCA is shown to increase substantially after alignment. The alignment method is robust against missing or extra peaks relative to the target chromatogram used in the alignment, and operates at high speed, requiring roughly 1 s of computation time per GC chromatogram.
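As a toy illustration of the preprocessing role such an algorithm plays (the published method is more sophisticated), the sketch below aligns a shifted synthetic peak to a target chromatogram by cross-correlation before any PCA would be run.

```python
# Minimal sketch: estimate a chromatogram's shift against a target by
# cross-correlation over a small lag window, then roll it back into register.
import numpy as np

def align_to_target(chrom, target, max_lag=50):
    lags = np.arange(-max_lag, max_lag + 1)            # search window in points
    scores = [np.dot(np.roll(chrom, lag), target) for lag in lags]
    return np.roll(chrom, lags[int(np.argmax(scores))])

t = np.linspace(0, 10, 2000)                           # retention time / min
target = np.exp(-((t - 5.00) ** 2) / 0.001)            # one sharp reference peak
shifted = np.exp(-((t - 5.05) ** 2) / 0.001)           # same peak, arriving later
aligned = align_to_target(shifted, target)
print(abs(t[np.argmax(aligned)] - t[np.argmax(target)]))  # ~0 after alignment
```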

16.
NMR-based screening has become a powerful method for the identification and analysis of low-molecular-weight organic compounds that bind to protein targets and can be utilized in drug discovery programs. In particular, heteronuclear NMR-based screening can yield information about both the affinity and the binding location of potential lead compounds. In addition, heteronuclear NMR-based screening has wide application in complementing and facilitating conventional high-throughput screening programs. This article describes several strategies for the integration of NMR-based screening and high-throughput screening. The marriage of these two techniques promises to be of tremendous benefit in the triage of hits that come from HTS, and can aid the medicinal chemist in the identification of quality leads that have high potential for further optimization.

17.
A general algorithm for the prioritization and selection of plates for high-throughput screening is presented. The method uses a simulated annealing algorithm to search through the space of plate combinations for the one that maximizes some user-defined objective function. The algorithm is robust and convergent, and permits the simultaneous optimization of multiple design objectives, including molecular diversity, similarity to known actives, predicted activity or binding affinity, and many others. It is shown that the arrangement of compounds among the plates may have important consequences for the ability to design a well-targeted and cost-effective experiment. To that end, two simple and effective schemes for the construction of homogeneous and heterogeneous plates are outlined, using a novel similarity sorting algorithm based on one-dimensional nonlinear mapping.
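A minimal sketch of the annealing loop, with mean pairwise descriptor distance standing in for the paper's user-defined objective; the plate descriptors, cooling schedule, and move set are all invented for illustration.

```python
# Minimal sketch: simulated annealing over plate subsets. Swap one selected
# plate per step; accept worse moves with a temperature-dependent probability.
import math, random
import numpy as np

rng = np.random.default_rng(7)
plates = rng.normal(0, 1, (40, 8))        # one descriptor centroid per plate (toy)
n_pick = 10

def objective(idx):
    sub = plates[list(idx)]
    d = np.linalg.norm(sub[:, None, :] - sub[None, :, :], axis=-1)
    return d.sum() / (len(idx) * (len(idx) - 1))   # mean pairwise distance

rnd = random.Random(1)
current = set(rnd.sample(range(40), n_pick))
best_val = objective(current)
T = 1.0
for _ in range(2000):
    swap_out = rnd.choice(sorted(current))
    swap_in = rnd.choice([i for i in range(40) if i not in current])
    candidate = (current - {swap_out}) | {swap_in}
    delta = objective(candidate) - objective(current)
    if delta > 0 or rnd.random() < math.exp(delta / T):   # Metropolis acceptance
        current = candidate
        best_val = max(best_val, objective(current))
    T *= 0.999                                            # geometric cooling
print(f"best diversity score: {best_val:.3f}")
```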

20.
Rose-scented geranium, a commercially important cultivar originating from Pelargonium graveolens L’Her. ex Ait., yields a high-value essential oil extensively used in flavour and fragrance formulations. The oil is variable in composition, with ‘Bourbon geranium’ (from Reunion Island) regarded as the highest quality geranium oil. Quality assessment of geranium oil involves profiling seven major volatile constituents (geraniol, citronellol, geranyl formate, citronellyl formate, linalool, isomenthone and guaia-6,9-diene) using gas chromatography (GC). The aim of this study was to explore the feasibility of vibrational spectroscopy in tandem with chemometric methods as a rapid, low-cost alternative quality control method. Geranium oil samples (n = 70) were obtained from different suppliers representing cultivation sites in South Africa, Egypt, India, Reunion Island, China and Madagascar. Reference analysis was performed using gas chromatography coupled to mass spectrometry (GC–MS). The mid-infrared (MIR) and near-infrared (NIR) spectra of the oils were recorded, with a total of 32 scans accumulated for each sample. Partial least squares (PLS) multivariate calibration models were developed. The calibration models obtained for both MIR and NIR data produced good correlation coefficients (R2 > 0.90) between the predicted and reference values for all seven marker molecules. Generally, the error parameters (RMSEE and RMSEP) after external validation were low (<1.0%) for all compounds, ensuring reliable predictions. The results convincingly show the potential of both MIRS and NIRS as alternative methods for the quality assessment of geranium oil, providing sufficiently accurate results.
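A minimal sketch of such a PLS calibration using scikit-learn on simulated spectra (the real study used 32-scan MIR/NIR spectra with GC–MS reference values for each marker compound):

```python
# Minimal sketch: PLS regression mapping spectra to a reference concentration
# for one marker compound, with R2 and RMSEP on a held-out set. Data are synthetic.
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(8)
spectra = rng.normal(0, 1, (70, 1200))               # 70 oils x spectral points
citronellol = spectra[:, 100] * 3 + spectra[:, 640] + rng.normal(0, 0.1, 70)

X_tr, X_te, y_tr, y_te = train_test_split(spectra, citronellol, random_state=0)
pls = PLSRegression(n_components=5).fit(X_tr, y_tr)
pred = pls.predict(X_te).ravel()

rmsep = np.sqrt(np.mean((pred - y_te) ** 2))          # external validation error
r2 = np.corrcoef(pred, y_te)[0, 1] ** 2
print(f"R2 = {r2:.2f}, RMSEP = {rmsep:.2f}")
```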
