Similar literature
 20 similar documents found (search time: 31 ms)
1.
2.
A new structure–activity relationship model predicting the probability that a compound inhibits human cytochrome P450 3A4 has been developed using data for >800 compounds from various literature sources and tested on PubChem screening data. The novel GALAS (Global, Adjusted Locally According to Similarity) modeling methodology has been used, which combines a baseline global QSAR model with local similarity-based corrections. The GALAS method allows the reliability of a prediction to be forecast, thus defining the model's applicability domain. For compounds within this domain, the statistical results of the final model approach the consistency between experimental data from the literature and PubChem datasets, with an overall accuracy of 89%. However, the original model is applicable to less than half of the PubChem database. Since the similarity correction procedure of the GALAS method allows straightforward model training, the possibility of expanding the applicability domain has been investigated. Experimental data from the PubChem dataset served as an example of in-house high-throughput screening data. The model successfully adapted itself to data classified using both the same and a different IC50 threshold compared with the training set. In addition, adjustment of the CYP3A4 inhibition model to compounds with a novel chemical scaffold has been demonstrated. The reported GALAS model is proposed as a useful tool for virtual screening of compounds for possible drug-drug interactions even prior to the actual synthesis.

3.
Analytical pyrolysis combined with gas chromatography and mass spectrometry (Py-GC–MS) is a relatively rapid (1–3 h) method for the investigation of polymers. Various wood tissues from transgenic poplar clones and from control samples have been subjected to a screening test by Py-GC–MS. Lignin- and carbohydrate-derived pyrolysis products were subjected to multivariate principal component analysis (PCA). The first three PCs, accounting for 39–72% of the total variance in the original data set, could be attributed to vinyl products from lignin and levoglucosan from cellulose. Samples with gene construct rbcs-rol C were only discriminated by plotting PC1 versus PC3 using the whole data set. However, the wood from trees containing gene construct 35S-rol C was discriminated in all examined models, indicating significant impacts during biosynthesis of the wood. One sample within the data set was clustered apart from the others; this tree turned out to have died off after two vegetation periods.

4.
5.
In patients with depression, the use of 5-HT reuptake inhibitors can improve the condition. Machine learning methods can be used in ligand-based activity prediction. In order to predict SERT inhibitors, the SERT inhibitor data from the ChEMBL database were screened and pre-processed. Then 4 machine learning methods (LR, SVM, RF, and KNN) and 4 molecular fingerprints (CDK, Graph, MACCS, and PubChem) were used to build 16 prediction models. The 5 models with the highest accuracy (Q) in cross-validation on the training set were used to build three different ensemble learning models. In the test1 set, the VOT_CLF3 model had the largest SP (0.871), Q (0.869), AUC (0.919), and MCC (0.728). In the unbalanced test2 set, VOT_CLF3 had the largest SE (0.857), SP (0.867), Q (0.865) and MCC (0.639). VOT_CLF3 was recommended for the virtual screening of SERT inhibitors. In addition, 12 molecular structural alerts that frequently appear in SERT inhibitors were found (P < 0.05), which provide an important reference for the design of SERT inhibitors.

6.
7.
Nanostructured materials with tunable structures and functionality are of interest in diverse areas. Herein, metal ions are coordinated with quinones through metal-acetylacetone coordination bonds to generate a class of structurally tunable, universally adhesive, hydrophilic, and pH-degradable materials. A library of metal-quinone networks (MQNs) is produced from five model quinone ligands paired with nine metal ions, leading to the assembly of particles, tubes, capsules, and films. Importantly, MQNs show bidirectional pH-responsive disassembly in acidic and alkaline solutions, where the quinone ligands mediate the disassembly kinetics, enabling temporal and spatial control over the release of multiple components using multilayered MQNs. Leveraging this tunable release and the inherent medicinal properties of quinones, MQN prodrugs with a high drug loading (>89 wt %) are engineered using doxorubicin for anti-cancer therapy and shikonin for the inhibition of the main protease in the SARS-CoV-2 virus.

8.
Dual-specific tyrosine phosphorylation regulated kinase 1 (DYRK1A) has been regarded as a potential therapeutic target for neurodegenerative diseases, and considerable progress has been made in the discovery of DYRK1A inhibitors. Identification of pharmacophoric fragments provides valuable information for structure- and fragment-based design of potent and selective DYRK1A inhibitors. In this study, seven machine learning methods along with five molecular fingerprints were employed to develop qualitative classification models of DYRK1A inhibitors, which were evaluated by cross-validation, a test set, and an external validation set with four performance indicators: predictive classification accuracy (CA), area under the receiver operating characteristic curve (AUC), Matthews correlation coefficient (MCC), and balanced accuracy (BA). The PubChem fingerprint-support vector machine model (CA = 0.909, AUC = 0.933, MCC = 0.717, BA = 0.855) and the PubChem fingerprint-artificial neural network model (CA = 0.862, AUC = 0.911, MCC = 0.705, BA = 0.870) were considered the optimal models for the training set and test set, respectively. A hybrid data balancing method, SMOTETL, a combination of the synthetic minority over-sampling technique (SMOTE) and Tomek link (TL) algorithms, was applied to explore the impact of balanced learning on the performance of the models. Based on frequency analysis and information gain, pharmacophoric fragments related to DYRK1A inhibition were also identified. These results provide theoretical support and clues for the screening and design of novel DYRK1A inhibitors.
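
The Tomek-link half of the SMOTETL balancing step can be illustrated in a few lines. A Tomek link is a pair of samples from different classes that are each other's nearest neighbour; removing such pairs (or their majority-class member) cleans the class boundary after SMOTE has over-sampled the minority class. The sketch below is an illustration under stated assumptions, not the study's implementation: it uses Euclidean distance on plain coordinate tuples, the function names are hypothetical, and the SMOTE step itself is not shown.

```python
import math


def nearest_neighbour(i, points):
    """Return the index of the point closest to points[i] (excluding i itself)."""
    best_j, best_d = None, math.inf
    for j, p in enumerate(points):
        if j == i:
            continue
        d = math.dist(points[i], p)
        if d < best_d:
            best_j, best_d = j, d
    return best_j


def tomek_links(points, labels):
    """Return index pairs (i, j), i < j, that are mutual nearest neighbours
    and carry different class labels."""
    links = []
    for i in range(len(points)):
        j = nearest_neighbour(i, points)
        if (j is not None and j > i and labels[i] != labels[j]
                and nearest_neighbour(j, points) == i):
            links.append((i, j))
    return links


if __name__ == "__main__":
    # Toy 2-D data; real fingerprints would need a suitable distance measure.
    pts = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.2, 5.0), (9.0, 9.0)]
    lbl = [0, 1, 0, 0, 1]
    print(tomek_links(pts, lbl))
```

For binary fingerprints a Tanimoto or Hamming distance would be a more natural choice than the Euclidean distance assumed here.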

9.
10.
A hierarchical classification of chemical scaffolds (molecular frameworks, which are obtained by pruning all terminal side chains) has been introduced. The molecular frameworks form the leaf nodes in the hierarchy trees. By iterative removal of rings, the scaffolds forming the higher levels in the hierarchy tree are obtained. Prioritization rules ensure that less characteristic, peripheral rings are removed first. All scaffolds in the hierarchy tree are well-defined chemical entities, making the classification chemically intuitive. The classification is deterministic, data-set-independent, and scales linearly with the number of compounds included in the data set. The application of the classification is demonstrated on two data sets extracted from the PubChem database, namely, pyruvate kinase binders and a collection of pesticides. The examples shown demonstrate that the classification procedure robustly handles both synthetic structures and natural products.
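
The first step described here, obtaining a molecular framework by pruning all terminal side chains, can be sketched on a bare molecular graph. This is a simplified Python illustration under stated assumptions: atoms are node identifiers in a symmetric adjacency mapping, bond orders are ignored (so exocyclic double bonds are pruned like any other substituent), and the function name is hypothetical. The ring-removal prioritization rules of the hierarchy are not reproduced.

```python
def molecular_framework(adjacency):
    """Iteratively strip atoms with at most one neighbour.

    What survives is the ring systems plus the linkers that connect them;
    an acyclic molecule is pruned away completely.
    """
    graph = {atom: set(nbrs) for atom, nbrs in adjacency.items()}
    leaves = [atom for atom, nbrs in graph.items() if len(nbrs) <= 1]
    while leaves:
        atom = leaves.pop()
        if atom not in graph:          # already removed via another path
            continue
        for nbr in graph.pop(atom):
            graph[nbr].discard(atom)
            if len(graph[nbr]) <= 1:   # neighbour has become terminal
                leaves.append(nbr)
    return graph


if __name__ == "__main__":
    # Two three-membered rings (0-1-2 and 4-5-6) joined by linker atom 3,
    # which also carries a two-atom side chain 7-8.
    mol = {
        0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3},
        3: {2, 4, 7},
        4: {3, 5, 6}, 5: {4, 6}, 6: {4, 5},
        7: {3, 8}, 8: {7},
    }
    print(sorted(molecular_framework(mol)))
```

Each atom is removed at most once and each bond is examined a constant number of times, which matches the linear scaling the abstract mentions for the classification as a whole.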

11.
12.
Fibroblast growth factor receptors (FGFRs) are essential players in oncogenesis and tumor progression. LY2874455 was identified as a pan-FGFR inhibitor and has gone through a phase I clinical trial. In the current study, virtual screening was conducted against the PubChem database using a pharmacophore model generated from the crystal structure of FGFR4 inhibited by LY2874455. PubChem 137300327 was identified as the most suitable compound from this screening. Subsequent molecular docking and molecular dynamics studies conducted with FGFRs corroborated the initial finding. Analysis of ADMET properties showed that LY2874455 and PubChem 137300327 share similar properties. Our study suggests that PubChem 137300327 is a potential pan-FGFR inhibitor and can be exploited to treat different cancers following validation in proper wet-lab experiments and studies in animal cancer models. This compound also follows Lipinski's rules and can be used as a lead compound to synthesize more effective anticancer compounds.
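
Lipinski's rule of five, mentioned at the end of this abstract, is a simple threshold check on four descriptors. The sketch below is a hedged illustration: the function name is hypothetical, the descriptor values in the usage example are invented placeholders rather than data for PubChem 137300327, and the treatment of the boundary values (for example whether a molecular weight of exactly 500 passes) varies between sources.

```python
def lipinski_violations(mol_weight, log_p, h_donors, h_acceptors):
    """Return the list of rule-of-five criteria that a compound fails."""
    checks = {
        "molecular weight <= 500 Da": mol_weight <= 500,
        "logP <= 5": log_p <= 5,
        "H-bond donors <= 5": h_donors <= 5,
        "H-bond acceptors <= 10": h_acceptors <= 10,
    }
    return [rule for rule, passed in checks.items() if not passed]


if __name__ == "__main__":
    # Placeholder descriptor values for a hypothetical compound.
    failed = lipinski_violations(mol_weight=444.3, log_p=3.1,
                                 h_donors=2, h_acceptors=7)
    print("violations:", failed)
```

In common usage a compound is still regarded as orally drug-like if it violates no more than one of the four criteria.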

13.
The methanesulfonic acid (MSA)–propylene carbonate (PC) system with component concentrations of 0–100% was studied at 30°C using multiple attenuated total reflection (MATR) IR spectroscopy. The formation of a strong 1:1 molecular complex of MSA with PC was established. In the presence of an excess of the acid, a second MSA molecule adds to this complex to give the molecular complex (2MSA)·PC. When excess propylene carbonate is used, the MSA·PC complex is solvated by a propylene carbonate molecule. No protonation of the base or formation of complexes with a strong symmetrical H bond was observed. Continuous absorption was not detected in IR spectra of the solutions. Translated from Izvestiya Akademii Nauk. Seriya Khimicheskaya, No. 2, pp. 313–318, February 1999.

14.
Natural products, historically a major resource for drug discovery, have recently been gaining more attention owing to advances in genomic sequencing and other technologies, which make them attractive and amenable to drug candidate screening. Collecting and mining the bioactivity information of natural products is extremely important for accelerating the drug development process by reducing cost. Lately, a number of publicly accessible databases have been established to facilitate access to chemical biology data for small molecules, including natural products. Thus, it is imperative for scientists in related fields to exploit these resources in order to expedite their research on natural products as drug leads/candidates for disease treatment. PubChem, as a public database, contains large amounts of natural products associated with bioactivity data. In this review, we introduce the information system provided at PubChem and systematically describe the applications of a set of PubChem web services for rapid data retrieval, analysis, and downloading of natural products. We hope this work can serve as a starting point for researchers to perform data mining on natural products using PubChem.

15.
Pierce KM, Hope JL, Hoggard JC, Synovec RE. Talanta, 2006, 70(4): 797-804
Comprehensive two-dimensional gas chromatography combined with time-of-flight mass spectrometry (GC × GC-TOFMS) provides high resolution separations of complex samples with a mass spectrum at every point in the separation space. The large volumes of multidimensional data obtained by GC × GC-TOFMS analysis are analyzed using a principal component analysis (PCA) method described herein to quickly and objectively discover differences between complex samples. In this work, we submitted 54 chromatograms to PCA to automatically compare the metabolite profiles of three different species of plants, namely basil (Ocimum basilicum), peppermint (Mentha piperita), and sweet herb stevia (Stevia rebaudiana), where there were 18 chromatograms for each type of plant. The 54 scores of the m/z 73 data set clustered in three groups according to the three types of plants. Principal component 1 (PC 1) separated the stevia cluster from the basil and peppermint clusters, capturing 61.84% of the total variance. Principal component 2 (PC 2) separated the basil cluster from the peppermint cluster, capturing 16.78% of the total variance. The PCA method revealed that relative abundances of amino acids, carboxylic acids, and carbohydrates were responsible for differentiating the three plants. A brief list of the 16 most significant metabolites is reported. After PCA, the 54 scores of the m/z 217 data set clustered in three groups according to the three types of plants, as well, yielding highly loaded variables corresponding with chemical differences between plants that were complementary to the m/z 73 information. The PCA data mining method is applicable to all of the monitored selective mass channels, utilizing all of the collected data, to discover unknown differences in complex sample profiles.

16.
17.
Polyphenolic compositions of Basque and French ciders were determined by HPLC-DAD following thiolysis, in order to characterise and differentiate these beverages and then develop a classification system capable of confirming the authenticities of both kinds of cider. A data set consisting of 165 cider samples and 27 measured features was evaluated using multivariate chemometric techniques, such as cluster analysis and principal component analysis, in order to perform a preliminary study of data structure. Supervised pattern recognition techniques such as linear discriminant analysis (LDA), K-nearest neighbours (KNN), soft independent modelling of class analogy (SIMCA), and multilayer feed-forward artificial neural networks (MLF-ANN) attained classification rules for the two categories using the chemical data, which produced satisfactory results. Authentication systems obtained by combining two of these techniques were proposed. We found that SIMCA and LDA or KNN models achieved 100% hit-rates, since LDA and KNN permit the detection of every Basque cider and SIMCA provides a model for Basque cider that excludes all French ciders. Polyphenolic profiles of the ciders provided enough information to be able to develop classification rules for identifying ciders according to their geographical origin (Basque or French regions). Chemical and organoleptic differences between these two types of cider are probably due to the original and distinctive cidermaking technologies used for their elaboration. Using polyphenolic profiles, about 80% of French ciders could be distinguished according to their region of origin (Brittany or Normandy), although these profiles did not provide enough information to achieve an authentication system for Breton and Norman ciders.
Abbreviations: AVI, avicularin; CQA, caffeoylquinic acid; CAF, caffeic acid; CAT, (+)-catechin; CT-1, -2, -3, unknown flavan-3-ols; DPn, average degree of polymerization of procyanidins; EC, (–)-epicatechin; HCA-7, ferulic acid; HCA-1, -2, -3, -4, -5, -6, unknown hydroxycinnamic acids; HYP, hyperin; IQC, isoquercitrin; PC, total procyanidins; PCM, p-coumaric acid; PCQ, p-coumaroylquinic acid; PL, phloretin; PLG, phloridzin; PLXG, phloretin-2-O-xyloglucoside; PPO, polyphenoloxidase; QCE, quercetin; QCI, quercitrin; RUT, rutin; CA, cluster analysis; KNN, K-nearest neighbours; LDA, linear discriminant analysis; MLF-ANN, multilayer feed-forward artificial neural network; PCA, principal component analysis; PC1, first principal component; PC2, second principal component; PC3, third principal component; RMSE, root mean square error; SD, standard deviation; SIMCA, soft independent modelling of class analogy; DAD, diode array detector; HPLC, high performance liquid chromatography; ND, not detected

18.
19.
We report an ultrasensitive method for the analysis of glycosphingolipid catabolism. The substrate GM1 and the set of seven metabolites into which it can be degraded (GA1, GM2, GA2, GM3, LacCer, GlcCer, and Cer) were labeled with the highly fluorescent dye tetramethylrhodamine. CE with LIF detection was used to assay these compounds with 150 ± 80 yoctomole mass (1 ymol = 10^-24 mol = 0.6 copies) detection limits and 5 ± 3 pM concentration detection limits. An alignment algorithm based on migration of two components was employed to correct for drift in the separation. The within-day and between-day precision in peak height was 20%, in peak width 15%, and in adjusted migration time 0.03%. After normalization to total sample injected, the RSD in peak height was reduced to 2-6%, which approaches the limit set by molecular shot noise in the number of molecules taken for analysis. PC12 cells were incubated with the labeled GM1. Fluorescence microscopy demonstrated uptake by the cells. CE was used to separate a cellular homogenate prepared from these cells. A set of peaks was observed, which were tentatively identified based on comigration with the standards. Roughly 120 pL of homogenate was injected, which contained a total of 150 zmol of labeled substrate and products. Metabolites that preserve the fluorescent label can be detected at the yoctomole level, which should allow characterization of this metabolic pathway in single cells.
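
The unit conversions behind these figures are easy to check. The short Python sketch below is only a worked arithmetic aid with hypothetical helper names; it converts the quoted amounts to molecule counts using the Avogadro constant and derives the concentration implied by 150 zmol in a 120 pL injection.

```python
AVOGADRO = 6.02214076e23  # molecules per mole (exact SI value)


def moles_to_molecules(moles):
    """Convert an amount of substance in moles to a number of molecules."""
    return moles * AVOGADRO


def molar_concentration(moles, litres):
    """Concentration in mol/L for a given amount and volume."""
    return moles / litres


if __name__ == "__main__":
    print(moles_to_molecules(1e-24))              # 1 ymol, about 0.6 molecules
    print(moles_to_molecules(150e-24))            # 150 ymol, about 90 molecules
    print(molar_concentration(150e-21, 120e-12))  # 150 zmol in 120 pL, about 1.25 nM
```

A mass detection limit of 150 ymol therefore corresponds to roughly 90 labeled molecules, and the injected homogenate corresponds to a total labeled-species concentration of about 1.25 nM.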

20.
Despite the increasing detection of emerging substances in the environment, the identities of most remain unknown due to the lack of efficient identification methods. We developed a non-target analysis method for identifying unknown substances in the environment by liquid chromatography/high-resolution mass spectrometry (LC/HRMS) with a product ion and neutral loss database (PNDB). The present analysis describes an elucidation method based on the elemental compositions of the molecule, product ions, and corresponding neutral losses of the unknown substance: (1) with the molecular formula, possible molecular structures are retrieved from two chemical structure databases (PubChem and ChemSpider); then (2) with the elemental compositions of product ions and neutral losses, possible partial structures are retrieved from the PNDB; and finally, (3) molecular structures that match the possible partial structures are listed in order of number of hits. A molecular structure with a higher number of hits is more similar to the structure of the analyzed substance. The performance of the non-target method was evaluated by simulated analysis of 150 LC/HRMS spectra registered in MassBank. First, all substances of the same mass data (41/41) and 68% (39/57) of the mass data of the same substances not registered in the PNDB were elucidated. It was demonstrated that 14% (7/52) and 31% (16/52) of the substances with no mass spectral data registered in the PNDB were obtained at the first and within the fifth place, respectively. Owing to the fact that 10 of the total hits occurred in product ions and neutral losses, almost 50% of the substances evaluated with this method were placed at the top 4 positions in the similarity ranking. Importantly, the proposed method is effective for analyzing mass spectral data that has not been registered in the PNDB and thus is expected to be used for a variety of non-target analyses.
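
Step (3) of this procedure, listing candidate structures in order of the number of partial-structure hits, amounts to a set-overlap ranking. The following Python sketch illustrates that idea only: the candidate names and fragment labels are invented placeholders, the function name is hypothetical, and the database lookups of steps (1) and (2) against PubChem, ChemSpider and the PNDB are assumed to have been done already.

```python
def rank_candidates(candidate_fragments, observed_fragments):
    """Rank candidate structures by how many observed partial structures they contain.

    candidate_fragments: mapping of candidate name -> set of partial structures
    observed_fragments:  set of partial structures inferred from product ions
                         and neutral losses
    Returns (name, hit_count) pairs, highest hit count first; ties are broken
    alphabetically so the order is deterministic.
    """
    scored = [(name, len(frags & observed_fragments))
              for name, frags in candidate_fragments.items()]
    return sorted(scored, key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    # Placeholder data for illustration only.
    candidates = {
        "candidate_A": {"phenyl", "carboxyl", "amine"},
        "candidate_B": {"phenyl", "sulfonyl"},
        "candidate_C": {"carboxyl"},
    }
    observed = {"phenyl", "carboxyl", "hydroxyl"}
    for name, hits in rank_candidates(candidates, observed):
        print(name, hits)
```

Counting hits as a plain set intersection treats every partial structure as equally informative; weighting rarer fragments more heavily would be one possible refinement, though the abstract does not describe one.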

