Similar literature
 Found 20 similar articles (search time: 31 ms)
1.
2.
Various in vitro and in-silico methods have been used for drug genotoxicity tests, but they show limited genotoxicity (GT+) and non-genotoxicity (GT−) identification rates. New methods and combinatorial approaches have been explored for enhanced collective identification capability. The rates of in-silico methods may be further improved by significantly diversified training data enriched by the large number of recently reported GT+ and GT− compounds, but a major concern is the increased noise level arising from the high false-positive rates of in vitro data. In this work, we evaluated the effect of training data size and noise level on the performance of the support vector machine (SVM) method, which is known to tolerate high noise levels in training data. Two SVMs of different diversity/noise levels were developed and tested. H-SVM, trained on higher-diversity, higher-noise data (GT+ in any in vivo or in vitro test), outperforms L-SVM, trained on lower-noise, lower-diversity data (GT+ in in vivo or Ames test only). H-SVM trained on 4,763 GT+ compounds reported before 2008 and 8,232 GT− compounds excluding clinical trial drugs correctly identified 81.6% of the 38 GT+ compounds reported since 2008, predicted 83.1% of the 2,008 clinical trial drugs as GT−, and predicted 23.96% of 168K MDDR and 27.23% of 17.86M PubChem compounds as GT+. These are comparable to the 43.1–51.9% GT+ and 75–93% GT− rates of existing in-silico methods, the 58.8% GT+ and 79% GT− rates of the Ames method, and the estimated percentages of 23% in vivo and 31–33% in vitro GT+ compounds in the “universe of chemicals”. There is a substantial level of agreement between the H-SVM and L-SVM predictions of GT+ and GT− MDDR compounds and the predictions from TOPKAT. SVM showed good potential for identifying GT+ compounds from large compound libraries based on higher-diversity, higher-noise training data.

3.
Tyrosine sulfation is a post‐translational modification of many secreted and membrane‐bound proteins. It governs protein‐protein interactions that are involved in leukocyte adhesion, hemostasis, and chemokine signaling. However, the intrinsic features of sulfated proteins remain elusive. This investigation presents SulfoSite, a computational method based on a support vector machine (SVM) for predicting protein sulfotyrosine sites. The approach was developed to consider structural information such as the secondary structure and solvent accessibility of the amino acids that surround the sulfotyrosine sites. One hundred sixty‐two experimentally verified tyrosine sulfation sites were identified using UniProtKB/SwissProt release 53.0. The results of a five‐fold cross‐validation evaluation suggest that solvent accessibility around the sulfotyrosine sites contributes substantially to predictive accuracy. The SVM classifier achieves an accuracy of 94.2% in five‐fold cross‐validation when a sequence positional weighted matrix (PWM) is coupled with values of the accessible surface area (ASA). The proposed method significantly outperforms previous methods for predicting the location of tyrosine sulfation sites. © 2009 Wiley Periodicals, Inc. J Comput Chem, 2009

4.
5.
High-resolution mass spectrometry is a promising technique in non-target screening (NTS) to monitor contaminants of emerging concern in complex samples. Current chemical identification strategies in NTS experiments typically depend on spectral libraries, chemical databases, and in silico fragmentation tools. However, small molecule identification remains challenging due to the lack of orthogonal sources of information (e.g., unique fragments). Collision cross section (CCS) values measured by ion mobility spectrometry (IMS) offer an additional identification dimension to increase the confidence level. Thanks to advances in analytical instrumentation, increasing application of IMS coupled with high-resolution mass spectrometry (HRMS) in NTS has been reported in recent decades. Several CCS prediction tools have been developed. However, few CCS prediction methods have been based on a broad range of chemical classes and cross-platform CCS measurements. We developed two prediction models using a random forest machine learning algorithm. One approach was based on the chemicals’ super classes; the other predicted CCS directly from molecular fingerprints. Over 13,324 CCS values from six different laboratories and PubChem, obtained with a variety of ion-mobility separation techniques, were used for training and testing the models. The test accuracy for all the prediction models was over 0.85, and the median relative residual was around 2.2%. The models can be applied to different IMS platforms to eliminate false positives in small molecule identification.
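The "median relative residual" quoted in the abstract above can be illustrated with a minimal sketch. It assumes the residual is defined as |predicted − measured| / measured, expressed as a percentage; the abstract does not give the formula. The function name and the CCS values below are hypothetical and are not taken from the paper.

```python
from statistics import median


def median_relative_residual(measured, predicted):
    """Return the median of |predicted - measured| / measured, in percent.

    Assumed definition; the source abstract does not spell out the formula.
    """
    if len(measured) != len(predicted):
        raise ValueError("measured and predicted must have the same length")
    residuals = [abs(p - m) / m * 100.0 for m, p in zip(measured, predicted)]
    return median(residuals)


# Hypothetical CCS values (in square angstroms), for illustration only.
measured_ccs = [150.0, 200.0, 180.0]
predicted_ccs = [153.0, 195.0, 181.8]

mrr = median_relative_residual(measured_ccs, predicted_ccs)
print(f"median relative residual: {mrr:.2f}%")
```

The median is less sensitive than the mean to a few badly predicted compounds, which may be why it is the statistic reported.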

6.
A novel method of cell affinity screening (CAS), cell affinity capture coupled with LC‐MS analysis, was developed for screening the bioactive compounds related to cardiovascular diseases from natural product libraries. One of its major characteristics is that it captures and separates the bioactive components from the natural product libraries by affinity in vitro. Another is that it analyzes and identifies the target compounds by high‐performance liquid chromatography and mass spectrometry. CAS was used for screening the bioactive components from the alkaloid extract derived from Aconitum szechenyianum Gay. Of the five components found to be bound to the oxidatively damaged endothelial cells, the two compounds identified, mesaconitine and aconitine, are recognized in the literature as being related to cardiovascular diseases. Copyright © 2008 John Wiley & Sons, Ltd.

7.
A total of 200 properties related to structural characteristics were employed to represent the structures of 400 HA proteins of influenza virus as training samples. Recognition models for HA proteins of avian influenza virus (AIV) were developed using support vector machine (SVM) and linear discriminant analysis (LDA). With LDA, the identification accuracy (Ria) is 99.8% for the training samples and 99.5% by leave‐one‐out cross‐validation. With SVM, Ria is 99.8% for the training samples and 99.3% by leave‐one‐out cross‐validation. A further 200 external HA proteins of influenza virus were used to validate the external predictive power of the resulting models. The external Ria is 95.5% by LDA and 96.5% by SVM, which shows that HA proteins of AIVs are well recognized by both SVM and LDA, and that the performance of SVM is superior to that of LDA.

8.
9.
Chemical libraries contain thousands of compounds that need screening, which increases the need for computational methods that can rank or prioritize compounds. Virtual screening tools are widely used to improve the cost effectiveness of lead drug discovery programs by ranking chemical compound databases in decreasing probability of biological activity, based on the probability ranking principle (PRP). In this paper, we developed a novel ranking approach for molecular compounds inspired by quantum mechanics, called the quantum probability ranking principle (QPRP). The QPRP ranking criteria attempt to draw an analogy between the physical experiment and the molecular structure ranking process for 2D fingerprints in ligand-based virtual screening (LBVS). The development of the QPRP criteria in LBVS employed quantum concepts at three levels. First, at the representation level, the model develops a new framework of molecular representation by connecting the molecular compounds with a mathematical quantum space. Second, the similarity between chemical libraries and references is estimated with a quantum-based similarity searching method. Finally, the molecules are ranked using the QPRP approach. Simulated virtual screening experiments with MDL Drug Data Report (MDDR) data sets showed that QPRP outperformed the classical probability ranking principle (PRP) for molecular chemical compounds.
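As a point of reference for the classical baseline mentioned above, similarity-based ranking of 2D fingerprints can be sketched as follows. This is not the QPRP method. It is a minimal illustration that assumes fingerprints are stored as sets of "on" bit indices and that the Tanimoto coefficient is the similarity measure, neither of which is stated in the abstract. All names and fingerprints are hypothetical.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient between two fingerprints given as sets of on-bits."""
    union = len(fp_a | fp_b)
    return len(fp_a & fp_b) / union if union else 0.0


def rank_by_similarity(reference, library):
    """Rank library molecules by decreasing similarity to the reference."""
    scored = [(name, tanimoto(reference, fp)) for name, fp in library.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical fingerprints: each set holds the indices of bits that are set.
reference_fp = {1, 2, 3, 4}
library_fps = {
    "mol_c": {7, 8},
    "mol_a": {1, 2, 3, 4},
    "mol_b": {1, 2, 5},
}

ranking = rank_by_similarity(reference_fp, library_fps)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

Under a probability-ranking view, the similarity score acts as a proxy for the probability of activity, so the top of the list is screened first.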

10.
Biocatalytic halogenation with tryptophan halogenases is hampered by severe limitations such as low activity and stability. These drawbacks can be overcome by directed evolution, but for screening large mutant libraries, a facile high‐throughput method is required. Therefore, we developed a quantitative halogenase assay based on a Suzuki–Miyaura cross‐coupling towards the formation of a fluorescent aryltryptophan. The technique was optimized for application in crude E. coli lysate without intermediary purification steps, and was used for quantitatively monitoring the formation of halogenated tryptophans with high specificity by facile fluorescence screening in microtiter plates. This novel screening approach was exploited to engineer a thermostable tryptophan 6‐halogenase. Libraries were constructed by error‐prone PCR and selected for improved thermal resistance simply by fluorogenic cross‐coupling. Our method led to an enzyme variant with substantially increased thermal stability and 2.5‐fold improved activity.

11.
An ion mobility quadrupole time‐of‐flight mass spectrometry‐based pesticide suspect screening methodology was developed and validated according to the current European Commission document SANTE/11813/2017. It covers 20 plant‐derived food matrices from six commodity groups of different complexity and applies a QuEChERS sample preparation protocol. The method combines ultra‐performance liquid chromatography, traveling wave ion mobility, and quadrupole time‐of‐flight mass spectrometry. Besides the determination of the physicochemical property collision cross‐section and the establishment of a corresponding scientific suspect screening database comprising 280 pesticides, different protomers, sodium adducts, and dimers were identified for several pesticides in ion mobility spectrometry traces. Additionally, collision cross‐section values were included in the validation requirements regarding chromatography and mass spectrometry for the detection of pesticides. A collision cross‐section value window was analyzed within a tolerable error of ±2%. For this cross‐matrix validation, screening detection limits were determined at concentration levels of 0.100 mg/kg (84% of the original pesticide scope), 0.010 mg/kg (56%), and 0.001 mg/kg (21%). Ion mobility spectrometry improved compound identification independently of the commodity of concern and the analyte concentration level, because applying a collision cross‐section range reduces false assignments.
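The ±2% collision cross‐section window described above amounts to a simple acceptance check. The sketch below assumes the tolerance is taken relative to the database (reference) value. The function name and the numbers are hypothetical.

```python
def within_ccs_window(measured_ccs, reference_ccs, tolerance_pct=2.0):
    """Return True if the measured CCS lies within +/- tolerance_pct
    of the reference (database) CCS value."""
    allowed_deviation = reference_ccs * tolerance_pct / 100.0
    return abs(measured_ccs - reference_ccs) <= allowed_deviation


# Hypothetical database value and two hypothetical measurements.
reference = 150.0
print(within_ccs_window(152.0, reference))  # deviation of about 1.3%
print(within_ccs_window(154.0, reference))  # deviation of about 2.7%
```

A candidate that passes the retention time and accurate mass criteria but fails this check would be treated as a likely false assignment.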

12.
13.
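A new combined method is proposed for β-turn prediction and feature analysis. The method consists of two steps: characterizing the features of β-turns and building a prediction model for them. In the first step, the factor analysis scale of generalized amino acid information is used to characterize the structural features of β-turns in proteins; the scale covers amino acid hydrophobicity, α-helix and turn propensities, bulk properties, compositional characteristics, local flexibility, and electrostatic properties. In the second step, a β-turn prediction model is built with a support vector machine, using 426 proteins as the training set and leave-1/7-out (seven-fold) cross-validation. The model successfully predicted the β-turns of two sets of 547 and 823 proteins, respectively. The results are comparable to those of the methods used for comparison and, more importantly, the SVM model provides important structural information about β-turn features. The combined method may be further applied to protein structure prediction and feature analysis.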

14.
High-throughput screening (HTS) plays a pivotal role in lead discovery for the pharmaceutical industry. In tandem, cheminformatics approaches are employed to increase the probability of identifying novel biologically active compounds by mining the HTS data. HTS data is notoriously noisy, and therefore the selection of the optimal data mining method is important for the success of such an analysis. Here, we describe a retrospective analysis of four HTS data sets using three mining approaches: Laplacian-modified naive Bayes, recursive partitioning, and support vector machine (SVM) classifiers, with increasing stochastic noise in the form of false positives and false negatives. All three data mining methods tolerated increasing levels of false positives, even when the ratio of misclassified compounds to true active compounds was 5:1 in the training set. False negatives at a ratio of 1:1 were tolerated as well. SVM outperformed the other two methods in capturing active compounds and scaffolds in the top 1%. A Murcko scaffold analysis could explain the differences in enrichment among the four data sets. This study demonstrates that data mining methods can add true value to a screen even when the data is contaminated with a high level of stochastic noise.
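One way to picture the false-positive noise described above is to relabel a random subset of inactive training compounds as active until the ratio of misclassified to true active compounds reaches a chosen value, such as 5:1. The sketch below is an assumption about how such noise could be injected, not the procedure used in the study. The function name and the label list are hypothetical.

```python
import random


def inject_false_positives(labels, ratio, seed=0):
    """Return a copy of labels (1 = active, 0 = inactive) in which randomly
    chosen inactives are relabeled as active, so that the number of
    mislabeled compounds is about `ratio` times the number of true actives."""
    rng = random.Random(seed)
    n_true_actives = sum(labels)
    inactive_indices = [i for i, y in enumerate(labels) if y == 0]
    n_flip = min(len(inactive_indices), round(ratio * n_true_actives))
    noisy = list(labels)
    for i in rng.sample(inactive_indices, n_flip):
        noisy[i] = 1
    return noisy


# Hypothetical training labels: 2 true actives and 20 inactives.
clean_labels = [1] * 2 + [0] * 20
noisy_labels = inject_false_positives(clean_labels, ratio=5)
print(sum(clean_labels), "actives before;", sum(noisy_labels), "after noise")
```

A classifier trained on `noisy_labels` and evaluated against the clean labels then shows how much enrichment survives the contamination.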

15.
16.
17.
Methods for the rapid and inexpensive discovery of hit compounds are essential for pharmaceutical research, and DNA‐encoded chemical libraries represent promising tools for this purpose. Here we report on the design and synthesis of DAL‐100K, a DNA‐encoded chemical library containing 103 200 structurally compact compounds. Affinity screening experiments and DNA‐sequencing analysis provided ligands with nanomolar affinities to several proteins, including prostate‐specific membrane antigen and tankyrase 1. Correlations of sequence counts with binding affinities and potencies of enzyme inhibition were observed and enabled the identification of structural features critical for activity. These results indicate that libraries of this type represent a useful source of small‐molecule binders for target proteins of pharmaceutical interest and of information on structural features important for binding.

18.
Comparative molecular field analysis (CoMFA), a three-dimensional quantitative structure-activity relationship (3D-QSAR) method, was applied to a series of diindolylmethane (DIM) analogs to study the relationship between their structure and their induction of CYP 1A1-associated ethoxyresorufin-O-deethylase (EROD) activity. A DISCO pharmacophore model was derived to guide the superposition of the compounds. The cross-validated (q^2) and non-cross-validated (r^2) coefficients of the resulting model are 0.827 and 0.988, respectively; the variance ratio (F) is 103.53 and the standard error of estimate (SEE) is 0.044. These values indicate that the derived CoMFA model is significant and may predict the catalytic activity of DIM compounds well. As a consequence, the predicted activity values of the newly designed compounds were all higher than the reported values.
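The difference between the non-cross-validated r^2 and the cross-validated q^2 quoted above can be shown with a small sketch. CoMFA models are normally fitted by partial least squares on field descriptors. Here a one-descriptor ordinary least-squares line stands in for the model, and q^2 is computed by leave-one-out as 1 − PRESS / SS_total, using the mean of all observed activities. These choices, the function names, and the data are assumptions made for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def total_sum_of_squares(ys):
    mean_y = sum(ys) / len(ys)
    return sum((y - mean_y) ** 2 for y in ys)


def r_squared(xs, ys):
    """Non-cross-validated r^2: fit on all data, predict the same data."""
    slope, intercept = fit_line(xs, ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    return 1.0 - ss_res / total_sum_of_squares(ys)


def q_squared(xs, ys):
    """Leave-one-out q^2: each point is predicted by a model fitted without it."""
    press = 0.0
    for i in range(len(xs)):
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (slope * xs[i] + intercept)) ** 2
    return 1.0 - press / total_sum_of_squares(ys)


# Hypothetical descriptor values and activities, for illustration only.
descriptor = [1.0, 2.0, 3.0, 4.0, 5.0]
activity = [1.1, 1.9, 3.2, 3.9, 5.1]

r2 = r_squared(descriptor, activity)
q2 = q_squared(descriptor, activity)
print(f"r^2 = {r2:.3f}, q^2 = {q2:.3f}")
```

Because each left-out point is predicted by a model that never saw it, q^2 cannot exceed r^2 for a least-squares fit, which matches the pattern of 0.827 versus 0.988 in the abstract.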

19.
20.
In this paper, two 3‐dimensional quantitative structure‐activity relationship models for 60 human immunodeficiency virus (HIV)‐1 protease inhibitors were established using random sampling analysis on molecular surface and translocation comparative molecular field vector analysis (Topomer CoMFA). The non‐cross‐validated coefficient (r2), cross‐validated coefficient (q2), correlation coefficient of external validation (Q2ext), and F value of the two models were 0.94, 0.80, 0.79, and 198.84 and 0.94, 0.72, 0.75, and 208.53, respectively. The results indicated that the two models were reasonable and had good predictive ability. Topomer Search was used to search for R groups in the ZINC database, 20 new compounds were designed, and the Topomer CoMFA model was used to predict their biological activity. The results showed that 18 of the new compounds were more active than the template molecule. Topomer Search is therefore effective in screening and can guide the design of new HIV/AIDS drugs. The mechanism of action was studied by molecular docking, which showed that the protease inhibitors interact with the Ile50, Asp25, and Arg8 sites of HIV‐1 protease. These results provide insight for the design of new potent inhibitors of HIV‐1 protease.
