Similar articles
1.
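A quantitative structure-property relationship (QSPR) model was built relating the molecular structures of 147 organic compounds to their thermal conductivities, in order to explore the structural factors that affect the thermal conductivity of organic compounds. Of the 147 compounds in the sample set, 118 were randomly selected as the training set and 29 as the test set. Constitutional, topological, geometrical, electrostatic, and quantum-chemical descriptors were calculated with the CODESSA software; five structural parameters were selected by the heuristic method (HM) and used to build a linear regression model. The same five parameters were then used as inputs to a support vector machine (SVM) to build a nonlinear support vector regression model. The results show that although the performance of the fitted SVM regression model (squared multiple correlation coefficient R2 = 0.9240) is slightly lower than that of the heuristic regression model (R2 = 0.9267), the predictive performance of the SVM method (R2 = 0.9682) is higher than that of the heuristic method (R2 = 0.9574). Because predictive performance matters more for a QSPR model, the SVM method is better overall than the heuristic method. Together, the two methods offer engineering practice a new way to predict the thermal conductivity of organic compounds from molecular structure.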
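The random 118/29 split and the R2 comparison described in the abstract above can be sketched in a few lines. This is only an illustration: the function names, the seed, and the use of integer stand-ins for the compounds are assumptions, and R2 is computed here as the coefficient of determination, which may differ from the exact statistic the authors report.

```python
import random

def split_train_test(samples, n_test, seed=0):
    """Randomly partition samples into (train, test) with n_test test items."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    return shuffled[n_test:], shuffled[:n_test]

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Integer IDs stand in for the 147 compounds; no real data is used here.
compound_ids = list(range(147))
train_ids, test_ids = split_train_test(compound_ids, n_test=29)
print(len(train_ids), len(test_ids))
```

Evaluating `r_squared` separately on the training and test predictions of each model would give the two pairs of figures that the abstract compares.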

2.
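This paper applies a support vector machine method that combines a genetic algorithm with the conjugate gradient method (GA-CG-SVM) to build a classification model for predicting drug-induced phospholipidosis. The descriptors were first optimized, and 19 descriptors were selected for model building. The resulting model achieved a prediction accuracy of 81.6% on the training set and 87.5% on the test set, indicating that the SVM classification model not only correctly predicts drug-induced phospholipidosis for the training set but also, for other compounds, has...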

3.
丛湧  薛英 《物理化学学报》2013,29(8):1639-1647
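A quantitative structure-activity relationship (QSAR) study was carried out on 89 benzisothiazole and benzothiazine non-nucleoside inhibitors of hepatitis C virus (HCV) NS5B polymerase. Two feature selection methods, a genetic algorithm combined with partial least squares (GA-PLS) and linear stepwise regression analysis (LSRA), were used to select the optimal descriptor subsets, and multiple linear regression and partial least squares linear regression models were then built. For the first time, a genetic algorithm coupled with a support vector machine (GA-SVM) was also used to build nonlinear support vector regression models on the descriptor subsets chosen by each of the two feature selection methods. The models built by all three machine learning methods gave fairly satisfactory predictions. The three QSAR models built with the 6 descriptors selected by LSRA gave test-set correlation coefficients of 0.958-0.962, with GA-SVM giving the best prediction accuracy (0.962). The three QSAR models built with the 7 descriptors selected by GA-PLS gave test-set correlation coefficients of 0.918-0.960, with the partial least squares regression model performing best (0.960). This work provides an effective way to predict the biological activity of HCV inhibitors, and the approach can be extended to other similar QSAR studies.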

4.
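Support vector regression modeling method for predicting effective mobility in capillary zone electrophoresis   Cited by: 3 (self-citations: 0; citations by others: 3)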
康宇飞  瞿海斌  沈朋  程翼宇 《分析化学》2004,32(9):1151-1155
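A support vector regression modeling method is proposed for predicting migration behavior in capillary electrophoresis. With nucleosides as the test case, data from an orthogonal experimental design were combined with the two-marker technique, and the support vector regression algorithm was used to build a model relating column temperature, voltage, buffer concentration, and pH in capillary zone electrophoresis to the effective mobilities of three nucleosides. Compared with partial least squares regression and artificial neural networks, the model predicts more accurately than either and is well suited to predicting migration behavior in capillary electrophoresis.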

5.
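Principal component analysis (PCA) was used to extract features from liver function test data, and a support vector machine (SVM) was then used to build a classification model for hepatitis B, hepatitis C, liver cirrhosis, and healthy samples. A Gaussian radial basis function (RBF) was used as the kernel, and the parameters C and σ were tuned to obtain the best SVM model. The model achieved a recognition rate of 99.3% on the training set and a prediction rate of 96.4% on the prediction set. The results show that the PCA-SVM classification model distinguishes hepatitis B, hepatitis C, liver cirrhosis, and healthy subjects well, and that it classifies better than a conventional SVM or an artificial neural network (ANN) classification model.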
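To make the role of σ in the abstract above concrete, here is a minimal sketch of a Gaussian RBF kernel. The parameterization exp(-||x - y||^2 / (2σ^2)) is an assumption, since the abstract does not give the exact form, and the two sample vectors are invented stand-ins for PCA scores. In the usual SVM formulation, C is the penalty on margin violations rather than a kernel parameter; it does not appear in the kernel itself.

```python
import math

def rbf_kernel(x, y, sigma):
    """Gaussian RBF kernel: exp(-||x - y||^2 / (2 * sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

# Hypothetical 2-component PCA scores for two samples (not real data).
u, v = [0.0, 0.0], [1.0, 1.0]
for sigma in (0.5, 1.0, 2.0):
    # A larger sigma makes distant samples look more similar.
    print(sigma, rbf_kernel(u, v, sigma))
```

A small σ makes the kernel nearly zero for all but very close samples, which tends to overfit; a large σ flattens it, which is why σ is typically tuned jointly with C.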

6.
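The UV-Vis absorption spectra of 160 organic molecules were calculated with two low-level quantum chemical methods, B3LYP/STO-3G and ZINDO. Suitable physical parameters were then extracted and, with experimental values as the reference, the least squares support vector machine (LS-SVM) method was introduced to improve the accuracy of the calculated absorption energies. The results show that LS-SVM effectively improves the accuracy of the quantum chemical calculations: the root-mean-square errors of the absorption energies fell from 0.95 and 0.46 eV to 0.16 and 0.15 eV, respectively. With the LS-SVM correction, results that are more stable and accurate than those of a single quantum chemical method can be obtained with less machine time and fewer computing resources, and an accuracy beyond the reach of current computing power can be predicted under existing computing conditions. Applying LS-SVM to the analysis of quantum chemical data therefore gives chemical research a new means of predicting molecular properties accurately and quickly.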
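The abstract above does not describe how the LS-SVM correction is set up, so the following is only a sketch of the standard LS-SVM regression formulation, in which training reduces to solving one linear system for the bias b and the coefficients alpha. The RBF kernel, the values of `gamma` and `sigma`, the single input feature, and all numbers in `X` and `y` are invented for illustration and are not the authors' data or settings.

```python
import math

def rbf(x, y, sigma):
    """Gaussian RBF kernel between two feature vectors."""
    return math.exp(-sum((a - b) ** 2 for a, b in zip(x, y)) / (2.0 * sigma ** 2))

def solve(A, rhs):
    """Solve A z = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [row[:] + [r] for row, r in zip(A, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * z[c] for c in range(r + 1, n))
        z[r] = (M[r][n] - s) / M[r][r]
    return z

def lssvm_fit(X, y, gamma, sigma):
    """Solve [[0, 1^T], [1, K + I/gamma]] [b; alpha] = [0; y]."""
    n = len(X)
    A = [[0.0] + [1.0] * n]
    for i in range(n):
        row = [1.0] + [rbf(X[i], X[j], sigma) for j in range(n)]
        row[i + 1] += 1.0 / gamma  # ridge term on the diagonal of K
        A.append(row)
    sol = solve(A, [0.0] + list(y))
    return sol[1:], sol[0]

def lssvm_predict(X, alpha, b, sigma, x):
    """f(x) = b + sum_i alpha_i * K(x_i, x)."""
    return b + sum(a * rbf(xi, x, sigma) for a, xi in zip(alpha, X))

# Invented numbers: one feature (a computed energy, eV) -> reference value (eV).
X = [[2.0], [2.5], [3.0], [3.5], [4.0]]
y = [2.3, 2.7, 3.1, 3.6, 4.2]
gamma, sigma = 100.0, 1.0  # hypothetical hyperparameters
alpha, b = lssvm_fit(X, y, gamma, sigma)
print(lssvm_predict(X, alpha, b, sigma, [3.2]))
```

In this formulation every training sample gets a nonzero coefficient, and the training residual of sample i equals alpha_i / gamma, so a larger `gamma` fits the reference values more tightly.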

7.
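G protein-coupled receptors (GPCRs) are widely involved in regulating physiological processes, and half of the small-molecule drugs currently on the market target GPCRs. Because crystal structures of GPCRs are scarce, using theoretical methods to classify and predict the G protein coupling specificity of these receptors has considerable academic and practical value in drug development. This work therefore uses a pattern recognition approach: starting from GPCR sequences, and building on the pseudo amino acid algorithm and a genetic algorithm, a support vector machine model was built to predict GPCR coupling specificity, reaching a prediction accuracy of up to 82.5%.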

8.
刘涛  宋哲  焦春波  刘伟  朱鸣华  王晓钢 《化学学报》2008,66(21):2341-2347
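The ubiquitin-proteasome system in eukaryotic cells plays an important role in the antigen peptide processing and presentation pathway. To further study the cleavage-site specificity of the proteasome, a support vector machine (SVM) model was built to predict proteasomal cleavage sites. On the same test set, the model's performance was compared with that of other prediction methods and found to be superior, with a prediction accuracy of 83.1%. From the weight coefficients that the amino acids in the sample data contribute to cleavage-site formation, the cleavage specificity of the amino acids at the proteasomal cleavage site and in its flanking regions was derived. This reflects the interactions involved when the proteasome cleaves antigen proteins and shows that proteasomal cleavage of antigen proteins is not random but follows certain patterns and selectivity.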

9.
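Prediction of Henry's law constants of organic compounds by the support vector machine method   Cited by: 4 (self-citations: 2; citations by others: 4)
Using four physicochemical parameters as input variables, namely the molar volume V, the dipolarity term π*, the hydrogen-bond donor acidity αm, and the hydrogen-bond acceptor basicity βm, the support vector machine method was applied to the quantitative prediction of the Henry's law constants of 72 organic compounds. The study found that the support vector machine method can build a model from relatively few samples and still predict well. Its predictions are far better than those of linear regression.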

10.
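Predicting the covalent binding of small molecules to proteins is important for covalent-binding-based drug screening. Current structure-based virtual screening tools mainly address non-covalent docking of drugs and targets. Building on earlier work, this paper uses a bidirectional long short-term memory (BiLSTM) recurrent network with a self-attention mechanism, both deep learning techniques, to predict from experimental data the ability of small molecules to bind proteins covalently. Covalent binding between proteins and small-molecule ligands occurs mainly at cysteine or serine residues. Because more data are available for cysteine binding, this paper focuses on predicting covalent binding of the cysteine type. Compared with traditional machine learning models such as random forest, logistic regression, and support vector machine models, the deep learning model shows a marked improvement in predictive ability.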

11.
12.
13.
14.
At present, there are a number of methods for predicting T-cell epitopes and major histocompatibility complex (MHC)-binding peptides. Despite their number, limitations remain that affect the reliability of the prevailing methods. For this reason, the development of highly accurate models is crucial. Accurate prediction of the peptides that bind to specific MHC class I and II (MHC-I and MHC-II) molecules is important for understanding how the immune system functions and for developing peptide-based vaccines. Peptide binding is the most selective step in identifying T-cell epitopes. In this paper, we present a new approach to predicting MHC-binding ligands that takes into account new weighting schemes for position-based amino acid frequencies, BLOSUM and VOGG substitution of amino acids, and the physicochemical and molecular properties of amino acids. We have built models for predicting MHC-binding ligands both quantitatively and qualitatively. The models are based on two machine learning methods, support vector machine (SVM) and support vector regression (SVR), and use several different encoding and weighting schemes for peptides in feature selection. The resulting models showed performance comparable to, and in some cases better than, that of the best existing predictors. The results indicate that the physicochemical and molecular properties of amino acids (AA) contribute significantly to the peptide-binding affinity.
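The abstract above does not specify its weighting schemes, so the following is a generic sketch of position-based amino acid frequencies with optional per-peptide weights, not the authors' method. The peptide strings, function names, and the uniform default weights are all invented for illustration.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def position_frequencies(peptides, weights=None):
    """Weighted amino acid frequency at each position of equal-length peptides."""
    if weights is None:
        weights = [1.0] * len(peptides)
    length = len(peptides[0])
    total = sum(weights)
    table = []
    for pos in range(length):
        counts = Counter()
        for pep, w in zip(peptides, weights):
            counts[pep[pos]] += w
        table.append({aa: counts[aa] / total for aa in AMINO_ACIDS})
    return table

def encode(peptide, table):
    """Encode a peptide as the frequency of its own residue at each position."""
    return [table[pos][aa] for pos, aa in enumerate(peptide)]

binders = ["ALAKA", "ALGKV", "SLAKV", "ALAEV"]  # invented toy peptides
freqs = position_frequencies(binders)
print(encode("ALAKV", freqs))
```

Passing non-uniform `weights` (for example, larger values for stronger binders) is one way such a table could be biased toward high-affinity peptides; the resulting vectors could then be fed to an SVM or SVR.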

15.
《Analytical letters》2012,45(18):2849-2859
ABSTRACT

A novel method was developed for the quality control of Ephedrae herba by near-infrared (NIR) spectroscopy. First, qualitative models established by discriminant analysis and support vector machine were used for the preliminary screening of unqualified samples of E. herba. Then quantitative models of ephedrine and the total alkaloids (ephedrine and pseudoephedrine) were established by partial least squares regression and by particle swarm optimization based least squares support vector machine. The contents of test samples were predicted by the established NIR quantitative models. The accuracy in identifying unqualified samples was 98.9% by discriminant analysis and 100% by support vector machine. The particle swarm optimization based least squares support vector machine models performed better than the partial least squares regression models. For the calibration sets of the particle swarm optimization based least squares support vector machine models, both correlation coefficients were above 0.98 and the relative standard errors of calibration were below 9%. For the test sets, both correlation coefficients were above 0.93 and the relative standard errors of prediction were below 13%, indicating satisfactory predictions. These results demonstrate that NIR spectroscopy may be a powerful tool for the quality control of E. herba.

16.
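The amino acid descriptor VHSE (principal component score vector of hydrophobic, steric, and electronic properties) was used to characterize the structures of 613 antigenic nonapeptides. On this basis, a support vector machine combined with stepwise-regression variable selection was used to build a model predicting the binding affinity of antigenic peptides for the transporter associated with antigen processing (TAP). The R2, Q2, and R2ext of the best linear support vector machine model were 0.7386, 0.7270, and 0.6057, respectively. Analysis of the model shows that electronic properties are the primary factor affecting TAP affinity, followed by steric and hydrophobic properties. The physicochemical properties of the amino acids at positions P1 (N-terminus), P2, P7, and P9 (C-terminus) of the substrate nonapeptide strongly affect TAP affinity, whereas positions P3, P4, P5, and P6 contribute relatively little to the model and position P8 is unrelated to activity. Based on the best model's predictions of TAP affinity for simulated point-mutated nonapeptides, together with a variable loading analysis, the substrate selection specificity of TAP was analyzed and summarized.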
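Characterizing a nonapeptide with a per-residue descriptor, as in the abstract above, amounts to concatenating one descriptor vector per position, which is also what lets model weights be traced back to positions P1 to P9. The sketch below assumes that layout. The `TOY_DESCRIPTORS` table holds obviously artificial placeholder values, not VHSE scores, and both function names are hypothetical.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Placeholder table: two made-up components per residue, NOT real VHSE values.
TOY_DESCRIPTORS = {aa: [float(i), float(-i)] for i, aa in enumerate(AMINO_ACIDS)}

def encode_peptide(peptide, descriptors):
    """Concatenate per-residue descriptor vectors, position by position."""
    features = []
    for residue in peptide:
        features.extend(descriptors[residue])
    return features

def feature_position(index, n_per_residue):
    """Map a 0-based feature index to (1-based position, 1-based component)."""
    return index // n_per_residue + 1, index % n_per_residue + 1

vector = encode_peptide("ACDEFGHIK", TOY_DESCRIPTORS)
print(len(vector), feature_position(len(vector) - 1, 2))
```

With k descriptor components per residue, a 9-mer yields 9k features, and `feature_position` recovers which position and which component a selected variable belongs to.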

17.
18.
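Compared with statistical analysis and neural networks, support vector machines, which are based on structural risk minimization, classify better. For nonlinear classification, the samples are first mapped into a higher-dimensional feature space, which often increases multicollinearity and redundant information; this affects the sample distribution and lowers the predictive performance of the linear support vector classifier (LSVC). This study proposes a nonlinear classification correlation analysis (NLCCA) algorithm that uses kernel function techniques, without needing the explicit form of the nonlinear mapping, to extract classification-related components from the sample images in feature space, thereby removing redundant information and improving the sample distribution. The resulting combined NLCCA-LSVC classifier has excellent predictive performance. Tests on simulated data and application to two complex chemical pattern recognition problems gave satisfactory results and confirmed the effectiveness of the algorithm.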
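The claim above that kernel techniques avoid the explicit nonlinear mapping can be checked on a small case. This sketch does not implement NLCCA; it only shows, for a homogeneous degree-2 polynomial kernel in two dimensions (chosen here for illustration), that the kernel value equals the inner product of explicitly mapped vectors.

```python
import math

def poly2_kernel(x, y):
    """Homogeneous degree-2 polynomial kernel: (x . y)^2."""
    return (x[0] * y[0] + x[1] * y[1]) ** 2

def phi(x):
    """Explicit feature map whose inner product reproduces poly2_kernel."""
    return [x[0] ** 2, math.sqrt(2.0) * x[0] * x[1], x[1] ** 2]

def dot(a, b):
    """Plain inner product of two equal-length vectors."""
    return sum(p * q for p, q in zip(a, b))

x, y = [1.0, 2.0], [3.0, -1.0]
# The two values agree up to floating-point rounding.
print(poly2_kernel(x, y), dot(phi(x), phi(y)))
```

The kernel works on the original two coordinates, while `phi` needs three; for kernels such as the Gaussian RBF the mapped space is infinite-dimensional, so only the kernel route is practical.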

19.
Protein–protein interactions (PPIs) play essential roles in many biological processes. In protein–protein interaction networks, hubs are involved in large numbers of PPIs and may constitute an important source of drug targets. Intrinsically disordered proteins (IDPs), whose structures are unstable, can promote the promiscuity of hubs and are also involved in many disease pathways, so they too could serve as potential drug targets. Moreover, proteins with similar functions, as measured by the semantic similarity of gene ontology (GO) terms, tend to interact with each other. Here, the relationship between hub proteins and drug targets was explored on the basis of GO terms and intrinsic disorder. The semantic similarities of GO terms and genes between two proteins, and the fraction of intrinsically disordered residues in each protein, were extracted as features to characterize the functional similarity between two interacting proteins. Using only 8 feature variables, support vector machine (SVM) models were constructed to predict PPIs. The accuracy of the model on the PPI data from human hub proteins is as high as 83.72%, which is very promising compared with other PPI prediction models that use hundreds or even thousands of features. For 118 of 142 PPIs between hubs, the model then correctly predicted that the two interacting proteins are targets of the same drugs. The results indicate that 8 functional features alone are sufficient to represent PPIs. To identify new targets in the IDP dataset, the PPIs between hubs and IDPs were predicted by the SVM model, which yielded a prediction accuracy of 75.84%. Further analysis shows that, for 3 of 5 PPIs between hubs and IDPs, the model correctly predicted that the two interacting proteins are targets of the same drugs. All results demonstrate that the model, with only 8-dimensional features from GO terms and intrinsic disorder, still performs well in predicting PPIs and in further identifying drug targets.

20.
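Prediction of the Gibbs free energy of formation of organic compounds based on the support vector machine method; support vector machine; multiple linear regression; Gibbs free energy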
