1.
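A quantitative structure-property relationship (QSPR) model was built between the molecular structures of 147 organic compounds and their thermal conductivities, in order to explore the structural factors that affect the thermal conductivity of organic compounds. Of the 147 compounds in the sample set, 118 were randomly selected as the training set and 29 as the test set. Constitutional, topological, geometrical, electrostatic, and quantum-chemical descriptors were calculated with the CODESSA software. The heuristic method (HM) was used to select five structural parameters and build a linear regression model; the same five parameters were then used as inputs to a support vector machine (SVM) to build a nonlinear SVM regression model. The results show that although the performance of the SVM regression model (multiple correlation coefficient R² = 0.9240) is slightly lower than that of the heuristic regression model (R² = 0.9267), the predictive performance of the SVM method (R² = 0.9682) is higher than that of the heuristic method (R² = 0.9574), and for a QSPR model predictive performance matters more. Overall, therefore, the SVM method outperforms the heuristic method. The two methods give engineers a new way to predict the thermal conductivity of organic compounds from molecular structure.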
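The R² figures quoted in this abstract compare model performance with predictive performance. As a rough illustration only, the sketch below computes a coefficient of determination for two sets of predictions. The function name `r_squared` and every numeric value are hypothetical, and the abstract does not say whether its R² is defined this way or as a squared correlation coefficient.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or len(y_true) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_y = sum(y_true) / len(y_true)
    # Residual sum of squares: squared error of the predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: spread of the measured values around their mean.
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("y_true has zero variance")
    return 1.0 - ss_res / ss_tot


# Hypothetical thermal conductivities for a small held-out set (invented values).
measured = [0.110, 0.125, 0.140, 0.160, 0.185]
model_a = [0.112, 0.124, 0.143, 0.158, 0.183]  # invented predictions, model A
model_b = [0.118, 0.120, 0.149, 0.152, 0.190]  # invented predictions, model B

print("model A R^2:", round(r_squared(measured, model_a), 4))
print("model B R^2:", round(r_squared(measured, model_b), 4))
```

Computing the same statistic separately on the training set and on the test set is what allows the kind of comparison the abstract draws between fit and predictive performance.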
3.
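A quantitative structure-activity relationship (QSAR) study was carried out on 89 benzisothiazole and benzothiazine non-nucleoside inhibitors of the hepatitis C virus (HCV) NS5B polymerase. Two feature selection methods, a genetic algorithm combined with partial least squares (GA-PLS) and linear stepwise regression analysis (LSRA), were used to select the optimal descriptor subsets, from which multiple linear regression and partial least squares linear regression models were built. In addition, a genetic algorithm coupled with a support vector machine (GA-SVM) was used for the first time to build nonlinear SVM regression models on the descriptor subsets chosen by each of the two feature selection methods. The models built with all three machine learning methods gave satisfactory predictions. The three QSAR models built on the 6 descriptors selected by LSRA gave test-set correlation coefficients of 0.958-0.962, with GA-SVM giving the best prediction accuracy (0.962). The three QSAR models built on the 7 descriptors selected by GA-PLS gave test-set correlation coefficients of 0.918-0.960, with the partial least squares regression model performing best (0.960). This work provides an effective approach to predicting the biological activity of HCV inhibitors, and the approach can be extended to other similar QSAR studies.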
6.
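The UV-Vis absorption spectra of 160 organic molecules were calculated with two low-level quantum chemical methods, B3LYP/STO-3G and ZINDO. Suitable physical parameters were then extracted and, with experimental values as the reference, the least squares support vector machine (LS-SVM) method was introduced to improve the accuracy of the calculated absorption energies. The results show that LS-SVM effectively improves the accuracy of the quantum chemical calculations: the root-mean-square errors of the absorption energies fell from 0.95 and 0.46 eV to 0.16 and 0.15 eV, respectively. With the LS-SVM correction, results that are more stable and accurate than those of a quantum chemical method alone can be obtained with less machine time and fewer computing resources, and accuracy beyond what current computational capability can reach becomes attainable under existing computing conditions. Applying LS-SVM to the analysis of quantum chemical data therefore gives chemical research a new means of predicting molecular properties accurately and quickly.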
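The correction step described here can be pictured as a regression from computed to experimental absorption energies. The sketch below is a minimal pure-Python LS-SVM regressor under stated assumptions: an RBF kernel, a single input feature, hand-picked `gamma` and `sigma`, and invented energies. The abstract does not give the paper's actual descriptors, kernel, or hyperparameters, so none of these choices should be read as the authors' setup.

```python
import math


def rbf(a, b, sigma):
    """Gaussian (RBF) kernel between two feature vectors."""
    d2 = sum((p - q) ** 2 for p, q in zip(a, b))
    return math.exp(-d2 / (2.0 * sigma ** 2))


def solve(A, rhs):
    """Solve A x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [row[:] + [r] for row, r in zip(A, rhs)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def lssvm_fit(X, y, gamma=100.0, sigma=1.0):
    """Solve the LS-SVM system [[0, 1^T], [1, K + I/gamma]] [b; alpha] = [0; y]."""
    n = len(X)
    A = [[0.0] + [1.0] * n]
    for i in range(n):
        row = [1.0] + [rbf(X[i], X[j], sigma) for j in range(n)]
        row[i + 1] += 1.0 / gamma  # ridge term on the kernel diagonal
        A.append(row)
    sol = solve(A, [0.0] + list(y))
    return sol[0], sol[1:]  # bias b, dual weights alpha


def lssvm_predict(x, X, b, alpha, sigma=1.0):
    """Prediction f(x) = b + sum_i alpha_i * k(x, x_i)."""
    return b + sum(a * rbf(x, xi, sigma) for a, xi in zip(alpha, X))


def rmse(pred, ref):
    """Root-mean-square error between two equal-length sequences."""
    return math.sqrt(sum((p - r) ** 2 for p, r in zip(pred, ref)) / len(ref))


# Invented data: computed vs. "experimental" absorption energies in eV.
calc = [[2.0], [2.5], [3.0], [3.5], [4.0]]
expt = [2.4, 2.8, 3.3, 3.7, 4.3]

b, alpha = lssvm_fit(calc, expt)
raw = [c[0] for c in calc]
corrected = [lssvm_predict(c, calc, b, alpha) for c in calc]
print("RMSE before correction:", round(rmse(raw, expt), 3))
print("RMSE after correction: ", round(rmse(corrected, expt), 3))
```

The errors printed here are in-sample; a fair estimate of the improvement would evaluate the corrected energies on molecules held out from the fit.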
8.
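The ubiquitin-proteasome system in eukaryotic cells plays an important role in the antigen peptide processing and presentation pathway. To study the cleavage-site specificity of the proteasome further, a model for predicting proteasome cleavage sites was built with the support vector machine (SVM) method. On the same test set, the model outperformed other prediction methods, with a prediction accuracy of 83.1%. From the weight coefficients with which the amino acids in the sample data contribute to cleavage-site formation, the cleavage specificity of the amino acids at the proteasome cleavage site and in its flanking regions was derived. This reflects information on the interactions involved when the proteasome cleaves antigen proteins, and indicates that proteasomal cleavage of antigen proteins is not random but follows certain patterns and selectivity.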
14.
At present, there are a number of methods for predicting T-cell epitopes and major histocompatibility complex (MHC)-binding peptides. Despite their number, limitations remain that affect the reliability of the prevailing methods, so the development of highly accurate models is crucial. Accurate prediction of the peptides that bind to specific MHC class I and class II (MHC-I and MHC-II) molecules is important for understanding how the immune system functions and for developing peptide-based vaccines. Peptide binding is the most selective step in identifying T-cell epitopes. In this paper, we present a new approach to predicting MHC-binding ligands that takes into account new weighting schemes for position-based amino acid frequencies, BLOSUM and VOGG substitution of amino acids, and the physicochemical and molecular properties of amino acids. We have built models for predicting MHC-binding ligands both quantitatively and qualitatively. The models are based on two machine learning methods, support vector machine (SVM) and support vector regression (SVR), and use feature selection together with several different encoding and weighting schemes for peptides. The resulting models showed performance comparable to, and in some cases better than, the best existing predictors. The results indicate that the physicochemical and molecular properties of amino acids (AA) contribute significantly to peptide-binding affinity.
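To make the idea of position-based amino acid frequencies concrete, the sketch below builds a plain, unweighted per-position frequency table from a handful of peptides and uses it to encode a peptide as a numeric vector. The peptides, the function names, and the encoding are hypothetical; the paper's own weighting schemes are not described in the abstract and are not reproduced here.

```python
from collections import Counter


def position_frequencies(peptides):
    """Return, for each position, a dict mapping amino acid -> relative frequency."""
    if not peptides:
        raise ValueError("need at least one peptide")
    length = len(peptides[0])
    if any(len(p) != length for p in peptides):
        raise ValueError("all peptides must have the same length")
    n = len(peptides)
    table = []
    for i in range(length):
        counts = Counter(p[i] for p in peptides)
        table.append({aa: c / n for aa, c in counts.items()})
    return table


def encode(peptide, table):
    """Encode a peptide as the frequency of its residue at each position."""
    if len(peptide) != len(table):
        raise ValueError("peptide length does not match the frequency table")
    # Residues never seen at a position get frequency 0.0.
    return [table[i].get(aa, 0.0) for i, aa in enumerate(peptide)]


# Invented 3-residue "binders", kept short so the table is easy to read.
binders = ["ALA", "ALG", "GLA", "ALA"]
table = position_frequencies(binders)
print(encode("GLA", table))
```

A vector like this could serve as one input block for an SVM or SVR, alongside substitution-matrix or physicochemical encodings of the same peptide.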
15.
Analytical Letters, 2012, 45(18): 2849-2859
A novel method was developed for the quality control of Ephedrae herba by near-infrared (NIR) spectroscopy. First, qualitative models established by discriminant analysis and support vector machine were used for the preliminary screening of unqualified samples of E. herba. Then quantitative models of ephedrine and the total alkaloids (ephedrine and pseudoephedrine) were established by partial least squares regression and by particle swarm optimization based least squares support vector machine. The contents of the test samples were predicted with the established NIR quantitative models. The accuracy in identifying unqualified samples was 98.9% with discriminant analysis and 100% with support vector machine. The particle swarm optimization based least squares support vector machine models performed better than the partial least squares regression models. In the calibration sets of the particle swarm optimization based least squares support vector machine models, both correlation coefficients were above 0.98 and the relative standard errors of calibration were below 9%. For the test sets, both correlation coefficients were above 0.93 and the relative standard errors of prediction were below 13%, indicating satisfactory predictions. These results demonstrate that NIR spectroscopy may be a powerful tool for the quality control of E. herba.
16.
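The amino acid descriptor VHSE (principal component score vector of hydrophobic, steric, and electronic properties) was used to characterize the structures of 613 antigenic nonapeptides. On this basis, a support vector machine combined with stepwise-regression variable selection was used to build a model that predicts the binding affinity of antigenic peptides for the transporter associated with antigen processing (TAP). The optimal linear SVM model gave R², Q², and R²ext values of 0.7386, 0.7270, and 0.6057, respectively. Analysis of the model shows that electronic properties are the primary factor affecting TAP affinity, followed by steric and hydrophobic properties. The physicochemical properties of the amino acids at positions P1 (N-terminus), P2, P7, and P9 (C-terminus) of the substrate nonapeptide strongly influence TAP affinity, whereas positions P3, P4, P5, and P6 contribute relatively little to the model and position P8 is unrelated to activity. Based on the optimal model's predictions of TAP affinity for simulated point-mutated nonapeptides, combined with variable loading analysis, the substrate selection specificity of TAP was analyzed and summarized.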
18.
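Kernel-based nonlinear classification correlative analysis and its application to chemical pattern recognition (cited 1 time in total: 0 self-citations, 1 citation by others)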
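Compared with statistical analysis and neural networks, support vector machines, which are based on structural risk minimization, offer better classification performance. When used for nonlinear classification, an SVM first maps the samples into a higher-dimensional feature space. This often increases multicollinearity and redundant information, which affects the sample distribution and lowers the predictive performance of the linear support vector classifier (LSVC). This study proposes a nonlinear classification correlative analysis algorithm (NLCCA) that uses kernel functions, without needing the explicit form of the nonlinear mapping, to extract classification-related components from the images of the samples in feature space, thereby removing redundant information and improving the sample distribution. The resulting NLCCA-LSVC integrated classifier has excellent predictive performance. Tests on simulated data and application to two complex chemical pattern recognition problems both gave satisfactory results, which confirms the effectiveness of the algorithm.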
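The remark that kernel functions make the explicit form of the nonlinear mapping unnecessary can be checked on a small case. The sketch below is not the NLCCA algorithm; it only shows, for a degree-2 polynomial kernel on 2-D inputs (a choice made here for illustration), that evaluating the kernel in the input space gives the same number as mapping both points into feature space and taking the inner product there.

```python
import math


def dot(u, v):
    """Ordinary inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def poly2_kernel(x, y):
    """Homogeneous degree-2 polynomial kernel: k(x, y) = (x . y)^2."""
    return dot(x, y) ** 2


def explicit_map(x):
    """Explicit feature map for 2-D input: (x1^2, sqrt(2)*x1*x2, x2^2)."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2)


x, y = (1.0, 2.0), (3.0, 4.0)
via_kernel = poly2_kernel(x, y)                        # computed in input space
via_mapping = dot(explicit_map(x), explicit_map(y))    # computed in feature space
print(via_kernel, via_mapping)
```

For kernels such as the Gaussian kernel the corresponding feature space is infinite-dimensional, so working through kernel evaluations alone is the only practical route.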
19.
Protein–protein interactions (PPIs) play essential roles in many biological processes. In protein–protein interaction networks, hubs are involved in large numbers of PPIs and may constitute an important source of drug targets. Intrinsically disordered proteins (IDPs), which have unstable structures, can promote the promiscuity of hubs and are also involved in many disease pathways, so they too could serve as potential drug targets. Moreover, proteins with similar functions, as measured by the semantic similarity of gene ontology (GO) terms, tend to interact with each other. Here, the relationship between hub proteins and drug targets was explored on the basis of GO terms and intrinsic disorder. The semantic similarities of GO terms and genes between two proteins, and the fraction of intrinsically disordered residues in each protein, were extracted as features to characterize the functional similarity between two interacting proteins. Using only 8 feature variables, support vector machine (SVM) models were constructed to predict PPIs. The accuracy of the model on PPI data from human hub proteins is as high as 83.72%, which is very promising compared with other PPI prediction models that use hundreds or even thousands of features. For 118 of 142 PPIs between hubs, the model correctly predicts that the two interacting proteins are targets of the same drugs. The results indicate that only 8 functional features are sufficient to represent PPIs. To identify new targets in the IDP dataset, the PPIs between hubs and IDPs were predicted with the SVM model, which yields a prediction accuracy of 75.84%. Further analysis shows that for 3 of 5 PPIs between hubs and IDPs, the model correctly predicts that the two interacting proteins are targets of the same drugs. All results demonstrate that the model, with only 8-dimensional features from GO terms and intrinsic disorder, still performs well in predicting PPIs and in further identifying drug targets.