Similar articles
 Found 20 similar articles; search took 15 ms
1.
2.
3.
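Support vector machine classification and regression for QSAR studies of peptides   Total citations: 4, self-citations: 0, citations by others: 4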
周鹏  曾晖  李波  周原  李志良 《化学通报》2006,69(5):342-346
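Support vector machine (SVM) classification and regression were applied to two peptide systems and compared systematically with k-nearest neighbors, multiple linear regression, partial least squares, and artificial neural networks. The results show that, for small-sample, nonlinear problems, SVM is comparatively stable and generalizes well, and in most cases yields better models than the traditional methods. For classification, SVM achieved 100% accuracy on both the training set and the test set. For regression, SVM fit the training set slightly less well than the artificial neural network but showed stronger predictive ability on the external test set.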

4.
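VHSE, a vector of amino acid structural descriptors, and its application in peptide QSAR   Total citations: 8, self-citations: 0, citations by others: 8
Starting from 50 physicochemical properties of the 20 natural amino acids, the properties were grouped into hydrophobic, steric, and electronic categories, and principal component analysis was performed on each group separately. The resulting score vectors, VHSE (principal component score vector of hydrophilicity, steric, and electronic properties), were used as amino acid structural descriptors in quantitative structure-activity relationship (QSAR) studies of peptides. Compared with existing methods, VHSE descriptors have a clear physicochemical meaning and give results that are easier to interpret. Combining the descriptors with stepwise-regression variable selection and partial least squares modeling gave results better than those previously reported for systems including bitter-tasting dipeptides and bradykinin-potentiating peptides.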
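The group-wise procedure described in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the property table is made-up toy data (not real amino acid properties), only the first principal component per group is kept for brevity, and the names `TOY_PROPERTIES`, `first_pc_scores`, `build_descriptors`, and `encode_peptide` are hypothetical.

```python
import math

# Hypothetical toy property table (NOT real amino acid data): each residue has
# two made-up property values in each of three property groups.
TOY_PROPERTIES = {
    "A": {"hydrophobic": [0.2, 1.1], "steric": [1.0, 0.8], "electronic": [0.1, 0.3]},
    "G": {"hydrophobic": [0.1, 0.7], "steric": [0.5, 0.4], "electronic": [0.0, 0.2]},
    "L": {"hydrophobic": [1.3, 2.0], "steric": [2.1, 1.9], "electronic": [0.2, 0.1]},
    "K": {"hydrophobic": [-1.0, 0.2], "steric": [2.4, 2.6], "electronic": [1.5, 1.2]},
}
GROUPS = ["hydrophobic", "steric", "electronic"]


def autoscale(rows):
    """Scale each column to zero mean and unit sample standard deviation."""
    n = len(rows)
    cols = list(zip(*rows))
    means = [sum(c) / n for c in cols]
    sds = [math.sqrt(sum((v - m) ** 2 for v in c) / (n - 1))
           for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, sds)] for row in rows]


def first_pc_scores(rows, iterations=200):
    """Scores on the first principal component, found by power iteration.

    The sign of a principal component is arbitrary, so scores may come out
    negated relative to another PCA routine.
    """
    z = autoscale(rows)
    p = len(z[0])
    w = [1.0 / (j + 1) for j in range(p)]  # arbitrary non-symmetric start vector
    for _ in range(iterations):
        t = [sum(a * b for a, b in zip(row, w)) for row in z]               # t = Z w
        w = [sum(z[i][j] * t[i] for i in range(len(z))) for j in range(p)]  # w = Z^T t
        norm = math.sqrt(sum(v * v for v in w))
        w = [v / norm for v in w]
    return [sum(a * b for a, b in zip(row, w)) for row in z]


def build_descriptors(table, groups):
    """Run PCA separately on each property group and collect one score per group."""
    residues = sorted(table)
    descriptors = {r: [] for r in residues}
    for g in groups:
        scores = first_pc_scores([table[r][g] for r in residues])
        for r, s in zip(residues, scores):
            descriptors[r].append(s)
    return descriptors


def encode_peptide(sequence, descriptors):
    """Represent a peptide by concatenating the descriptors of its residues."""
    return [v for residue in sequence for v in descriptors[residue]]


descriptors = build_descriptors(TOY_PROPERTIES, GROUPS)
dipeptide = encode_peptide("AL", descriptors)
print(len(dipeptide))  # 2 residues x 3 groups = 6 values
```

Keeping the groups separate, rather than running one PCA over all properties, is what lets each score be read as a hydrophobic, steric, or electronic quantity. The concatenated vector would then serve as the input row for a regression model such as PLS.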

5.
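Support vector machines for a quantitative structure-activity study of the toxicity of polychlorinated naphthalenes   Total citations: 2, self-citations: 0, citations by others: 2
Partial least squares (PLS) with leave-one-out cross-validation was used to select descriptors, including polarizability, molecular weight, net charges on certain atoms, and electrostatic potential, from more than 90 quantum chemical parameters. Support vector machines (SVM) were then used to build quantitative structure-activity relationship models for three sets of toxicity data on 20 polychlorinated naphthalene congeners. The squared cross-validated correlation coefficients of the models were 0.805, 0.890, and 0.936. Comparison with models built by PLS showed that SVM had better predictive ability than PLS.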

6.
7.
8.
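A new amino acid descriptor and its application in peptide QSAR   Total citations: 11, self-citations: 0, citations by others: 11
A new amino acid descriptor, VSTV (principal component scores vector of structural and topological variables), was obtained by principal component analysis of 25 structural and topological variables of the natural amino acids. The descriptor was used to parameterize molecular structure in three systems: angiotensin-converting enzyme inhibitors (dipeptides), antimicrobial 18-peptides, and thromboplastin inhibitors (6- to 12-peptides). Quantitative structure-activity relationship (QSAR) models were then built by partial least squares regression (PLSR), giving results better than those in the literature. The multiple correlation coefficients (R²) and cross-validated multiple correlation coefficients (Q²) of the three models were 0.789 and 0.767; 0.996 and 0.879; and 0.981 and 0.480, respectively.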
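For readers less familiar with the two statistics quoted above, the sketch below shows one common way to compute a fitted R² and a leave-one-out cross-validated Q². It makes several assumptions: a one-variable ordinary least-squares line stands in for PLS regression, the data are made-up toy values, Q² is taken as 1 - PRESS/SS_tot with the overall mean of y in SS_tot (conventions vary between papers), and the function names are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y = a + b*x (a simple stand-in for PLS)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    b = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
    return mean_y - b * mean_x, b


def r_squared(xs, ys):
    """R^2: fraction of variance explained when fitting on all samples."""
    a, b = fit_line(xs, ys)
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


def q_squared_loo(xs, ys):
    """Q^2: like R^2, but each sample is predicted by a model fit without it."""
    mean_y = sum(ys) / len(ys)
    press = 0.0  # predictive residual sum of squares
    for i in range(len(xs)):
        a, b = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (a + b * xs[i])) ** 2
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - press / ss_tot


# Made-up toy data, only to exercise the functions.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [1.1, 1.9, 3.2, 3.8, 5.1, 6.0]
r2 = r_squared(xs, ys)
q2 = q_squared_loo(xs, ys)
print(f"R2 = {r2:.3f}, Q2 = {q2:.3f}")
```

A large gap between R² and Q², as in the third model above (0.981 versus 0.480), usually suggests that the model fits the training samples much better than it predicts left-out ones.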

9.
A voltammetric method is proposed for the simultaneous determination of tryptophan, cysteine, and tyrosine using multivariate calibration techniques. Various electrodes and voltammetric techniques were explored to ascertain the optimum measurement strategy. Among them, differential pulse voltammetry (DPV) with a Pt electrode was selected as the analytical technique, since it provided a suitable compromise between sensitivity and reproducibility while allowing the oxidation peaks of the three compounds to be reasonably discriminated. The sensitivity of DPV with a Pt electrode for Trp standards was 8.4×10⁻² A L mol⁻¹, the repeatability was 3.7%, and the detection limit was below 10⁻⁷ M. The lack of full selectivity of the voltammetric data was overcome by using multivariate calibration methods that exploit the differences in the voltammetric waves of each compound. The accuracy of predictions was evaluated preliminarily from the analysis of three-component synthetic mixtures. Subsequently, the method was applied to the analysis of oxidizable amino acids in feed samples. The results were in good agreement with those given by the standard method using an amino acid analyzer.

10.
11.
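A new 3D amino acid descriptor, VSGETAWAY [vector of principal component scores for GETAWAY (geometry, topology and atom-weights assembly)], was derived by principal component analysis of 197 GETAWAY indices of the 20 natural amino acids. It was used to characterize the structures of 48 bitter-tasting dipeptides, 31 bradykinin-potentiating peptides, and 20 thromboplastin inhibitors, and quantitative structure-activity relationship (QSAR) models for the three systems were built by partial least squares (PLS). The multiple correlation coefficients (R²cum) and cross-validated multiple correlation coefficients (Q²cum) were 0.887 and 0.753; 0.995 and 0.708; and 0.999 and 0.802, respectively. The results indicate that the VSGETAWAY descriptor is simple to use and expresses structure well, and that it may become an effective structural characterization method for QSAR studies of peptide drugs.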

12.
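Starting from 171 physicochemical properties of the 20 natural amino acids, the properties were grouped by hydrophobic, steric, and electronic features and hydrogen-bond contributions, and principal component analysis was performed on each group separately, giving a new descriptor, VHSEH (principal component score vector of hydrophobic, steric, electronic properties and hydrogen bond contributions). The structures of oxytocin were characterized, and PLS quantitative sequence-activity models were built using partial least squares with D-optimal sample partitioning, giving R² values of 0.958 and 0.957 and Q² values of 0.903 and 0.845, slightly higher than those of the VHSE descriptor model. Antimicrobial peptides were also characterized, and PLS and OSC-PLS models were built, with R² of 0.84 and 0.995 and Q² of 0.546 and 0.926, respectively, better than the results obtained with the SZOTT descriptor. A QSAM study of 58 angiotensin-converting enzyme inhibitors gave R², Q², and RMS values of 0.877, 0.838, and 0.361, respectively. The results indicate that the VHSEH descriptor carries a large amount of information, has a clear physicochemical meaning, and gives results that are easier to interpret.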

13.
Both the concept and the model of snug quantitative structure-activity relationship (QSAR) were proposed and developed for molecular design by constructing QSAR models based on a known mode of receptor/ligand interaction. Many disadvantages of traditional models can be avoided with the proposed method, because traditional models depend only on the molecular structural features of the sample sets themselves. Using this idea, a genetic virtual screening of peptide/protein combinations (GVSPPC) is proposed for the first time to examine peptide/protein affinity activities. A genetic algorithm (GA) was developed for screening combinative targets with an interaction mode for virtual receptors. GVSPPC overcomes difficulties in rational QSAR, making it possible to search for ligand/receptor interactions when the structures are unknown. Several bioactive oligo-/polypeptide systems, covering 58 angiotensin converting enzyme (ACE) inhibitors and 18 double-site mutation residues in the camel antibody protein cAb-Lys3, were investigated by GVSPPC with satisfactory results (R²cu > 0.91, Q²cv > 0.86, E_RMS = 0.19-0.95), which demonstrates that GVSPPC is more interpretable in terms of the ligand-receptor interaction than the traditional QSAR method.

14.
15.
16.
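The amino acid descriptor SZOTT for quantitative sequence-activity modeling of polypeptides   Total citations: 1, self-citations: 0, citations by others: 1
Building on related work, a new amino acid descriptor, SZOTT, is proposed; it carries a large amount of information and is simple to use. It was applied to sequence characterization of two peptide systems, and models built with partial least squares and orthogonal signal correction-partial least squares gave good results.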

17.
18.
19.
Interaction of dipropyltin(IV) with selected amino acids, peptides, dicarboxylic acids or DNA constituents was investigated using potentiometric techniques. Amino acids form 1:1 and 1:2 complexes and, in some cases, protonated complexes. The amino acid is bound to dipropyltin(IV) by the amino and carboxylate groups. Serine is complexed to dipropyltin(IV) with ionization of the alcoholic group. A relationship exists between the acid dissociation constant of the amino acids and the formation constants of the corresponding complexes. Dicarboxylic acids form both 1:1 and 1:2 complexes. Diacids forming five- and six-membered chelate rings are the most stable. Peptides form complexes with stoichiometric coefficients 111 (MLH), 110 (ML) and 11-1 (MLH₋₁) (tin:peptide:H⁺). The mode of coordination is discussed based on existing data and previous investigations. DNA constituents inosine, adenosine, uracil, uridine, and thymine form 1:1 and 1:2 complexes and the binding sites are assigned. Inosine 5′-monophosphate, guanosine 5′-monophosphate, adenosine 5′-monophosphate and adenine form protonated species in addition to 1:1 and 1:2 complexes. The protonation sites and tin-binding sites were elucidated. Cytosine and cytidine do not form complexes with dipropyltin(IV) due to low basicity of the donor sites. The stepwise formation constants of the complexes formed in solution were calculated using the non-linear least-squares program MINIQUAD-75. The concentration distribution of the various complex species was evaluated as a function of pH.

20.
A novel near-infrared (NIR) modeling method, Laplacian regularized least squares regression (LapRLSR), is presented. It can exploit many unlabeled spectra to improve the prediction performance of the model even when only a few calibration samples are available. Using LapRLSR modeling, NIR spectral analysis was applied to online monitoring of the concentration of salvia acid B in the column separation of Salvianolate. The results demonstrated that LapRLSR significantly outperformed partial least squares (PLS) and that NIR online analysis was applicable.
