Found 20 similar articles (search time: 15 ms)
1.
2.
3.
4.
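Quantitative structure-property relationships between topological-quantum indices and the gas chromatographic retention indices and boiling points of aldehydes and ketones (total citations: 3; self-citations: 0; citations by others: 3)
Based on a study of the molecular structural features of aldehydes and ketones, and of how their gas chromatographic retention indices (RI) and boiling points relate to molecular structure, five topological-quantum structural parameters are proposed: the molecular polarizability effect index (MPEI), the odd-even index (OEI), the steric effect index (SVij), the vertex degree-distance index (VDI), and the eigenvalue of the bond connectivity matrix (∑X1CH). Multiple linear regression (MLR) gives good quantitative structure-property relationship (QSPR) models that relate these indices to the boiling points of aldehydes and ketones and to their retention indices on chromatographic columns of different polarity; all correlation coefficients exceed 0.99. The five parameters have a clear physicochemical meaning and are easy to compute and apply. Comparison with published studies indicates that the model equations derived from these parameters are suitable for predicting the retention indices and boiling points of all kinds of aldehydes and ketones, with good stability and accuracy.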
5.
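Variable selection methods in quantitative structure-activity/property relationship studies: a comparison of the genetic algorithm with several traditional algorithms (total citations: 1; self-citations: 0; citations by others: 1)
Given the importance of variable selection in QSAR/QSPR studies, the genetic algorithm is compared with several traditional methods: forward selection, backward elimination, and stepwise regression. For the data used in the study, the genetic algorithm performs better than the traditional methods, possibly because the traditional methods become trapped in local optima. The genetic algorithm shows its advantages in efficiency and result quality only when the number of variables is large. For variable selection, the genetic algorithm is an effective method worth recommending.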
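The genetic search described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `ga_select`, the operators (binary tournament, one-point crossover, bit-flip mutation, elitism), and the toy fitness function are all assumptions. In a real QSAR/QSPR study the fitness would typically be a cross-validated regression score for the selected descriptors.

```python
import random

def tournament(pop, fitness, rng):
    """Binary tournament: return the fitter of two randomly drawn masks."""
    a, b = rng.sample(pop, 2)
    return a if fitness(a) >= fitness(b) else b

def ga_select(n_vars, fitness, pop_size=20, generations=40, p_mut=0.05, seed=0):
    """Search for a boolean variable mask that maximizes `fitness`."""
    rng = random.Random(seed)
    pop = [[rng.random() < 0.5 for _ in range(n_vars)] for _ in range(pop_size)]
    best = max(pop, key=fitness)
    for _ in range(generations):
        children = [best[:]]  # elitism: the best mask so far always survives
        while len(children) < pop_size:
            p1 = tournament(pop, fitness, rng)
            p2 = tournament(pop, fitness, rng)
            cut = rng.randrange(1, n_vars)  # one-point crossover
            child = p1[:cut] + p2[cut:]
            # bit-flip mutation with probability p_mut per variable
            child = [(not g) if rng.random() < p_mut else g for g in child]
            children.append(child)
        pop = children
        best = max(pop, key=fitness)
    return best

# Hypothetical toy fitness: variables 1, 4 and 7 are "informative";
# every selected variable costs 1, every informative one earns 2.
INFORMATIVE = {1, 4, 7}

def toy_fitness(mask):
    hits = sum(1 for i in INFORMATIVE if mask[i])
    return 2 * hits - sum(mask)

best_mask = ga_select(10, toy_fitness)
```

Unlike forward, backward, or stepwise selection, which change one variable at a time, crossover and mutation can alter several variables in one move, which is one plausible reason a genetic search is less prone to the local optima mentioned above.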
6.
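Spectral sample data are often disturbed by environmental noise and by other components, so wavelength selection should be performed to improve analytical accuracy. The near-infrared region is wide and the resulting search space is too large for a genetic algorithm to be applied directly to wavelength selection. This study therefore proposes first using moving window partial least squares (MWPLS) to pre-select informative intervals from the wide spectral region, and then using an improved iterative genetic algorithm (IGA) to select the optimal informative sub-intervals from them. MWPLS scans the full spectral region with a moving window and locates informative intervals well. IGA takes the continuity and correlation of spectral data into account: it runs several rounds of GA and uses the smoothed selection result of the previous round as prior knowledge for initializing the population of the next round. The contiguous wavelength points selected in this way are used as independent variables for PLS modeling, which markedly simplifies the model while retaining some data redundancy, so the model is robust and the analytical accuracy is high. Applied to the near-infrared analysis of wheat moisture, the method works well and its predictive performance is clearly better than that of other methods.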
7.
8.
9.
10.
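Taking a self-assembled library of angiotensin I-converting enzyme (ACE) inhibitory peptides as the object of study, each peptide sample was structurally characterized with the amino acid descriptor SVHEHS (scores vector of hydrophobic, electronic, hydrogen bonds and steric properties) and then processed by auto cross covariances (ACC). QSAR models of the ACE inhibitory peptides were built with three methods: multiple linear regression (MLR), partial least squares regression (PLS), and artificial neural networks (ANN). The correlation coefficients (R²) of the MLR, PLS, and ANN models were 0.744, 0.862, and 0.958; the leave-one-out cross-validated correlation coefficients (Q²LOO) were 0.532, 0.829, and 0.948; and the externally validated correlation coefficients (Q²ext) were 0.567, 0.632, and 0.634, respectively. The authors conclude that SVHEHS combined with any of the three modeling methods is suitable for QSAR studies of ACE inhibitory peptides, with ANN giving the best modeling results.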
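The ACC step above turns peptides of different lengths into feature vectors of equal length. The abstract does not give the formula, so the sketch below assumes the commonly used definition ACC_jk(lag) = (1 / (n - lag)) Σ z_j(i) · z_k(i + lag), summed over residues i = 1 … n - lag. The function name `acc_transform` and the descriptor values are hypothetical, not SVHEHS scores.

```python
def acc_transform(residue_desc, max_lag):
    """Auto and cross covariances of per-residue descriptors.

    residue_desc: list of n descriptor vectors (one per residue, d values each).
    Returns max_lag * d * d features, independent of the peptide length n.
    """
    n = len(residue_desc)
    d = len(residue_desc[0])
    if max_lag >= n:
        raise ValueError("max_lag must be smaller than the peptide length")
    features = []
    for lag in range(1, max_lag + 1):
        for j in range(d):          # descriptor at position i
            for k in range(d):      # descriptor at position i + lag
                total = sum(residue_desc[i][j] * residue_desc[i + lag][k]
                            for i in range(n - lag))
                features.append(total / (n - lag))
    return features

# Made-up descriptor values for a tripeptide and a tetrapeptide (d = 2).
tripeptide = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
tetrapeptide = [[0.5, -1.0], [1.5, 0.0], [-0.5, 2.0], [1.0, 1.0]]

acc_tri = acc_transform(tripeptide, max_lag=1)    # [9.0, 11.0, 13.0, 16.0]
acc_tetra = acc_transform(tetrapeptide, max_lag=1)
```

Both peptides yield four features, so they can be placed in the same descriptor matrix for MLR, PLS, or ANN modeling.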
11.
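Using ordinary maize kernels as the test material, characteristic wavelengths were selected from near-infrared spectral data by a genetic algorithm combined with partial least squares regression, and a PLS calibration model was then built to determine the starch content of maize kernels from those wavelengths. The calibration model based on 11 characteristic wavelengths gave a calibration error (RMSEC), cross-validation error (RMSECV), and prediction error (RMSEP) of 0.30%, 0.35%, and 0.27%, respectively. The correlation coefficients between predicted and measured values were 0.9279 for the calibration set and 0.9390 for the independent test set. Compared with the model built on the full spectrum, prediction accuracy improved in every case, indicating that spectral feature selection with a genetic algorithm and PLS yields simpler and better models, and offers a new approach to the near-infrared determination of starch in maize kernels and to the processing of infrared spectral data.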
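The error measures reported in this abstract share one form and differ only in which samples they are computed on. The sketch below assumes the plain definition RMSE = sqrt(Σ(y - ŷ)² / n); some chemometrics texts divide RMSEC by a degrees-of-freedom term instead, and the abstract does not say which convention was used. The numbers are made up for illustration.

```python
import math

def rmse(measured, predicted):
    """Root mean square error; RMSEC, RMSECV and RMSEP differ only in
    whether the predictions come from calibration, cross-validation,
    or an independent test set."""
    n = len(measured)
    return math.sqrt(sum((m - p) ** 2 for m, p in zip(measured, predicted)) / n)

def pearson_r(x, y):
    """Correlation coefficient between predicted and measured values."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical starch contents (%) for four test samples.
measured = [70.0, 71.0, 72.0, 73.0]
predicted = [70.0, 71.0, 72.0, 75.0]

rmsep = rmse(measured, predicted)   # sqrt(4 / 4) = 1.0
r = pearson_r(measured, predicted)
```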
12.
13.
14.
15.
16.
Huan ZHAO, Ke-Wei HUAN, Xiao-Guang SHI, Feng ZHENG, Li-Ying LIU, Wei LIU, Chun-Ying ZHAO. 《分析化学》 (Chinese Journal of Analytical Chemistry), 2018, 46(1): 136-142
Near-infrared (NIR) spectroscopy is widely used in the quantitative and qualitative analysis of food. With the development of chemometrics, variable selection has become a critical step in spectral modeling. In this study, a novel variable selection strategy, automatic weighting variable combination population analysis (AWVCPA), is proposed. First, binary matrix sampling (BMS), which gives each variable the same chance of being selected and generates different variable combinations, is used to produce a population of subsets from which a population of sub-models is built. Then two information vectors (IVs), the variable frequency (Fre) and partial least squares regression (Reg), are weighted to obtain the contribution of each spectral variable, so that the influence of both Fre and Reg is taken into account for each variable. Finally, an exponentially decreasing function (EDF) is used to remove the wavelengths with low contribution and thereby select the characteristic variables. For near-infrared spectra of beer and corn, prediction models of yeast and oil concentration based on partial least squares (PLS) are established. Compared with other variable selection methods, AWVCPA is the best variable selection strategy under the same conditions. On the beer dataset, AWVCPA-PLS improves on PLS by 72.7%, with the root mean square error of prediction (RMSEP) decreasing from 0.5348 to 0.1457; on the corn dataset the improvement is 64.7%, with RMSEP decreasing from 0.0702 to 0.0248.
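Two of the building blocks named in this abstract, BMS and EDF, can be sketched briefly. The abstract gives no formulas, so everything below is an assumption based on how these steps are usually described for variable combination population analysis: BMS fills every column of a binary matrix with the same number of ones and shuffles each column, and EDF retains a fraction exp(-θ·i) of the variables at run i, with θ = ln(P / L) / N for P initial variables, L variables left at the end, and N runs. Function names and parameter values are hypothetical.

```python
import math
import random

def binary_matrix_sampling(n_models, n_vars, ratio=0.5, seed=0):
    """Return n_models binary rows; a 1 means the variable is in that subset.
    Every column holds the same number of ones, so each variable is
    sampled equally often across the population of sub-models."""
    rng = random.Random(seed)
    ones = round(n_models * ratio)
    columns = []
    for _ in range(n_vars):
        col = [1] * ones + [0] * (n_models - ones)
        rng.shuffle(col)  # randomize which sub-models get this variable
        columns.append(col)
    return [list(row) for row in zip(*columns)]  # transpose to rows

def edf_retained(n_vars, n_left, n_runs):
    """Number of variables kept after each EDF run (assumed form)."""
    theta = math.log(n_vars / n_left) / n_runs
    return [round(n_vars * math.exp(-theta * i)) for i in range(1, n_runs + 1)]

subsets = binary_matrix_sampling(n_models=10, n_vars=6)
schedule = edf_retained(n_vars=700, n_left=14, n_runs=50)
```

The schedule shrinks quickly at first and slowly near the end, so clearly uninformative wavelengths are discarded early while the final candidates are compared more carefully.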
17.
18.
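Immune-genetic algorithm for resolving overlapping nuclear magnetic resonance signals of mixtures (total citations: 5; self-citations: 0; citations by others: 5)
By simulating processes such as the recognition and elimination of foreign antigens by antibodies in the immune system, a new immune algorithm model was established. Standard-sample signals are taken as the antibodies and the overlapping signal of the mixture is input to the model as the antigen. Through iterative computation, the information represented by the antibodies is eliminated from the antigen; once the antigen has been completely eliminated by the antibodies, the overlapping mixture signal has been resolved. Results for the NMR spectra of multi-component amino acid mixtures show that the algorithm can be conveniently used to resolve multi-component overlapping signals, and that it opens a new route to resolving the complex NMR spectra of mixtures or biomacromolecules with the aid of databases.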
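The idea of iteratively eliminating antibody information from the antigen can be illustrated with a simple stand-in. The sketch below is not the authors' immune algorithm, whose update rule the abstract does not give. It uses cyclic least-squares projection: each standard signal in turn removes its best-fitting share from the residual until nothing explainable remains. The Gaussian peaks and all names are hypothetical.

```python
import math

def resolve_mixture(mixture, standards, sweeps=200):
    """Estimate how much of each standard signal the mixture contains.

    mixture: the overlapping signal (the "antigen").
    standards: list of pure-component signals (the "antibodies").
    Returns (coefficients, residual signal).
    """
    coeffs = [0.0] * len(standards)
    residual = list(mixture)
    for _ in range(sweeps):
        for j, std in enumerate(standards):
            # Least-squares amount of this standard still present in the residual
            step = (sum(r * s for r, s in zip(residual, std))
                    / sum(s * s for s in std))
            coeffs[j] += step
            # Eliminate that contribution from the residual
            residual = [r - step * s for r, s in zip(residual, std)]
    return coeffs, residual

def gaussian_peak(center, width, n_points=100):
    """Hypothetical pure-component signal: a single Gaussian peak."""
    return [math.exp(-((i - center) / width) ** 2) for i in range(n_points)]

peak_a = gaussian_peak(40, 8)
peak_b = gaussian_peak(50, 8)   # overlaps strongly with peak_a
mixture = [2.0 * a + 0.5 * b for a, b in zip(peak_a, peak_b)]

coeffs, residual = resolve_mixture(mixture, [peak_a, peak_b])
```

For this noise-free example the coefficients converge to the mixing amounts 2.0 and 0.5, and the residual shrinks toward zero, which corresponds to the antigen being completely eliminated.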
19.
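A "rational" QSAR model is one in which the quantitative structure-activity relationship is built with knowledge of the interaction mode between ligand and receptor, which avoids many drawbacks of the traditional practice of building predictive models solely from information about the sample-set molecules themselves. This paper applies the idea to the study of peptide/protein affinity. Using a genetic algorithm to screen virtual receptor binding sites and interaction modes, a new modeling technique is obtained: genetic virtual screening of combinative mode for peptide/protein (GVSC). The method successfully resolves a difficulty in "rational" QSAR research, namely that in most cases the receptor structure is unknown, so the way the ligand binds to it is hard to determine. GVSC was tested on 58 angiotensin-converting enzyme samples and on 18 double-site mutated residues of the camel antibody protein cAb-lys3. The results show that GVSC can elucidate the mechanism of interaction between ligand and receptor well and yields QSAR models that are better than those from traditional methods.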
20.
The selection abilities of two well-known variable selection techniques, synergy interval partial least squares (SiPLS) and genetic algorithm partial least squares (GA-PLS), have been examined and compared. Using different simulated and real (corn and metabolite) datasets, and keeping in view the spectral overlap of the components, the influence of selecting either intervals of variables or individual variables on prediction performance was examined. In the simulated datasets, GA-PLS gave better results as the overlap of the component spectra decreased and in cases with narrow-band components. In contrast, SiPLS performed better for data with intermediate overlap. For mixtures of highly overlapping analytes, GA-PLS showed slightly better performance. However, significant differences between the results of the two selection methods were not observed in most cases. Although SiPLS gave slightly better prediction performance on the corn dataset, except for the prediction of moisture content, its improvement over GA-PLS was not significant. For real data with less overlapped components (the metabolite dataset), GA-PLS, which tends to select far fewer variables, did not give significantly better root mean square error of cross-validation (RMSECV), cross-validated R² (Q²), or root mean square error of prediction (RMSEP) than SiPLS. Irrespective of the type of dataset, GA-PLS resulted in models with fewer latent variables (LVs). In terms of computational time, GA-PLS is considered superior to SiPLS. Copyright © 2010 John Wiley & Sons, Ltd.