Similar literature
1.
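To improve the accuracy and stability of near-infrared quantitative analysis models for the active ingredient in chlorpyrifos emulsifiable concentrate, synergy interval partial least squares (siPLS) was combined with a genetic algorithm (GA) to select characteristic variables, and cross-validation was used to determine the optimal number of principal component factors and the number of selected variables. The best-performing model was obtained with 81 variables selected from the full spectral region and 11 principal component factors; its prediction-set coefficient of determination R_p^2 was 0.972 and its root mean square error of prediction (RMSEP) was 0.353%. The study shows that selecting characteristic variables with siPLS combined with GA largely eliminates redundant and irrelevant information among the spectral variables of the pesticide concentrate, reduces model complexity, and improves the accuracy and stability of the prediction model for the active ingredient.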

2.
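Combining simulated annealing with genetic algorithms for variable selection   Total citations: 4, self-citations: 0, citations by others: 4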
章元  朱尔一  李静  庄峙厦 《分析化学》1999,27(10):1131-1135
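Introducing the Metropolis acceptance criterion into the conventional genetic algorithm, combined with ordered Gram-Schmidt orthogonalization, yields models with stronger predictive ability, that is, models with lower PRESS values. The method was applied to the relationship between trace elements and heat-treatment conditions in steel and the steel's mechanical properties, and was compared with the conventional genetic algorithm, with satisfactory results.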
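The Metropolis acceptance criterion mentioned in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name, the use of PRESS as the quantity being minimized, and the `temperature` parameter are assumptions made for the example.

```python
import math
import random


def metropolis_accept(press_current, press_candidate, temperature, rng=random):
    """Decide whether a candidate variable subset replaces the current one.

    All names here are hypothetical; `rng` is any object with a random() method.
    """
    # A candidate whose PRESS is no worse is always accepted.
    if press_candidate <= press_current:
        return True
    # A worse candidate is accepted with probability exp(-delta / T),
    # which lets the search escape local minima while T is still high.
    delta = press_candidate - press_current
    return rng.random() < math.exp(-delta / temperature)
```

In a hybrid scheme of this kind, the temperature would typically be lowered over generations so that worse candidates are accepted less and less often.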

3.
Widely used regression approaches in modeling quantitative structure-property relationships, such as PLS regression, are highly susceptible to outlying observations that impair the prognostic value of a model. Our aim is to compile homogeneous datasets as the basis for regression modeling by removing outlying compounds and applying variable selection. We investigate different approaches to creating robust, outlier-resistant regression models for predicting the permeability of drug molecules. The objective is to combine the strengths of outlier detection and variable elimination to increase the predictive power of prognostic regression models. In conclusion, outlier detection is employed to identify multiple homogeneous data subsets for regression modeling.

4.
A new heuristic and parallel simulated annealing algorithm is proposed for variable selection in near-infrared spectroscopy analysis. The algorithm employs a parallel mechanism to enhance search efficiency, a heuristic mechanism to generate high-quality candidate solutions, and the Metropolis criterion to estimate the accuracy of the candidate solutions. Several near-infrared datasets were evaluated with the new algorithm, and partial least squares gave improved analytical figures of merit after wavelength selection. The new algorithm produced more robust and more predictive regression models. The method could also be helpful in other chemometric activities such as classification or quantitative structure-activity relationship problems.

5.
In this study, different methods of variable selection for multilinear stepwise regression (MLR) and support vector regression (SVR) were compared, using genetic algorithms (GAs) with different types of chromosomes. The first method is a GA with a binary chromosome (GA-BC) and the other is a GA with a fixed-length character chromosome (GA-FCC). The overall prediction accuracy for the training set was tested by 7-fold cross-validation. All regression models were evaluated on the test set. The poor prediction for the test set shows that the forward stepwise regression (FSR) model overfits the training set more easily. The results using SVR methods showed that the overfitting could be overcome. Further, overfitting occurred more easily with the GA-BC-SVR method because too many variables were quickly introduced into the model. The final optimal model, with good predictive ability (R^2 = 0.885, S = 0.469, R_cv^2 = 0.700, S_cv = 0.757, R_ex^2 = 0.692, S_ex = 0.675), was obtained using the GA-FCC-SVR method. Our investigation indicates that variable selection using GA-FCC is the most appropriate for MLR and SVR methods.
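The two chromosome types compared in this abstract can be illustrated with a small sketch. The abstract does not define the encodings, so the reading below is an assumption: a binary chromosome carries one bit per candidate variable, while a fixed-length chromosome lists a fixed number of variable indices. `N_DESCRIPTORS` and `SUBSET_SIZE` are hypothetical values.

```python
import random

N_DESCRIPTORS = 10  # hypothetical size of the candidate variable pool
SUBSET_SIZE = 3     # hypothetical fixed chromosome length


def random_binary_chromosome(rng):
    # Binary encoding: one bit per candidate variable, so the number of
    # selected variables is not bounded in advance.
    return [rng.randint(0, 1) for _ in range(N_DESCRIPTORS)]


def random_fixed_length_chromosome(rng):
    # Fixed-length encoding (as assumed here): each gene names one distinct
    # variable, so the model size is capped at SUBSET_SIZE.
    return sorted(rng.sample(range(N_DESCRIPTORS), SUBSET_SIZE))


def selected_indices(binary_chromosome):
    # Convert a bit mask into the list of selected variable indices.
    return [i for i, bit in enumerate(binary_chromosome) if bit == 1]
```

Under this reading, the binary encoding can switch on many variables at once, which matches the abstract's remark that too many variables entering the model makes overfitting more likely.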

6.
7.
The topic of this paper is regression models based on designed experiments, where additional spectroscopic measurements are also available. This particular case describes a situation with two spectral blocks with no natural order: the blocks are parallel. Three methods are described, which combine least squares regression of the design variables with PCA or PLS on the spectra. The methods' properties are explored in two simulation studies based on real experiments. The results show that the methods are equal when it comes to prediction, but interpretability varies. One of the methods, LS-ParPLS, is especially interesting when it comes to interpretability because it splits the spectral information into two parts: information that is common to both blocks and information that is unique to each block. Copyright © 2008 John Wiley & Sons, Ltd.

8.
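The grey wolf optimizer (GWO), a swarm-intelligence algorithm, has few parameters, a simple structure, and is easy to implement, but it has seen little use in spectroscopy. This study introduces GWO into variable selection for near-infrared spectra. Using a corn dataset as an example, it examines wolf-pack performance, the number of iterations, the pack size, and computational efficiency in GWO, and builds partial least squares (PLS) models to determine the protein, fat, moisture, and starch contents of corn samples. The results show that GWO is computationally very efficient. After parameter tuning, the PLS models retained 19, 19, 14, and 34 variables for protein, fat, moisture, and starch, respectively, and the root mean square error of prediction (RMSEP) fell from 0.2458, 0.1224, 0.3398, and 1.1058 with full-wavelength PLS to 0.1477, 0.0801, 0.1762, and 0.7398, reductions of 40%, 35%, 48%, and 33%, with corresponding increases in the correlation coefficients. GWO therefore optimizes quickly, selects few variables, and significantly improves the prediction accuracy of PLS models, making it an effective variable selection method for near-infrared spectroscopy.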
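The percentage reductions quoted in this abstract follow directly from the reported RMSEP values. The short check below uses only the figures given in the abstract; the dictionary and function names are chosen for the example.

```python
# RMSEP values reported in the abstract for full-wavelength PLS and GWO-PLS.
full_pls = {"protein": 0.2458, "fat": 0.1224, "moisture": 0.3398, "starch": 1.1058}
gwo_pls = {"protein": 0.1477, "fat": 0.0801, "moisture": 0.1762, "starch": 0.7398}


def relative_reduction(before, after):
    # Relative decrease in percent: (before - after) / before * 100.
    return (before - after) / before * 100.0


reductions = {
    name: relative_reduction(full_pls[name], gwo_pls[name]) for name in full_pls
}

for name, value in reductions.items():
    print(f"{name}: {value:.1f}%")
```

Rounded to whole percentages, the four results agree with the 40%, 35%, 48%, and 33% stated in the abstract.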

9.
束志恒  方士  陈德钊  陈亚秋 《分析化学》2003,31(10):1169-1172
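Bayesian regularization is used in training to obtain neural networks that generalize well, and a heuristic genetic algorithm is proposed. The regularized network is pruned by sensitivity analysis, so that the smallest optimal subset of attribute features representing the classification characteristics can be selected from high-dimensional patterns. Applied to attribute selection and pattern classification of high-dimensional spearmint patterns, the method performed well and was clearly better than other methods.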

10.
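Spectral sample data are often affected by environmental noise and interference from other components, so wavelength selection is needed to improve analytical accuracy. The near-infrared region is wide and the search space is too large for a genetic algorithm to be applied directly to wavelength selection. This study therefore proposes first using moving window partial least squares (MWPLS) to pre-select informative intervals from the wide spectral region, and then using an improved iterative genetic algorithm (IGA) to select the optimal informative sub-intervals from them. MWPLS scans the full spectral region with a moving window and locates informative intervals well, while IGA takes the continuity and correlation of spectral data into account: it runs several rounds of GA, and the smoothed selection result of each round serves as prior knowledge for initializing the population of the next round. The contiguous wavelength points selected in this way are used as independent variables for PLS modeling, which simplifies the model considerably while retaining some data redundancy, giving a robust model with high analytical accuracy. Applied to near-infrared analysis of wheat moisture, the method performed well, with prediction performance clearly better than that of other methods.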
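The moving-window scan described in this abstract can be sketched as below. This shows only the window bookkeeping: `window_error` is a hypothetical callback standing in for fitting a PLS model on the window and returning its error, and the threshold rule for picking informative intervals is an assumption made for the example.

```python
def scan_windows(n_points, width, window_error):
    """Slide a fixed-width window across the spectrum, one point at a time.

    `window_error(start, stop)` is a hypothetical callback that would fit a
    model on wavelength points [start, stop) and return its error.
    """
    results = []
    for start in range(n_points - width + 1):
        stop = start + width
        results.append((start, stop, window_error(start, stop)))
    return results


def informative_windows(results, threshold):
    # Keep the windows whose error does not exceed the chosen threshold.
    return [(start, stop) for start, stop, err in results if err <= threshold]
```

A later selection stage, such as the iterative GA in the abstract, would then search only inside the retained windows rather than across the full spectrum.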

11.
Abstract

Quantitative structure-activity relationship (QSAR) studies based on chemometric techniques are reviewed. Partial least squares (PLS) is introduced as a novel robust method to replace classical methods such as multiple linear regression (MLR). Advantages of PLS compared to MLR are illustrated with typical applications. The genetic algorithm (GA) is a novel optimization technique that can be used as a search engine in variable selection. A novel hybrid approach combining GA and PLS for variable selection (GAPLS), developed in our group, is described. A more advanced method for comparative molecular field analysis (CoMFA) modeling, called GA-based region selection (GARGS), is described as well. Applications of GAPLS and GARGS to QSAR and 3D-QSAR problems are shown with some representative examples. GA can be hybridized with nonlinear modeling methods such as artificial neural networks (ANN) to provide useful tools in chemometrics and QSAR.

12.
In this work, the unit cell parameter (a) of a series of cubic ABX3 perovskites was modeled using counter-propagation artificial neural networks, and the influence of different input variables was examined using an algorithm for automatic adjustment of the relative importance of the variables. The input variables used in this model were the ionic radii of A, B, and X as well as the oxidation state (z) and the electronegativity (χ) of the anion. The developed models generalize well, with good agreement between experimental and predicted values of the lattice parameter. One of the important outcomes of this work comes from the automatic adjustment of the relative importance of the input variables: this analysis showed that the effective ionic radius of the B-cation has the most pronounced influence on successful prediction of the unit cell parameter for the analyzed data set of cubic ABX3 perovskites. In addition, it may be concluded that the separation of the compounds into different regions of the counter-propagation artificial neural networks was predominantly influenced by the input variables describing the physical parameters of the anion. Copyright © 2012 John Wiley & Sons, Ltd.

13.
Variable selection using a genetic algorithm is combined with partial least squares (PLS) for the prediction of additive concentrations in polymer films using Fourier transform-infrared (FT-IR) spectral data. An approach using an iterative application of the genetic algorithm is proposed. This approach allows all variables to be considered and at the same time minimizes the risk of overfitting. We demonstrate that the variables selected by the genetic algorithm are consistent with expert knowledge. This very exciting result is a convincing demonstration that the algorithm can select correct variables in an automated fashion.

14.
Abstract

This article presents a self-organising multilayered iterative algorithm that provides linear and non-linear polynomial regression models, allowing the user to control the number and the power of the terms in the models. The accuracy of the algorithm is compared to that of the partial least squares (PLS) algorithm using fourteen data sets in quantitative structure-activity relationship studies. The calculated data show that the proposed method is able to select simple models characterized by a high prediction ability and is thus of considerable interest in quantitative structure-activity relationship studies. The software is developed using a client-server protocol (Java and C++ languages) and is available for world-wide users on the Web site of the authors.

15.
An algorithm is proposed for extracting relevant information from near-infrared (NIR) spectra for multivariate calibration of routine components in complex plant samples. The algorithm is a combination of wavelet transform (WT) data compression and a procedure for uninformative variable elimination (UVE). After compression of the NIR spectra by WT, the UVE approach is used to eliminate the irrelevant wavelet coefficients. Finally, a calibration model is built from the retained wavelet coefficients to enable prediction. Because irrelevant information can be removed from the spectra used for multivariate calibration, the model based on the extracted relevant features is better than those obtained with full-spectrum data. Both prediction precision and calculation speed are improved.

16.
陶焕明  高美凤 《分析测试学报》2021,40(10):1482-1488
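Building on the immune genetic algorithm (IGA), this paper proposes an improved immune genetic algorithm (iIGA) for selecting wavelength variables in near-infrared spectra. The algorithm drops the fixed antibody-similarity threshold of the original algorithm in favor of an adaptive threshold, and introduces an elitist retention strategy and a greedy-algorithm idea so that the algorithm performs local search in the right direction. The algorithm was tested on corn starch and protein content datasets with partial least squares (PLS) models and compared with IGA, the genetic algorithm (GA), and the full-spectrum method. For corn starch content, the root mean square error of prediction (RMSEP) fell from 0.3120 with the original IGA to 0.2980 with iIGA, a 4.5% improvement in prediction-set accuracy; for corn protein content, RMSEP fell from 0.1244 to 0.1103, an 11.3% improvement. Significance tests on the RMSEP values of the starch and protein models gave F values of 165.22 and 182.05 and P values of 9.5 × 10^-23 and 4.5 × 10^-24, respectively; both P values are below 0.05, so iIGA significantly improves model prediction accuracy.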

17.
18.
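Using ordinary corn kernels as the test material, characteristic wavelengths were selected from near-infrared spectral data by a genetic algorithm combined with partial least squares regression, and a calibration model for determining starch content in corn kernels from those wavelengths was then built by partial least squares regression. The calibration model based on 11 characteristic wavelengths had a calibration error (RMSEC), cross-validation error (RMSECV), and prediction error (RMSEP) of 0.30%, 0.35%, and 0.27%, respectively, and the correlation coefficients between predicted and measured values for the calibration set and the independent validation set reached 0.9279 and 0.9390. Compared with the model built on full-spectrum data, prediction accuracy improved in every respect, showing that spectral feature selection with a genetic algorithm and PLS yields simpler and better models and offers a new approach to near-infrared determination of starch in corn kernels and to processing infrared spectral data.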
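RMSEC, RMSECV, and RMSEP are the same root mean square error evaluated on different data: the calibration set, the cross-validation predictions, and an independent prediction set. The sketch below uses the plain mean-squared form; some chemometrics texts correct the RMSEC denominator for degrees of freedom, and the abstract does not say which form was used, so this is an assumption.

```python
import math


def rmse(measured, predicted):
    """Root mean square error between measured and predicted values."""
    if len(measured) != len(predicted) or not measured:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(m - p) ** 2 for m, p in zip(measured, predicted)]
    return math.sqrt(sum(squared) / len(squared))
```

Calling `rmse` on calibration-set predictions gives RMSEC, on held-out cross-validation predictions gives RMSECV, and on an independent validation set gives RMSEP.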

19.
《Electroanalysis》2005,17(10):915-918
The voltammetric behavior of isoniazid and hydrazine at an overoxidized polypyrrole modified glassy carbon electrode has been investigated. The cyclic voltammograms obtained showed that their oxidation peaks overlap, making it difficult to determine them individually in a mixture without separation. To overcome this limitation, a procedure was proposed for resolving the overlapped voltammetric signals from mixtures of isoniazid and hydrazine. In this procedure, a genetic algorithm was used to select potentials for partial least squares. A feed-forward artificial neural network with the error back-propagation algorithm was used to model the nonlinear relationship between currents and concentrations of hydrazine and isoniazid. The proposed method was suitable for the determination of isoniazid in pharmaceutical tablets and the detection of hydrazine impurities in the same samples.

20.