Similar articles
 20 similar articles found (search time: 390 ms)
1.
朱尔一  林燕  庄赞勇 《分析化学》2007,35(7):973-977
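A new partial least squares (PLS) variable selection method is proposed. It uses information generated during PLS regression modeling to delete variables that are redundant or have little influence on the model, thereby simplifying and optimizing the prediction model. When this method was combined with variable dimension expansion to process ICP-MS data for 244 heroin samples seized from three source regions in Yunnan (Kunming, Simao, and Xishuangbanna), the discrimination accuracy of the model improved substantially over traditional algorithms, exceeding 95%. The resulting model also contains few variables, which makes the influence of each variable on the model easy to analyze and interpret. The method can therefore be used to identify the source of illicit drugs effectively.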

2.
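Genetic algorithm for variable selection   Total citations: 3 (self-citations: 0; citations by others: 3)
Combining the strong search and optimization capability of the genetic algorithm with ordered Gram-Schmidt orthogonalization and the PLS algorithm yields models with strong predictive ability, that is, models with low PRESS (prediction residual sum of squares) values. The method can be applied to structure-activity relationship problems and to the relationship between trace elements in human hair and sex. It was compared with the orthogonal recursive selection method and with stepwise regression forward selection, with good results.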
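As a rough illustration of the PRESS criterion mentioned above, the sketch below computes PRESS by leave-one-out cross-validation for a one-variable least-squares line. The leave-one-out scheme, the single predictor, and the function names `fit_line` and `press_leave_one_out` are assumptions made for illustration; the abstract does not say how PRESS was computed, and the paper's models are PLS models rather than a simple line.

```python
from typing import Sequence, Tuple


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def press_leave_one_out(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Sum of squared errors when each sample is predicted by a model fitted without it."""
    xs, ys = list(xs), list(ys)
    press = 0.0
    for i in range(len(xs)):
        # Refit on all samples except sample i (assumed leave-one-out scheme).
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        residual = ys[i] - (slope * xs[i] + intercept)
        press += residual ** 2
    return press
```

A lower PRESS indicates that left-out samples are predicted more closely, which is the sense in which the abstract equates a low PRESS with strong predictive ability.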

3.
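Multivariate discriminant analysis for cancer diagnosis   Total citations: 9 (self-citations: 2; citations by others: 9)
The contents of 15 elements in hair samples from healthy individuals and cancer patients were determined by inductively coupled plasma atomic emission spectrometry and graphite furnace atomic absorption spectrometry. The data were processed by multivariate polynomial expansion (dimension augmentation), stepwise-regression variable compression, and PLS, giving a two-dimensional discriminant plot in which patients and healthy individuals are very clearly separated. On this basis, hair can be used as an analytical sample in the clinical diagnosis of cancer in place of blood samples.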

4.
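The examination and identification of wood coating fragments is an important task in trace evidence examination. Using infrared spectroscopy and discriminant analysis, this experiment achieved efficient and accurate identification of the brand of a wood coating. Infrared fingerprint spectra were collected for 78 wood coating samples from 18 brands, including 北京梅菲特 and 天津裕北. The spectra were preprocessed with automatic baseline correction, the full sample data for each brand were subjected to dimensionality reduction, and discriminant analysis was performed on both the raw and the dimension-reduced data to build classification models. The discrimination accuracy for the dimension-reduced data reached 100%, compared with 81.8% using Fisher discriminant analysis alone, showing that dimensionality reduction as a preprocessing step can improve the accuracy of wood coating classification to some extent. The results indicate that infrared fingerprint spectra combined with discriminant analysis models can identify samples of different brands of waterborne wood coatings, providing theoretical support for rapid identification of waterborne wood coating types.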

5.
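This paper reports a study on identifying the botanical origin of honey using censored inductively coupled plasma mass spectrometry (ICP-MS) data and a support vector machine (SVM) classification model. A total of 97 honey samples from four botanical origins (vitex honey, acacia honey, sunflower honey, and rape honey) were selected. After pretreatment including microwave digestion, the contents of 16 metal elements in the honey samples were determined by ICP-MS, and 13 metal elements with significant differences were studied. SVM classification models based on the Gaussian radial basis function were built using elements with and without censored data as input variables, and the penalty parameter c and kernel parameter g of the SVM model were optimized by grid search (GS), genetic algorithm (GA), and particle swarm optimization (PSO). The results showed that nine metal elements (Al, Ti, Cr, Ni, As, Se, Cd, Ba, and Pb) had censored data. Analysis of variance showed that, among the four botanical origins, 12 metal elements (Na, Mg, Al, K, Ca, Mn, Ni, Cu, Zn, Se, Ba, and Pb) differed highly significantly at the 95% confidence level, As differed significantly at the 95% confidence level, and Ti, Cr, and Cd showed no significant difference at the 95% confidence level. The SVM model classified better when censored data were replaced by one half of the detection limit (the substitution method) and used as input variables: the discrimination accuracy of the model built with censored data was 91.8%, whereas that of the model built without censored data was only 82.5%. After further optimization of c and g by GS, GA, and PSO, the best classification model was obtained by PSO with c = 62.8 and g = 1.26, giving an overall discrimination accuracy of 96.9%. Treating censored data by substitution with one half of the detection limit is therefore feasible for discriminating the botanical origin of honey, and ICP-MS censored data combined with an optimized SVM model can improve discrimination accuracy and effectively distinguish honeys of different botanical origins.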
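The substitution step described above (replacing censored readings with one half of the detection limit) can be sketched as follows. How censored readings are flagged is not stated in the abstract; this sketch assumes they appear either as `None` or as a value below the detection limit, and the function name `substitute_half_lod` is hypothetical.

```python
from typing import List, Optional, Sequence


def substitute_half_lod(readings: Sequence[Optional[float]], lod: float) -> List[float]:
    """Replace censored readings with half the detection limit (LOD / 2).

    Assumption: a reading is censored if it is None or numerically below `lod`.
    """
    if lod <= 0:
        raise ValueError("detection limit must be positive")
    half = lod / 2.0
    return [half if (r is None or r < lod) else r for r in readings]
```

The substituted values can then be used as classifier inputs in place of discarding the element, which is the comparison the abstract reports (91.8% versus 82.5% discrimination accuracy).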

6.
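Orthogonal recursive selection method and its application   Total citations: 1 (self-citations: 0; citations by others: 1)
This paper proposes a new variable selection method, the orthogonal recursive selection method, which yields models with strong predictive ability, that is, models with low PRESS (prediction residual sum of squares) values. The method was applied to structure-activity relationship problems and compared with stepwise regression forward selection and PLS regression, with satisfactory results.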

7.
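Near-infrared (NIR) spectroscopy combined with variable selection methods was used for qualitative detection of haloxyfop-P-methyl residues in edible oil. NIR transmission spectra of 114 edible oil samples were collected over the range 4000~10000 cm-1. Three variable selection methods, competitive adaptive reweighted sampling (CARS), subwindow permutation analysis (SPA), and Monte Carlo uninformative variable elimination (MC-UVE), were used to select from the full spectrum the important variables related to haloxyfop-P-methyl in edible oil. Partial least squares-linear discriminant analysis (PLS-LDA) was then applied to the selected characteristic wavenumber variables to build discriminant models for the residue, and the results were compared with those of commonly used qualitative discrimination methods. The results show that NIR spectroscopy combined with variable selection is feasible for qualitative detection of haloxyfop-P-methyl residues in edible oil, with high detection accuracy. The CARS-PLS-LDA model performed best: its accuracy, sensitivity, and specificity on the prediction set were all 100.00%, and it used the fewest wavenumber variables, only 0.82% of the full spectrum. CARS outperformed SPA and MC-UVE, but all three methods effectively selected key variables, reduced the number of wavenumber variables used in modeling, simplified the discriminant model, and improved its accuracy and stability.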
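The three figures of merit quoted above (accuracy, sensitivity, and specificity) can be computed from prediction-set labels as in the sketch below. It assumes a binary coding in which 1 marks a sample containing the residue and 0 a sample without it; the coding and the function name `binary_metrics` are illustrative and not taken from the paper.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, sensitivity, and specificity for labels coded 1 (positive) / 0 (negative)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label sequences must be non-empty and of equal length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }
```

All three values equal 1.0 only when every prediction-set sample is classified correctly, which corresponds to the 100.00% result reported for CARS-PLS-LDA.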

8.
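Simultaneous determination of the component contents of 去痛片 (a compound analgesic tablet) by artificial neural network ultraviolet spectrophotometry   Total citations: 8 (self-citations: 0; citations by others: 8)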
陈振宁 《分析化学》2001,29(11):1322-1324
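A BP (back-propagation) artificial neural network was used to process the ultraviolet absorption spectral data of the compound preparation 去痛片, allowing the contents of its components to be determined simultaneously. Optimizing the network structure and parameters improved the prediction accuracy.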

9.
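Near-infrared diffuse reflectance spectroscopy combined with chemometric methods was used to assess the feasibility of identifying whether the cadmium content of rice exceeds the permitted limit. A total of 120 samples were collected and their cadmium contents determined (49 compliant, 71 non-compliant). Spectral preprocessing was optimized, and the data after smoothing, first derivative, and autoscaling were chosen as input variables. Competitive adaptive reweighted sampling selected 45 key variables, and the spectral absorption bands of these variables were assigned. Five pattern recognition methods were compared: principal component analysis-discriminant analysis, partial least squares discriminant analysis, linear discriminant analysis, K-nearest neighbors, and soft independent modeling of class analogy. Partial least squares discriminant analysis gave the best model, with identification accuracies of 98.8% and 91.7% for the training and prediction sets, respectively. The results show that NIR spectroscopy can be used as a preliminary screening method to identify whether the cadmium content of rice exceeds the limit.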

10.
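Chemical oxygen demand (COD) is an important indicator of organic pollution in water; the higher the COD, the more severe the pollution. Traditional COD measurement methods are time-consuming and ill-suited to rapid, real-time acquisition of COD information in water. To address this, the paper proposes improving COD estimation models using transmission spectral measurement combined with principal component analysis (PCA). Specifically, spectral information was collected for 100 COD water samples; the spectra were preprocessed by three different hyperspectral data preprocessing methods; Gaussian process regression (GPR) and BP neural network models were built for each preprocessing method; and the effect of preprocessing on model accuracy was analyzed. Each model was then improved by combining it with PCA dimensionality reduction, and the best model for detecting COD content was chosen by comparing model accuracies. The results show that, compared with GPR and BP neural network models built on the raw spectra, models built after preprocessing were markedly more accurate, and PCA dimensionality reduction of the preprocessed data improved accuracy further. Among them, the PCA-improved BP neural network model based on standard normal variate (SNV) transformed features reached an R^2 of 0.9940 with a root mean square error (RMSE) of 0.022540. This shows that PCA dimensionality reduction of preprocessed spectral data helps remove redundant information from the spectra and extract characteristic information, and that the hyperspectral approach can optimize COD estimation models, offering a new way to address the problems of traditional COD measurement methods.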
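The standard normal variate (SNV) transformation named above is commonly defined per spectrum: subtract the spectrum's own mean and divide by its own standard deviation. The sketch below follows that common definition using the sample standard deviation (n - 1 denominator); whether the paper used this exact variant is an assumption, and the function name `snv` is illustrative. The PCA and neural network steps are not shown.

```python
import statistics
from typing import List, Sequence


def snv(spectrum: Sequence[float]) -> List[float]:
    """Standard normal variate: center one spectrum by its mean, scale by its std."""
    if len(spectrum) < 2:
        raise ValueError("a spectrum needs at least two points")
    mean = statistics.fmean(spectrum)
    # Sample standard deviation (n - 1 denominator); an assumed convention.
    sd = statistics.stdev(spectrum)
    if sd == 0:
        raise ValueError("a constant spectrum cannot be SNV-scaled")
    return [(v - mean) / sd for v in spectrum]
```

Because each spectrum is scaled independently, SNV needs no calibration set, which is one reason it is often applied before a dimensionality reduction step such as PCA.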

11.
12.
Multielement analysis of human hair was carried out by instrumental neutron activation analysis to determine the concentrations of various trace elements in the hair of the local population of the Tokyo metropolitan area. A total of 202 hair samples were collected from inhabitants classified by sex and into five age groups. Using several combinations of irradiation time, cooling time, and counting time, forty elements were quantitatively analyzed. A method of analysis for data that include samples below the detection limit is discussed, assuming that the frequency distribution of trace element contents in hair is log-normal.

13.
C Scherer  U Wachter  S A Wudy 《The Analyst》1998,123(12):2661-2663
A method for the determination of testosterone in human hair by gas chromatography-mass spectrometry using d3-testosterone as internal standard is described. Our method consisted of alkaline digestion, fast liquid-liquid extraction, LH-20 chromatography and derivatization with heptafluorobutyric anhydride. Quantification was achieved by selected ion monitoring of m/z 680 (testosterone) and m/z 683 (d3-testosterone). Our method needed no complex corrections for isotope contributions. The procedure provided a sensitive and specific technique with good accuracy and precision. For the first time, testosterone has been quantified by gas chromatography-mass spectrometry in human hair. The concentrations (median, range, ng g-1 hair) reflected a significant (p = 0.05; t-test) sex difference with 2.7 ng g-1 (2.5-4.2) in male and 1.7 ng g-1 (1.0-3.4) in female hair.

14.
Multivariate calibration problems often involve identifying a meaningful subset of variables from a vast number of candidates in order to predict the output variables better. A new graph theoretic method based on partial correlations (variable interaction network, VIN) is proposed. Many well-studied representative calibration datasets spanning different application domains are selected for investigating the performance. Partial least squares (PLS) regression models combined with variable selection techniques are employed for benchmarking the performance. Subsets containing different numbers of variables are retained for the final analysis after VIN selection, and progressive prediction accuracies are used for comparison. VIN-PLS results show significant improvement in prediction efficiencies and variable subset optimization. Improvement of up to 45% over existing methods with significantly fewer variables is achieved using the new method. Advantages of VIN-based variable selection are highlighted.

15.
Glycerol monolaurate (GML) products contain many impurities, such as lauric acid and glycerol. The GML content is an important quality indicator for GML production. A hybrid variable selection algorithm, which combines wavelet transform (WT) technology with a modified uninformative variable elimination (MUVE) method, was proposed to extract useful information from Fourier transform infrared (FT-IR) transmission spectroscopy for the determination of GML content. FT-IR spectral data were first compressed by WT; the irrelevant variables in the compressed wavelet coefficients were then eliminated by MUVE. In the MUVE process, a simulated annealing (SA) algorithm was employed to search for the optimal cutoff threshold. After the WT-MUVE process, the variables for the calibration model were reduced from 7366 to 163. Finally, the retained variables were used as inputs of a partial least squares (PLS) model to build the calibration model. For the prediction set, a correlation coefficient (r) of 0.9910 and a root mean square error of prediction (RMSEP) of 4.8617 were obtained. The prediction result was better than that of the PLS model with full-spectrum data. The results indicate that the proposed WT-MUVE method not only makes the prediction more accurate but also makes the calibration model more parsimonious. Furthermore, the reconstructed spectra represent the projection of the selected wavelet coefficients into the original domain, affording a chemical interpretation of the predicted results. It is concluded that the FT-IR transmission spectroscopy technique with the proposed method is promising for the fast detection of GML content.
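The two prediction-set statistics quoted above, the correlation coefficient r and RMSEP, can be computed as in this sketch. It uses the usual definitions (Pearson correlation between reference and predicted values, and the root of the mean squared prediction error); the function names are illustrative, and the abstract does not state which exact formulas the authors used.

```python
import math
from typing import Sequence


def rmsep(y_ref: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean square error of prediction over a prediction set."""
    if len(y_ref) != len(y_pred) or not y_ref:
        raise ValueError("inputs must be non-empty and of equal length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y_ref, y_pred)) / len(y_ref))


def pearson_r(y_ref: Sequence[float], y_pred: Sequence[float]) -> float:
    """Pearson correlation coefficient between reference and predicted values."""
    if len(y_ref) != len(y_pred) or len(y_ref) < 2:
        raise ValueError("inputs must have equal length of at least two")
    mean_a = sum(y_ref) / len(y_ref)
    mean_b = sum(y_pred) / len(y_pred)
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(y_ref, y_pred))
    var_a = sum((a - mean_a) ** 2 for a in y_ref)
    var_b = sum((b - mean_b) ** 2 for b in y_pred)
    if var_a == 0 or var_b == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_a * var_b)
```

RMSEP carries the units of the measured quantity, so it is only comparable between models evaluated on the same prediction set.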

16.
Both present-day and historical head-hair samples up to 300 years old are being analysed by neutron activation for more than 30 trace elements. This study, designed to determine an historical base-line for the human intake of trace metals and to provide an evaluation of the present-day rate of increase and sources of environmental pollution, has direct forensic applicability. Modern samples being analysed in this study include hair from U.S. Naval Academy midshipmen and U.S. Air Force Academy cadets obtained upon arrival at the Academies in mid-1971 and again at later intervals during which trace-metal equilibration due to fixed diets and environmental conditions is presumed to occur. A wide variety of factors such as age, sex, hair structure and color, geographic location, general diet, and socioeconomic status are being considered in evaluating the analysis data. Examples of some of the initial data obtained from the analysis of the first three sets of Naval Academy midshipmen hair are presented.

17.
It is imperfect to evaluate a subsampling variable selection method using only its prediction performance. To further assess the reliability of subsampling variable selection methods, dummy noise variables of different amplitudes were augmented to the original spectral data, and the false variable selection number was recorded. The reliabilities of three subsampling variable selection methods including Monte Carlo uninformative variable elimination (MC‐UVE), competitive adaptive reweighted sampling (CARS), and stability CARS (SCARS) were evaluated using this dummy noise strategy. The evaluation results indicated that both CARS and SCARS produced more parsimonious variable sets, but the reliabilities of their final variable sets were weaker than those of MC‐UVE. On the contrary, only marginal improvement on the prediction performance was obtained using MC‐UVE. Further experiments showed that removing white noise‐like variables beforehand would improve the reliability of variables extracted by CARS and SCARS. Copyright © 2014 John Wiley & Sons, Ltd.
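The dummy noise strategy described above can be sketched in two steps: append artificial noise columns to the spectral matrix, then count how many of the variables a selector returns fall among those columns. The sketch assumes zero-mean Gaussian noise whose standard deviation is the "amplitude"; the noise distribution, the seeding, and both function names are assumptions for illustration, and the variable selection method itself (MC-UVE, CARS, or SCARS) is not implemented here.

```python
import random
from typing import List, Sequence


def add_dummy_noise_variables(
    spectra: Sequence[Sequence[float]], n_dummy: int, amplitude: float, seed: int = 0
) -> List[List[float]]:
    """Append n_dummy Gaussian noise columns (std = amplitude) to every spectrum."""
    rng = random.Random(seed)  # fixed seed so the augmentation is reproducible
    return [
        list(row) + [rng.gauss(0.0, amplitude) for _ in range(n_dummy)]
        for row in spectra
    ]


def count_false_selections(selected: Sequence[int], n_original: int) -> int:
    """Number of selected column indices that point at appended dummy variables."""
    return sum(1 for j in selected if j >= n_original)
```

A selector that never picks a dummy column gives a count of zero; larger counts suggest that the selected variable set is less reliable, which is the quantity the abstract calls the false variable selection number.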

18.
Hair samples were collected from 20 metallurgical workers (10 males and 10 females) and from 59 control subjects (32 males and 27 females), whose jobs do not indicate a specific occupational exposure. The concentrations of ten minor and trace elements (Al, Co, Cu, Fe, Mg, Mn, Sb, Se, V and Zn) were determined by instrumental neutron activation analysis (INAA). The statistical data distributions, the influences of sex and age on these elemental concentrations, and the average values obtained for the control group were compared with published data. The effect of occupational exposure to the metallic elements was reflected in the elemental composition of hair by significantly higher concentrations of Al, Co, Cu, Fe, Mg, Mn, Sb, V and Zn in the hair of the exposed group than in that of the control group.

19.
《Analytical letters》2012,45(4):671-681
A method was employed to rapidly determine the enantiomeric excess (ee) value of chiral tert-butoxycarbonyl (Boc-protected) amino acids by using infrared spectroscopy combined with wavelet packet transform (WPT) and least squares support vector machines (LS-SVM). Infrared spectral data were decomposed by using the WPT algorithm. A simulated annealing (SA) algorithm was then used to search for the optimal decomposed frequency band that had the greatest contribution to the quantitative analysis of ee values. As a result, the band (7, 34) with 34 variables and the band (5, 1) with 116 variables were determined as the optimal ones for the determination of Boc-protected proline and alanine, respectively. The selected variables in the optimal band were used as the inputs of LS-SVM models. The spectral variables selected by the WPT-SA method had lower prediction errors than the full-range spectra and the spectral variables selected by some traditional variable selection methods. Reasonably good results, with root mean-square errors of prediction (RMSEP) of 7.51 and 3.80, were obtained for the determination of ee values of the two Boc-protected amino acids, showing that it is possible to determine ee values of amino acids rapidly by using IR spectroscopy.

20.
A new procedure with high ability to enhance prediction of multivariate calibration models with a small number of interpretable variables is presented. The core of this methodology is to sort the variables from an informative vector, followed by a systematic investigation of PLS regression models with the aim of finding the most relevant set of variables by comparing the cross‐validation parameters of the models obtained. In this work, seven main informative vectors i.e. regression vector, correlation vector, residual vector, variable influence on projection (VIP), net analyte signal (NAS), covariance procedures vector (CovProc), signal‐to‐noise ratios vector (StN) and their combinations were automated and tested with the main purpose of feature selection. Six data sets from different sources were employed to validate this methodology. They originated from: near‐Infrared (NIR) spectroscopy, Raman spectroscopy, gas chromatography (GC), fluorescence spectroscopy, quantitative structure‐activity relationships (QSAR) and computer simulation. The results indicate that all vectors and their combinations were able to enhance prediction capability with respect to the full data sets. However, regression and NAS informative vectors from partial least squares (PLS) regression, both built using more latent variables than when building the model presented in most of tested data sets, were the best informative vectors for variable selection. In all the applications, the selected variables were quite effective and useful for interpretation. Copyright © 2008 John Wiley & Sons, Ltd.
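The core idea above, sorting variables by an informative vector and then comparing models built on candidate subsets, might be sketched as follows. This sketch ranks variables by the absolute value of the vector and tries nested "top k" subsets; both choices are assumptions, since the abstract does not specify the ranking rule or the subset increments. The `cv_error` callback is a hypothetical stand-in for fitting a PLS model on the given columns and returning its cross-validation error.

```python
from typing import Callable, List, Sequence, Tuple


def rank_by_informative_vector(vector: Sequence[float]) -> List[int]:
    """Variable indices sorted from largest to smallest absolute vector value."""
    return sorted(range(len(vector)), key=lambda j: abs(vector[j]), reverse=True)


def best_top_k_subset(
    vector: Sequence[float], cv_error: Callable[[List[int]], float]
) -> Tuple[List[int], float]:
    """Evaluate nested top-k subsets and return the one with the lowest CV error.

    `cv_error` is a hypothetical callback: it should fit a model using only the
    listed variable indices and return its cross-validation error.
    """
    order = rank_by_informative_vector(vector)
    best_subset: List[int] = []
    best_err = float("inf")
    for k in range(1, len(order) + 1):
        subset = order[:k]
        err = cv_error(subset)
        if err < best_err:  # keep the first subset that reaches the minimum
            best_subset, best_err = subset, err
    return best_subset, best_err
```

With a regression vector as the informative vector, for example, the variables with the largest coefficients in magnitude are tried first, and the search stops growing the retained set once larger subsets no longer lower the cross-validation error.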
