Similar literature
20 similar records found (search time: 31 ms)
1.
This paper introduces a technique to visualise the information content of the kernel matrix and a way to interpret the ingredients of the Support Vector Regression (SVR) model. Recently, the use of Support Vector Machines (SVM) for solving classification (SVC) and regression (SVR) problems has increased substantially in the field of chemistry and chemometrics. This is mainly due to their high generalisation performance and their ability to model non-linear relationships in a unique and global manner. Modeling of non-linear relationships is enabled by applying a kernel function. The kernel function transforms the input data, usually non-linearly related to the associated output property, into a high dimensional feature space where the non-linear relationship can be represented in a linear form. Usually, SVMs are applied as a black box technique. Hence, the model cannot be interpreted in the way that, e.g., a Partial Least Squares (PLS) model can. For example, the PLS scores and loadings make it possible to visualise and understand the driving force behind the optimal PLS machinery. In this study, we have investigated the possibilities to visualise and interpret the SVM model. Here, we have focused exclusively on Support Vector Regression to demonstrate these visualisation and interpretation techniques. Our observations show that we are now able to turn an SVR black box model into a transparent and interpretable regression modeling technique.
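To make the kernel idea in this abstract concrete, here is a minimal sketch of how a kernel matrix can be built. It is not the authors' method: the abstract does not name a kernel, so a Gaussian (RBF) kernel is assumed, and the function names, the `gamma` value and the toy data are all hypothetical.

```python
import math

def rbf_kernel(x, z, gamma=0.5):
    """Gaussian (RBF) kernel exp(-gamma * ||x - z||^2); gamma is an assumed setting."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)

def kernel_matrix(samples, gamma=0.5):
    """Return the n x n matrix K with K[i][j] = k(x_i, x_j)."""
    return [[rbf_kernel(xi, xj, gamma) for xj in samples] for xi in samples]

# Hypothetical toy data: three samples with two variables each.
samples = [[0.0, 0.0], [1.0, 0.0], [3.0, 4.0]]
K = kernel_matrix(samples)
```

Each entry of `K` is a similarity between two samples. With this kernel the matrix is symmetric with ones on the diagonal, and it is a matrix of this kind that a heat map or image plot would display.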

2.
3.
Quality control usually involves monitoring several variables directly related to industrial needs using univariate tests. One powerful alternative is to link multivariate analytical techniques and multivariate chemometrics. In this way, Fourier Transform Infrared spectroscopy and Partial Least Squares regression are used to discuss and review several advantages and drawbacks encountered in using such a combination in industrial facilities. Typical drawbacks are the selection of data pretreatment, errors in reference methods, the selection of calibration and validation sets, and model aging. This review is exemplified with petrochemical applications, although other fields are also considered (mainly when dealing with data pretreatment).

4.
Andrade JM, Garcia MV, Lopez-Mahia P, Prada D. Talanta, 1997, 44(12): 2167-2184
Quality control usually involves monitoring several variables directly related to industrial needs using univariate tests. One powerful alternative is to link multivariate analytical techniques and multivariate chemometrics. In this way, Fourier Transform Infrared spectroscopy and Partial Least Squares regression are used to discuss and review several advantages and drawbacks encountered in using such a combination in industrial facilities. Typical drawbacks are the selection of data pretreatment, errors in reference methods, the selection of calibration and validation sets, and model aging. This review is exemplified with petrochemical applications, although other fields are also considered (mainly when dealing with data pretreatment).

5.
A voltammetric sensor array (or electronic tongue) is developed for the simultaneous quantification of cysteine, glutathione and homocysteine without prior separation. It is based on the integration of three commercial screen-printed electrodes (gold cured at high and low temperature, and carbon modified with carbon nanotubes). Linear sweep voltammograms measured simultaneously by all three sensors are processed by Partial Least Squares (PLS) regression and different variable selection algorithms such as Genetic Algorithm and interval-Partial Least Squares. The method was applied to synthetic mixtures and successfully validated, with correlation coefficients of prediction (Rp²) of 0.9542, 0.9429 and 0.9589 for cysteine, glutathione, and homocysteine respectively.

6.
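Research on the mechanism model and control modeling of fuel cells   Total citations: 1 (self-citations: 0, citations by others: 1)
Based on the structure and operating principle of the direct methanol fuel cell (DMFC), and drawing on electrochemistry, fluid dynamics and thermodynamics, a mathematical model of DMFC performance was established and simulated against DMFC experimental data; the results indicate that the mathematical model is reasonable and effective. Because the complexity of this mathematical model makes it difficult to meet engineering requirements for DMFC control system design, especially real-time control, this paper proposes a modeling algorithm based on the least squares support vector machine (LS-SVM): an LS-SVM with an RBF kernel is used to build a nonlinear model of the DMFC stack offline. Simulation and experimental results show that the method is simple to apply and gives high model accuracy, and they demonstrate the effectiveness and superiority of the algorithm. The results have practical value for the modeling and control of DMFC control systems.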
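As an illustration of the LS-SVM with an RBF kernel mentioned in this abstract, the sketch below fits the standard LS-SVM regression linear system with NumPy. It is a generic textbook formulation rather than the paper's implementation; the kernel parametrisation, the values of `gam` and `sigma`, the function names and the stand-in data are assumptions.

```python
import numpy as np

def rbf(A, B, sigma):
    """RBF kernel matrix between the rows of A and the rows of B (assumed form)."""
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq / (2.0 * sigma ** 2))

def lssvm_fit(X, y, gam=100.0, sigma=1.0):
    """Solve [[0, 1^T], [1, K + I/gam]] [b; alpha] = [0; y] for b and alpha."""
    n = len(y)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = rbf(X, X, sigma) + np.eye(n) / gam
    rhs = np.concatenate(([0.0], y))
    sol = np.linalg.solve(A, rhs)
    return sol[0], sol[1:]

def lssvm_predict(X_train, b, alpha, X_new, sigma=1.0):
    """Prediction f(x) = sum_i alpha_i * k(x, x_i) + b."""
    return rbf(X_new, X_train, sigma) @ alpha + b

# Hypothetical stand-in data (NOT fuel-cell measurements): a smooth nonlinear curve.
X = np.linspace(0.0, 3.0, 30).reshape(-1, 1)
y = np.sin(2.0 * X[:, 0])
b, alpha = lssvm_fit(X, y)
y_hat = lssvm_predict(X, b, alpha, X)
```

Training reduces to one linear solve, which is part of why the approach is attractive for offline model building; `gam` trades off smoothness against fit and would normally be tuned, for example by cross-validation.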

7.
A new ensemble learning algorithm is presented for quantitative analysis of near-infrared spectra. The algorithm, termed Dual Stacked Partial Least Squares (DSPLS), combines two steps of stacked regression with Partial Least Squares (PLS). First, several sub-models were generated from the whole calibration set. The inner-stack step was implemented on sub-intervals of the spectrum. Then the outer-stack step was used to combine these sub-models. Several combination rules of the outer-stack step were analyzed for the proposed DSPLS algorithm. In addition, a novel selective weighting rule was introduced to select a subset of all available sub-models. Experiments on two public near-infrared datasets demonstrate that the proposed DSPLS with the selective weighting rule provided superior prediction performance and outperformed the conventional PLS algorithm. Compared with a single model, the new ensemble model can provide more robust prediction results and can be considered an alternative choice for quantitative analytical applications.

8.
9.
Different classification methods (Partial Least Squares Discriminant Analysis, Extended Canonical Variates Analysis and Linear Discriminant Analysis), in combination with variable selection approaches (Forward Selection and Genetic Algorithms), were compared, evaluating their capabilities in the geographical discrimination of wine samples. Sixty-two samples were analysed by means of dynamic headspace gas chromatography mass spectrometry (HS-GC-MS) and the entire chromatographic profile was considered to build the dataset. Since variable selection techniques pose a risk of overfitting when a large number of variables is used, a method for coupling data dimension reduction and variable selection was proposed. This approach compresses windows of the original data by retaining only significant components of local Principal Component Analysis models. The subsequent variable selection is then performed on these locally derived score variables. The results confirmed that the classification models achieved on the reduced data were better than those obtained on the entire chromatographic profile, with the exception of Extended Canonical Variates Analysis, which gave acceptable models in both cases. Copyright © 2008 John Wiley & Sons, Ltd.
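The window-wise compression described in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say how "significant" components are chosen, so a minimum explained-variance fraction (`min_explained`) is assumed here, and the function name, window width and synthetic data are hypothetical.

```python
import numpy as np

def compress_windows(X, window, min_explained=0.05):
    """Replace each window of adjacent variables by the scores of its leading local PCs."""
    Xc = X - X.mean(axis=0)                      # column-centre before PCA
    score_blocks = []
    for start in range(0, Xc.shape[1], window):
        block = Xc[:, start:start + window]
        U, s, _ = np.linalg.svd(block, full_matrices=False)
        total = np.sum(s ** 2)
        if total == 0.0:                         # a constant window carries no information
            continue
        keep = (s ** 2) / total >= min_explained # assumed significance criterion
        score_blocks.append(U[:, keep] * s[keep])  # PCA scores T = U * S
    return np.hstack(score_blocks)

# Hypothetical data: 10 samples, two windows of 4 variables,
# each window driven by a single latent factor.
rng = np.random.default_rng(0)
t1, t2 = rng.normal(size=(2, 10))
X = np.hstack([np.outer(t1, [1.0, 2.0, 3.0, 4.0]),
               np.outer(t2, [4.0, 3.0, 2.0, 1.0])])
scores = compress_windows(X, window=4)
```

Here eight original variables collapse to two score variables, one per window, and a variable selection step would then operate on the columns of `scores` rather than on the raw profile.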

10.
Mass spectral classifiers of 16 substructures that are present in the basic structures of pesticides have been investigated to assist pesticide residue analysis as well as the screening of pesticide lead compounds. Mass spectral data are first transformed into 396 features, and then Genetic Algorithm-Partial Least Squares (GA-PLS) as a feature selection method and Support Vector Machine (SVM) as a validation method are implemented together to obtain an optimized feature set for each substructure. Finally, a statistical method, the AdaBoost algorithm combined with Classification and Regression Tree (AdaBoost-CART), is trained to predict the presence/absence of the 16 substructures using the optimized mass spectral feature sets. It is demonstrated that the optimized feature sets, instead of the original 396 features, can be used to predict the presence/absence of the 16 pesticide substructures with recognition success rates of mostly 85-100%.

11.
Different calibration methods have been applied for the determination of the Hydroxyl Number in polyester resins, namely Partial Least Squares (PLS), Principal Component Regression (PCR), Ordinary Least Squares with selection of the variables by genetic algorithm (OLS-GEN) and back-propagation Artificial Neural Networks (BP-ANN). The predictive ability of the regression models was estimated by splitting the dataset into training and test sets using Kohonen self-organising maps. The linear methods (OLS-GEN, PLS and PCR) showed comparable results, while artificial neural networks provided the best results in both fitting and prediction.

12.
13.
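Near-infrared spectroscopy (NIRS) was used to build NIR mathematical models for the quantitative analysis of starch and moisture in cassava. The effects of modified partial least squares (MPLS), partial least squares (PLS) and principal component regression (PCR) on the calibration model were examined, and MPLS was identified as the most suitable mathematical method for building the model. The accuracy of the model predictions was also evaluated. The results show a good linear relationship, with significant correlation, between the chemical reference values and the NIR-predicted values of the validation-set samples. For the starch model, the standard error of prediction (SEP) was 0.850, the bias was -0.095 and the correlation coefficient (r) was 0.971. For the moisture model, the SEP was 0.075, the bias was 0.007 and the correlation coefficient (r) was 0.980. The NIRS models for starch and moisture have high prediction accuracy and can be applied to quality analysis in the bulk purchasing of cassava.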
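The three validation statistics quoted in this abstract (SEP, bias and r) can be computed as in the sketch below. The definitions used are the common chemometric ones and are an assumption, since the abstract does not state its formulas; in particular the sign convention for the bias (predicted minus reference) is assumed, and the numbers are hypothetical, not the cassava data.

```python
import math

def validation_stats(reference, predicted):
    """Return (SEP, bias, r) for paired reference and predicted values."""
    n = len(reference)
    errors = [p - ref for p, ref in zip(predicted, reference)]
    bias = sum(errors) / n                                    # mean error (assumed sign)
    sep = math.sqrt(sum((e - bias) ** 2 for e in errors) / (n - 1))  # bias-corrected
    mean_ref = sum(reference) / n
    mean_pred = sum(predicted) / n
    cov = sum((ref - mean_ref) * (p - mean_pred) for ref, p in zip(reference, predicted))
    var_ref = sum((ref - mean_ref) ** 2 for ref in reference)
    var_pred = sum((p - mean_pred) ** 2 for p in predicted)
    r_value = cov / math.sqrt(var_ref * var_pred)             # Pearson correlation
    return sep, bias, r_value

# Hypothetical validation-set values.
reference = [10.0, 12.0, 14.0, 16.0]
predicted = [10.5, 11.5, 14.5, 15.5]
sep, bias, r_value = validation_stats(reference, predicted)
```

Because SEP is computed after subtracting the bias, a model can show a small SEP and still be systematically offset, which is why the two figures are reported together.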

14.
The performance of Partial Least Squares regression (PLS) in predicting the output with multivariate cross- and autocorrelated data is studied. With many correlated predictors of varying importance, PLS does not always predict well, and we propose a modified algorithm, Partitioned Partial Least Squares (PPLS). In PPLS the predictors are partitioned into smaller subgroups and the important subgroups with high prediction power are identified. Finally, regular PLS analysis using only those subgroups is performed. The proposed PPLS algorithm is used in the analysis of data from a real pharmaceutical batch fermentation process for which the process variables follow certain profiles during a specific fermentation period. We observed that PPLS leads to a more accurate prediction of the yield of the fermentation process and an easier interpretation, since fewer predictors are used in the final PLS prediction. In the application, important issues such as alignment of the profiles from one batch to another and standardization of the predictors are also addressed. For instance, in PPLS noise magnification due to standardization does not seem to create problems as it might in regular PLS. Finally, PPLS is compared to several recently proposed functional PLS and PCR methods and a genetic algorithm for variable selection. More specifically, for a couple of publicly available data sets with near infrared spectra it is shown that overall PPLS has lower cross-validated error than PLS, PCR and the functional modifications thereof, and is similar in performance to a more complex genetic algorithm. Copyright © 2011 John Wiley & Sons, Ltd.

15.
16.
The present work studies the effectiveness of using triacylglycerols (TAGs) to quantify olive oil in blends with vegetable oils. The determinations were obtained using high-performance liquid chromatography (HPLC) coupled to a Charged Aerosol Detector (CAD), in combination with Partial Least Squares (PLS) regression and using interval PLS (iPLS) for variable selection. Results revealed that PLS models can predict olive oil concentrations with reasonable errors. Variable selection through iPLS did not improve predictions significantly, but revealed which chemical information in the chromatogram is important for quantifying olive oil in vegetable oil blends.

17.
Analytical Letters, 2012, 45(12): 1713-1723
The concentrations of three industrial-grade textile dyes were determined in a mixture after degradation by the fungus Ganoderma sp., using UV-Vis spectrophotometry combined with Partial Least Squares regression and using HPLC, and the results obtained from both methods were compared. Using the concentrations calculated from the two methods, a kinetic study of the biodegradation mediated by the fungus was performed. The rate constants and the activation energies for this transformation were obtained for each dye in the mixture. The concentration of Remazol Blue R ESP could be determined by the HPLC method, and the value obtained was comparable with the result using the Partial Least Squares regression method. The Partial Least Squares regression method offers advantages over the HPLC method for the quantification of dyes in textile effluents, as it provides the kinetic parameters of the biodegradation reaction.

18.
19.
Wei Z, Wang J. Analytica Chimica Acta, 2011, 694(1-2): 46-56
A voltammetric electronic tongue (VE-tongue) was developed to detect antibiotic residues in bovine milk. Six antibiotics (Chloramphenicol, Erythromycin, Kanamycin sulfate, Neomycin sulfate, Streptomycin sulfate and Tetracycline HCl) spiked at four different concentration levels (0.5, 1, 1.5 and 2 maximum residue limits (MRLs)) were classified based on the VE-tongue by two pattern recognition methods: principal component analysis (PCA) and discriminant function analysis (DFA). The VE-tongue was composed of five working electrodes (gold, silver, platinum, palladium, and titanium) positioned in a standard three-electrode configuration. Multi-frequency large amplitude pulse voltammetry (MLAPV), which consisted of four segments (1 Hz, 10 Hz, 100 Hz and 1000 Hz), was applied as the potential waveform. The six antibiotics at the MRLs could not be separated from bovine milk completely by PCA, but all the samples were demarcated clearly by DFA. Three regression models, Principal Component Regression (PCR), Partial Least Squares Regression (PLSR), and Least Squares-Support Vector Machines (LS-SVM), were used to predict the antibiotic concentrations. All the regression models performed well, and PCR gave the most stable results.

20.
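Application of the MATLAB language in spectral quantitative analysis   Total citations: 2 (self-citations: 0, citations by others: 2)
MATLAB was used to process quantitative analysis data from ultraviolet-visible absorption spectroscopy and near-infrared diffuse reflectance spectroscopy, with emphasis on the multivariate calibration procedure of the partial least squares method. The approach is simple and practical; it simplifies and optimizes the computation, is efficient, and has good numerical stability.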
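The partial least squares calibration procedure that this abstract emphasises can be sketched with the classical single-response NIPALS algorithm (PLS1). The sketch is written in Python with NumPy rather than MATLAB and is not the article's code; the function name, the number of components and the synthetic data are assumptions.

```python
import numpy as np

def pls1_fit(X, y, n_components):
    """NIPALS PLS1: return regression coefficients B and intercept b0."""
    x_mean = X.mean(axis=0)
    y_mean = y.mean()
    Xk = X - x_mean                      # centred predictors, deflated in the loop
    yk = y - y_mean                      # centred response, deflated in the loop
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xk.T @ yk
        w = w / np.linalg.norm(w)        # weight vector
        t = Xk @ w                       # scores
        tt = t @ t
        p = Xk.T @ t / tt                # X loadings
        c = (yk @ t) / tt                # y loading
        Xk = Xk - np.outer(t, p)         # deflate X
        yk = yk - c * t                  # deflate y
        W.append(w)
        P.append(p)
        q.append(c)
    W = np.array(W).T
    P = np.array(P).T
    q = np.array(q)
    B = W @ np.linalg.solve(P.T @ W, q)  # B = W (P^T W)^-1 q
    b0 = y_mean - x_mean @ B
    return B, b0

# Hypothetical noise-free calibration data standing in for spectra:
# 20 samples, 3 variables, known linear relationship.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 3.0
B, b0 = pls1_fit(X, y, n_components=3)
y_pred = X @ B + b0
```

With as many components as there are independent predictors, PLS1 coincides with ordinary least squares, which gives a convenient sanity check; on real spectra far fewer components than variables would be used and their number chosen by cross-validation.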
