Similar literature
 20 similar documents found
1.
The on‐line monitoring of batch processes based on principal component analysis (PCA) has been widely studied. Nonetheless, researchers have paid far less attention to the on‐line application of partial least squares (PLS). In this paper, the influence of several issues on the predictive power of a PLS model for the on‐line estimation of key variables in a batch process is studied. Some of the conclusions can help to better understand the capabilities of the proposals presented for on‐line PCA‐based monitoring. Issues such as the convenience of batch‐wise or variable‐wise unfolding, the method for the imputation of future measurements and the use of several sub‐models are addressed. This is the first time that the adaptive hierarchical (or multi‐block) approach is extended to PLS modelling. Also, the formulation of the so‐called trimmed scores regression (TSR), a powerful imputation method defined for PCA, is extended for its application with PLS modelling. Data from two processes, one simulated and one real, are used to illustrate the results. Copyright © 2008 John Wiley & Sons, Ltd.

2.
The well‐known Martens factorization for PLS1 produces a single y‐related score, with all subsequent scores being y‐unrelated. The X‐explanatory value of these y‐orthogonal scores can be summarized by a simple expression, which is analogous to the ‘P’ loading weights in the orthogonalized NIPALS algorithm. This can be used to rearrange the factorization into entirely y‐related and y‐unrelated parts. Systematic y‐unrelated variation can thus be removed from the X data through a single post hoc calculation following conventional PLS, without any recourse to the orthogonal projections to latent structures (OPLS) algorithm. The work presented is consistent with the development by Ergon (PLS post‐processing by similarity transformation (PLS + ST): a simple alternative to OPLS. J. Chemom. 2005; 19 : 1–4), which shows that conventional PLS and OPLS are equivalent within a similarity transform. Copyright © 2009 John Wiley & Sons, Ltd.

3.
This paper presents a modified version of the NIPALS algorithm for PLS regression with a single response variable. This version, denoted CF‐PLS, provides significant advantages over standard PLS. First, it strongly reduces the over‐fit of the regression. Second, R² under the null hypothesis follows a Beta distribution that depends only on the number of observations, which allows the use of a probabilistic framework to test the validity of a component. Third, the models generated with CF‐PLS have a prediction ability comparable to, if not better than, that of the models fitted with NIPALS. Finally, the scores and loadings of CF‐PLS are directly related to R², which makes the model and its interpretation more reliable. Copyright © 2011 John Wiley & Sons, Ltd.

4.
Target projection (TP), also called target rotation (TR), was introduced to facilitate interpretation of latent‐variable regression models. Orthogonal partial least squares (OPLS) regression and PLS post‐processing by similarity transform (PLS + ST) represent two alternative algorithms for the same purpose. In addition, OPLS and PLS + ST provide components to explain systematic variation in X orthogonal to the response. We show that, for the same number of components, OPLS and PLS + ST provide score and loading vectors for the predictive latent variable that are the same as for TP except for a scaling factor. Furthermore, we show how the TP approach can be extended to become a hybrid of latent‐variable (LV) regression and exploratory LV analysis and thus embrace systematic variation in X unrelated to the response. Principal component analysis (PCA) of the residual variation after removal of the target component is here used to extract the orthogonal components, but X‐tended TP (XTP) permits other criteria for decomposition of the residual variation. If PCA is used for decomposing the orthogonal variation in XTP, the variance of the major orthogonal components obtained for OPLS and XTP is observed to be almost the same, showing the close relationship between the methods. The XTP approach is tested and compared with OPLS for a three‐component mixture analyzed by infrared spectroscopy and a multicomponent mixture measured by near infrared spectroscopy in a reactor. Copyright © 2008 John Wiley & Sons, Ltd.

5.
The issue of outer model weight updating is important in extending partial least squares (PLS) regression to modelling data that show significant non‐linearity. This paper presents a novel co‐evolutionary component approach to the weight updating problem. Specification of the non‐linear PLS model is achieved using an evolutionary computational (EC) method that can co‐evolve all non‐linear inner models and all input projection weights simultaneously. In this method, modular symbolic non‐linear equations are used to represent the inner models and binary sequences are used to represent the projection weights. The approach is flexible, and other representations could be employed within the same co‐evolutionary framework. The potential of these methods is illustrated using a simulated pH neutralisation process data set exhibiting significant non‐linearity. It is demonstrated that the co‐evolutionary component architecture can produce results that are competitive with non‐linear neural network‐based PLS algorithms that use iterative projection weight updating. In addition, a data sampling method for mitigating overfitting to the training data is described. Copyright © 2007 John Wiley & Sons, Ltd.

6.
The complexity of metabolic profiles makes chemometric tools indispensable for extracting the most significant information. Partial least‐squares discriminant analysis (PLS‐DA) is one of the most effective strategies for data analysis in metabonomics. However, its actual efficacy in metabonomics is often weakened by the high similarity of metabolic profiles, which contain excessive variables. To rectify this situation, particle swarm optimization (PSO) was introduced to improve PLS‐DA by simultaneously selecting the optimal sample and variable subsets, the appropriate variable weights, and the best number of latent variables (SVWL) in PLS‐DA, forming a new algorithm named PSO‐SVWL‐PLSDA. Combined with 1H nuclear magnetic resonance‐based metabonomics, PSO‐SVWL‐PLSDA was applied to distinguish patients with lung cancer from healthy controls. PLS‐DA was also investigated for comparison. PLS‐DA yielded recognition rates of 86% and 65% for the training and test sets, respectively, whereas PSO‐SVWL‐PLSDA achieved 98.3% and 90%. Moreover, several of the most discriminative metabolites were identified by PSO‐SVWL‐PLSDA to aid the diagnosis of lung cancer, including lactate, glucose (α‐glucose and β‐glucose), threonine, valine, taurine, trimethylamine, glutamine, glycoprotein, proline, and lipid. Copyright © 2015 John Wiley & Sons, Ltd.

7.
The combination of unfolded partial least‐squares (U‐PLS) with residual bilinearization (RBL) provides a second‐order multivariate calibration method capable of achieving the second‐order advantage. RBL is performed by varying the test sample scores in order to minimize the residuals of a combined U‐PLS model for the calibrated components and a principal component model for the potential interferents. The sample scores are then employed to predict the analyte concentration, with regression coefficients taken from the calibration step. When the contribution of multiple potential interferents is severe, particle swarm optimization (PSO) helps to prevent RBL from being trapped in false minima, restoring its predictive ability and making it comparable to standard parallel factor (PARAFAC) analysis. Both simulated and experimental systems are analyzed in order to show the potential of the new technique. Copyright © 2007 John Wiley & Sons, Ltd.

8.
In this paper, we propose a wavelength selection method based on random decision particle swarm optimization with attractor for quantitative analysis of near‐infrared (NIR) spectra. The proposed method is combined with partial least squares (PLS) to construct a prediction model. The method chooses either the particle's own current best or the current global best to calculate the attractor. The particle then updates its flight velocity using the attractor, and the particle state is updated by a random decision based on the new velocity. The root‐mean‐square error of cross‐validation is adopted as the fitness function. To demonstrate the usefulness of the proposed method, PLS with all wavelengths, uninformative variable elimination by PLS, elastic net, genetic algorithm combined with PLS, discrete particle swarm optimization combined with PLS, modified particle swarm optimization combined with PLS, neighboring particle swarm optimization combined with PLS, and the proposed method are used to build quantitative component analysis models for NIR spectral datasets, and the effectiveness of these models is compared. Two application studies are presented, which involve NIR data obtained from a meat content determination experiment and from a combustion procedure. The results verify that the proposed method has higher predictive ability for NIR spectral data and selects fewer wavelengths. The proposed method converges faster and can overcome the premature convergence problem. Furthermore, although improving prediction precision may come at some cost in model complexity, the proposed method is only slightly overfitted. Copyright © 2015 John Wiley & Sons, Ltd.

9.
Near‐infrared spectroscopy (NIRS) has been used in nutritional metabolomics fingerprinting to assess the intake of intervention breakfasts prepared with four different vegetable oils that had previously been subjected to a deep frying process of 20 cycles of 5 min at 180°C. The target oils were an extra virgin olive oil and three varieties of refined sunflower oil. Of the latter three, one was used as such, another was spiked with a synthetic oxidation inhibitor (dimethylsiloxane) and the last was enriched with an extract of phenolic compounds from olive pomace, the antioxidant properties of which are well known. Urine sampled from individuals before intake and 2 and 4 h after intake was directly analyzed by NIRS to obtain fingerprint characteristics of the metabolome composition. The resulting urinary patterns were combined for statistical analysis by unsupervised and supervised approaches. Partial least squares‐class modeling made it possible to develop class models for each intervention breakfast, thus achieving discrimination of urinary fingerprints from individuals after breakfast intake. The models were statistically characterized by estimating sensitivity and specificity parameters for the training and evaluation (validation) steps. Applying the variable importance in projection algorithm made it possible to detect the spectral regions most significant in explaining the variability observed in the partial least squares class models. Quantitative differences in variable importance in projection scores discriminated among the different classes under study. Copyright © 2013 John Wiley & Sons, Ltd.

10.
《Analytical letters》2012,45(2):257-280
A procedure is proposed for selecting the wavelength range and the number of factors to be used in partial least squares (PLS) calibration; it involves calculating the prediction residual sum of squares (PRESS) under different conditions. The best model is chosen by considering the minimum PRESS value and whether it differs significantly from that of the corresponding model with fewer factors. The ability of the proposed method to minimize errors in PLS prediction is demonstrated by applying it to the resolution of phenytoin (DPH) and phenobarbital (PB) binary mixtures with errors less than 2.8%; the results are compared with those obtained using another wavelength selection procedure. The ensuing method, which was validated by high performance liquid chromatography (HPLC), also gives good results with real samples (pharmaceutical preparations).
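
As a rough illustration of the quantity this abstract relies on, PRESS is conventionally the sum of squared cross‐validated prediction errors. The notation below is assumed for illustration only and is not taken from the paper: n calibration samples, h factors, \hat{y}_{i,h} the prediction for sample i from an h‐factor model built without that sample, and h* the number of factors giving the minimum PRESS. The abstract does not state which significance test is used; a ratio of PRESS values compared against an F critical value is one common choice and is shown here only as an assumption.

```latex
\mathrm{PRESS}(h) = \sum_{i=1}^{n} \bigl( y_i - \hat{y}_{i,h} \bigr)^{2},
\qquad
F(h) = \frac{\mathrm{PRESS}(h)}{\mathrm{PRESS}(h^{*})}, \quad h < h^{*}
```

Under this reading, the model with the fewest factors h whose F(h) stays below the chosen critical value would be preferred over the minimum‐PRESS model.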

11.
In developing partial least squares calibration models, selecting the number of latent variables used for their construction to minimize both model bias and model variance remains a challenge. Several metrics exist for incorporating these trade‐offs, but the cost of model parsimony and the potential for underfitting on achievable prediction errors are difficult to anticipate. We propose a metric that penalizes growing model variance against decreasing bias as additional latent variables are added. The magnitude of the penalty is scaled by a user‐defined parameter that is formulated to provide a constraint on the fractional increase in root mean square error of cross‐validation (RMSECV) when selecting a parsimonious model over the conventional minimum RMSECV solution. We evaluate this approach for quantification of four organic functional groups using 238 laboratory standards and 750 complex atmospheric organic aerosol mixtures with mid‐infrared spectroscopy. Parametric variation of this penalty demonstrates that increase in prediction errors due to underfitting is bounded by the magnitude of the penalty for samples similar to laboratory standards used for model training and validation. Imposing an ensemble of penalties corresponding to a 0–30% allowable increase in RMSECV through sum of ranking differences leads to the selection of a model that increases the actual RMSECV up to 20% for laboratory standards but achieves an 85% reduction in the mean error in predicted concentrations for environmental mixtures. Partial least squares models developed with laboratory mixtures can provide useful predictions in complex environmental samples, but may benefit from protection against overfitting. © 2015 The Authors. Journal of Chemometrics published by John Wiley & Sons Ltd.
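
The constraint described in this abstract (a bounded fractional increase in RMSECV relative to the minimum‐RMSECV model) can be sketched in a few lines of Python. This is a simplified stand‐in, not the authors' penalty metric: the function names `rmsecv` and `select_parsimonious` and the parameter `alpha` are hypothetical, and the rule shown is just "the fewest latent variables whose RMSECV is within a factor (1 + alpha) of the minimum".

```python
import math


def rmsecv(y_true, y_pred):
    """Root mean square error of cross-validation from held-out predictions."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def select_parsimonious(rmsecv_by_lv, alpha):
    """Smallest number of latent variables within a fractional RMSECV increase.

    rmsecv_by_lv[k] is the RMSECV of the model with k + 1 latent variables.
    alpha is the allowed fractional increase over the minimum RMSECV
    (hypothetical parameter; alpha = 0 gives the conventional minimum).
    """
    if alpha < 0:
        raise ValueError("alpha must be non-negative")
    limit = (1.0 + alpha) * min(rmsecv_by_lv)
    for k, err in enumerate(rmsecv_by_lv):
        if err <= limit:
            return k + 1
    raise ValueError("rmsecv_by_lv must not be empty")


if __name__ == "__main__":
    # Made-up RMSECV curve for models with 1..6 latent variables.
    curve = [0.50, 0.32, 0.25, 0.21, 0.20, 0.22]
    for alpha in (0.0, 0.10, 0.30):
        print(alpha, select_parsimonious(curve, alpha))
```

Larger values of `alpha` trade a bounded loss in cross‐validated error for a simpler model, which is the trade‐off the abstract describes.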

12.
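Using ordinary maize kernels as the test material, characteristic wavelengths were selected from near‐infrared spectral data by a genetic algorithm combined with partial least squares (PLS) regression, and a PLS regression calibration model was then built to determine the starch content of maize kernels from these characteristic wavelengths. The results show that, for the calibration model based on 11 characteristic wavelengths, the calibration error (RMSEC), cross‐validation error (RMSECV) and prediction error (RMSEP) were 0.30%, 0.35% and 0.27%, respectively, and the correlation coefficients between predicted and measured values for the calibration set and the independent test set reached 0.9279 and 0.9390, respectively. Compared with the prediction model built on the full spectrum, prediction accuracy improved in every respect, indicating that spectral feature selection with a genetic algorithm and PLS yields a simpler and better model and offers a new method and approach for the near‐infrared determination of starch content in maize kernels and for the processing of infrared spectral data.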

13.
In several scientific applications, data are generated from two or more diverse sources (views) with the goal of predicting an outcome of interest. Often the outcome is not associated with any single view. However, the synergy of all measurements from each view may yield a more predictive classifier. For example, consider a drug discovery application in which individual molecules are described partially by several assay screens based on diverse profiles and partially by their chemical structural fingerprints. A common classification problem is to determine whether the molecule is associated with a particular disease. In this paper, a co‐training algorithm is developed to utilize data from diverse sources to predict the common class variable. Novel enhancements for variable importance, robustness to a mislabeled class variable, and a technique to handle unbalanced classes are applied to the motivating data set, highlighting that the approach attains strong performance and provides useful diagnostics for data analytic purposes. In addition, comparisons with a data fusion framework using partial least squares (PLS) are made on real data. An R package for performing the proposed approach is provided as Supporting information. Copyright © 2003 John Wiley & Sons, Ltd.

14.
In the present study, boosting has been combined with partial least‐squares discriminant analysis (PLS‐DA) to develop a new pattern recognition method called boosting partial least‐squares discriminant analysis (BPLS‐DA). BPLS‐DA is implemented by first constructing a series of PLS‐DA models on various weighted versions of the original calibration set and then combining the predictions of the constructed PLS‐DA models by weighted majority vote to obtain the integrated result. Coupled with near infrared (NIR) spectroscopy, BPLS‐DA has been applied to discriminate different tea varieties. For comparison with BPLS‐DA, conventional principal component analysis, linear discriminant analysis (LDA), and PLS‐DA have also been investigated. Experimental results show that differences between varieties can be accurately and rapidly distinguished via NIR spectroscopy coupled with BPLS‐DA. Moreover, the introduction of boosting drastically enhances the performance of an individual PLS‐DA, and BPLS‐DA is a well‐performing pattern recognition technique, superior to LDA. Copyright © 2012 John Wiley & Sons, Ltd.
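
The weighted majority vote mentioned in this abstract can be sketched as follows. The abstract does not say how the member models are weighted, so the AdaBoost‐style weight ln((1 − e)/e) used here is an assumption, and the names `model_weight` and `weighted_majority_vote` are hypothetical. The member PLS‐DA models themselves are outside the sketch; only their predicted labels and error rates are needed.

```python
import math
from collections import defaultdict


def model_weight(error_rate, eps=1e-10):
    """AdaBoost-style vote weight for a member model (assumed weighting).

    The error rate is clipped away from 0 and 1 so the logarithm stays finite.
    """
    e = min(max(error_rate, eps), 1.0 - eps)
    return math.log((1.0 - e) / e)


def weighted_majority_vote(predictions, weights):
    """Return the label with the largest total weight across member models.

    predictions: one predicted class label per member model.
    weights: the vote weight of each member model, in the same order.
    """
    totals = defaultdict(float)
    for label, w in zip(predictions, weights):
        totals[label] += w
    return max(totals, key=totals.get)


if __name__ == "__main__":
    # Three hypothetical member models voting on one tea sample.
    labels = ["green", "black", "black"]
    errors = [0.05, 0.40, 0.40]
    weights = [model_weight(e) for e in errors]
    print(weighted_majority_vote(labels, weights))
```

In this made‐up example the single accurate model outweighs the two weak ones, which is the point of weighting votes instead of counting them.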

15.
《Electroanalysis》2005,17(10):915-918
The voltammetric behavior of isoniazid and hydrazine at an overoxidized polypyrrole modified glassy carbon electrode has been investigated. The cyclic voltammograms obtained showed that their oxidation peaks overlapped, making it difficult to determine them individually in a mixture without separation. To overcome this limitation, a procedure was proposed for resolving overlapped voltammetric signals from mixtures of isoniazid and hydrazine. In this procedure, a genetic algorithm was used to select the potentials for partial least squares. A feed‐forward artificial neural network with the error back‐propagation algorithm was used to process the nonlinear relationship between currents and concentrations of hydrazine and isoniazid. The proposed method was suitable for the determination of isoniazid in pharmaceutical tablets and the detection of hydrazine impurities in the same samples.

16.
In recent years the number of spectroscopic studies utilizing multivariate techniques and involving different laboratories has increased dramatically. In this paper a protocol is established for calibration transfer of a partial least squares regression model between high‐resolution nuclear magnetic resonance (NMR) spectrometers of different frequencies and equipped with different probes. A previously published quantitative model for predicting the concentration of blended soy species in sunflower lecithin was used as the test system. For multivariate modelling, piecewise direct standardization (PDS), direct standardization, and hybrid calibration were employed. PDS showed the best performance for estimating lecithin falsification regarding its vegetable origin, resulting in a significant decrease in root mean square error of prediction from 5.0–7.3% without standardization to 2.9–3.2% with PDS. An acceptable calibration transfer model was obtained by direct standardization, but this standardization approach introduces unfavourable noise into the spectral data. Hybrid calibration is the least recommended for high‐resolution NMR data. The sensitivity of instrument transfer methods with respect to the type of spectrometer, the number of samples and the subset selection is also discussed. The study shows the necessity of applying a proper standardization procedure when a multivariate model has to be applied to spectra recorded on a secondary NMR spectrometer, even one with the same magnetic field strength. Copyright © 2016 John Wiley & Sons, Ltd.

17.
A rapid method was developed and validated by ultra‐performance liquid chromatography–triple quadrupole mass spectrometry with ultraviolet detection (UPLC‐UV‐MS) for simultaneous determination of paris saponin I, paris saponin II, paris saponin VI and paris saponin VII. Partial least squares discriminant analysis (PLS‐DA) based on UPLC and Fourier transform infrared (FT‐IR) spectroscopy was employed to evaluate Paris polyphylla var. yunnanensis (PPY) at different harvesting times. Quantitative determination implied that the differing contents of bioactive compounds at different harvesting times may lead to different pharmacological effects; the average content of total saponins for PPY harvested at 8 years was higher than that of the other samples. The PLS‐DA of FT‐IR spectra performed better than that of UPLC for discrimination of PPY from different harvesting times.

18.
Rotation ambiguity (RA) in multivariate curve resolution (MCR) is an undesirable situation in which the physicochemical constraints are not sufficiently strong to provide a unique resolution of the data matrix of the mixtures into spectra and concentration profiles of individual chemical components. RA is often met in MCR of overlapped chromatographic peaks, kinetic and equilibrium data, and fluorescence two‐dimensional spectra. In the case of RA, a single candidate solution has little practical value, so the whole set of feasible solutions should be characterized. This is quite an intricate task in the general case. In the present paper, a method is proposed to estimate RA with charged particle swarm optimization (cPSO), a population‐based algorithm. The criteria for updating the particles were modified so that the swarm converged to a steady state that spanned the set of feasible solutions. The performance of cPSO‐MCR was demonstrated on test functions, simulated datasets, and real‐world data. Good agreement of the cPSO‐MCR results with the analytical solutions (Borgen plots) was observed. cPSO‐MCR was also shown to be capable of estimating the strength of the constraints and of revealing RA in noisy data. Compared with analytical methods, cPSO‐MCR is simpler to implement, extends to more than three chemical compounds, is immune to noise, and can be easily adapted to virtually all types of constraints and objective functions (constraint based or residue based). cPSO‐MCR also provides natural visual information about the level of RA in spectra and concentration profiles, similar to the methods of two extreme solutions (e.g., MCR‐BANDS). Copyright © 2014 John Wiley & Sons, Ltd.

19.
Herein, electromembrane extraction was combined with ultraviolet spectrophotometry using a customized manifold for preconcentration and simultaneous determination of morphine, codeine, and papaverine in water and human urine samples. Absorption spectra of the extracts were recorded inside the lumen of the hollow fiber using two fiber optics connected to a miniature spectrophotometer. Partial least squares regression was applied to resolve the overlapped spectra of the analytes. Performance of the model was validated with an independent test set. Central composite design was applied to optimize the extraction parameters. The optimized extraction conditions were as follows: supporting liquid membrane, 2‐nitrophenyl octyl ether containing 15% v/v bis(2‐ethylhexyl) phosphate; applied voltage, 80 V; donor pH, 3.0; acceptor pH, 1.0; extraction time, 20 min. Finally, the optimized extraction method was validated for determination of the mentioned alkaloids in human urine samples. The method showed good linearity (R > 0.995) for all of the mentioned alkaloids. The limits of detection for morphine, codeine, and papaverine in diluted human urine were found to be 0.6, 1.1, and 0.6 ng/mL, respectively, with acceptable relative standard deviations. Enrichment factors of 104, 108, and 102 were achieved for morphine, codeine, and papaverine, respectively.

20.
Shao Xueguang, Chen Da, Xu Heng, Liu Zhichao, Cai Wensheng 《中国化学》 (Chinese Journal of Chemistry) 2009,27(7):1328-1332
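Partial least squares (PLS) plays an important role in quantitative near‐infrared (NIR) spectral analysis, but its predictions are easily affected by factors such as sample partitioning and outlying samples, so its robustness is limited. The multi‐model PLS (EPLS) method improves model robustness, but it cannot identify outliers among the samples. To improve both the prediction accuracy and the robustness of the model, this paper proposes a multi‐model PLS method that resamples according to sampling probabilities, called robust consensus PLS (RE‐PLS). The method uses iteratively reweighted partial least squares (IRPLS) to calculate the regression residuals of the samples and from them obtains a sampling probability for each calibration sample; training subsets are then selected according to these sampling probabilities to build multiple PLS models, and finally the predictions of all the PLS models are averaged to give the final prediction. The method was applied to NIR spectral modelling of two different kinds of plant samples and compared with conventional PLS and EPLS. The results show that the method effectively avoids the influence of outliers in the calibration set on the model while improving prediction accuracy and robustness. For real tobacco samples with complex NIR spectra and many outliers, modelling with simple PLS or EPLS does not give satisfactory predictions, whereas RE‐PLS, with its particular advantages, is expected to find wide application in the quantitative analysis of such complex spectra.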
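
The resampling and averaging steps described in this abstract can be sketched in Python as below. Several parts are assumptions and not the authors' formulas: the mapping from residuals to sampling probabilities in `sampling_probabilities` is a hypothetical down‐weighting of large residuals, subsets are drawn with replacement, and the member models are represented as plain callables so that no particular PLS implementation is implied.

```python
import random
from statistics import mean, median


def sampling_probabilities(residuals):
    """Map regression residuals to sampling probabilities (assumed weighting).

    Samples with large residuals relative to the median absolute residual
    receive a small probability, so likely outliers are rarely drawn.
    """
    scale = median(abs(r) for r in residuals) or 1.0
    raw = [1.0 / (1.0 + (r / scale) ** 2) for r in residuals]
    total = sum(raw)
    return [w / total for w in raw]


def draw_subset(probs, size, rng):
    """Draw training-sample indices with replacement according to probs."""
    return rng.choices(range(len(probs)), weights=probs, k=size)


def consensus_predict(models, x):
    """Average the predictions of all member models for one input x."""
    return mean(model(x) for model in models)


if __name__ == "__main__":
    rng = random.Random(0)
    # Made-up residuals; the last calibration sample behaves like an outlier.
    probs = sampling_probabilities([0.1, -0.1, 0.1, 5.0])
    print([round(p, 4) for p in probs])
    print(draw_subset(probs, 6, rng))
    # Two stand-in member models (any callables returning a prediction).
    print(consensus_predict([lambda x: 1.0, lambda x: 3.0], 0.0))
```

In a full implementation each drawn subset would be used to fit one PLS model, and `consensus_predict` would then average those models' predictions.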
