Similar literature
20 similar documents found
1.
The complexity of metabolic profiles makes chemometric tools indispensable for extracting the most significant information. Partial least‐squares discriminant analysis (PLS‐DA) is one of the most effective strategies for data analysis in metabonomics. However, its efficacy is often weakened by the high similarity of metabolic profiles, which contain excessive variables. To rectify this, particle swarm optimization (PSO) was introduced to improve PLS‐DA by simultaneously selecting the optimal sample and variable subsets, the appropriate variable weights, and the best number of latent variables (SVWL), forming a new algorithm named PSO‐SVWL‐PLSDA. Combined with 1H nuclear magnetic resonance‐based metabonomics, PSO‐SVWL‐PLSDA was applied to distinguish patients with lung cancer from healthy controls. PLS‐DA was also investigated for comparison. PLS‐DA yielded recognition rates of 86% and 65% for the training and test sets, respectively, whereas PSO‐SVWL‐PLSDA achieved 98.3% and 90%. Moreover, PSO‐SVWL‐PLSDA identified several of the most discriminative metabolites to aid the diagnosis of lung cancer, including lactate, glucose (α‐glucose and β‐glucose), threonine, valine, taurine, trimethylamine, glutamine, glycoprotein, proline, and lipid. Copyright © 2015 John Wiley & Sons, Ltd.

2.
Damiani PC. Talanta, 2011, 85(3): 1526-1534
A second-order multivariate calibration method combining unfolded partial least-squares (U-PLS) with residual bilinearization (RBL) has been applied to second-order data obtained from excitation-emission fluorescence matrices to determine atenolol in human urine, even in the presence of background interactions and fluorescence inner filter effects, both of which are sample dependent. Atenolol is a cardioselective beta-blocker that is considered a doping agent in shooting sports, so its determination in urine can be required for monitoring the drug. Loss of trilinearity due to analyte-background interactions, which may vary between samples, as well as inner filter effects, precludes the use of methods such as parallel factor analysis (PARAFAC) that cannot handle trilinearity deviations, and justifies the use of U-PLS. Successful analysis required including the background in the calibration set. New urine samples contain unexpected components that differ from those in the calibration set; handling them requires the second-order advantage, which is obtained from a separate procedure known as residual bilinearization (RBL). Satisfactory results were obtained for artificially spiked urine and for real urine samples. They were statistically compared with those obtained with a reference method based on high-performance liquid chromatography (HPLC).

3.
In this paper, we propose a wavelength selection method based on random decision particle swarm optimization with attractor for quantitative analysis of near‐infrared (NIR) spectra. The proposed method is combined with partial least squares (PLS) to construct a prediction model. It chooses either the particle's own current optimum or the current global optimum to calculate the attractor. The particle then updates its flight velocity using the attractor, and the particle state is updated by a random decision based on the new velocity. The root‐mean‐square error of cross‐validation is adopted as the fitness function. To demonstrate the usefulness of the proposed method, PLS with all wavelengths, uninformative variable elimination by PLS, elastic net, genetic algorithm combined with PLS, discrete particle swarm optimization combined with PLS, modified particle swarm optimization combined with PLS, neighboring particle swarm optimization combined with PLS, and the proposed method are used to build quantitative component analysis models for NIR spectral datasets, and the effectiveness of these models is compared. Two application studies are presented, involving NIR data from a meat content determination experiment and from a combustion procedure. The results verify that the proposed method has higher predictive ability for NIR spectral data while selecting fewer wavelengths. It also converges faster and can overcome the premature convergence problem. Furthermore, although improving the prediction precision may sacrifice model complexity to a certain extent, the proposed method overfits only slightly. Copyright © 2015 John Wiley & Sons, Ltd.

4.
Although a number of algorithms have been established to obtain the well‐known second‐order advantage, which allows analytes of interest to be quantified in the presence of interferents, each has associated problems. In this work, for the first time, the optimization procedure of trilinear decomposition is divided into three subparts, and a novel strategy is developed that combines the advantages of the alternating trilinear decomposition (ATLD) algorithm, the self‐weighted alternating trilinear decomposition (SWATLD) algorithm, and the parallel factor analysis (PARAFAC) algorithm. The performance of the proposed strategy was evaluated using a simulated data set, a published fluorescence data set, and a new fluorescence data set for the simultaneous quantification of procaine and tetracaine in plasma. The results show that the novel method can accurately and effectively estimate the qualitative and quantitative information of the analytes of interest. In addition, the resolved profiles are very stable with respect to the number of components as long as the number employed is equal to or larger than the underlying one. The study also confirms that the new strategy gives better prediction than ATLD, SWATLD, and PARAFAC, as well as the strategy that uses the direct trilinear decomposition method to provide initial values for PARAFAC. Moreover, the strategy can be directly extended to third‐order or higher‐order data analysis. Copyright © 2012 John Wiley & Sons, Ltd.

5.
In the present study, boosting has been combined with partial least‐squares discriminant analysis (PLS‐DA) to develop a new pattern recognition method called boosting partial least‐squares discriminant analysis (BPLS‐DA). BPLS‐DA first constructs a series of PLS‐DA models on variously weighted versions of the original calibration set and then combines the predictions of these models by weighted majority vote to obtain the final result. Coupled with near infrared (NIR) spectroscopy, BPLS‐DA has been applied to discriminate between tea varieties. For comparison, conventional principal component analysis, linear discriminant analysis (LDA), and PLS‐DA have also been investigated. Experimental results show that inter‐variety differences can be accurately and rapidly distinguished via NIR spectroscopy coupled with BPLS‐DA. Moreover, the introduction of boosting drastically enhances the performance of an individual PLS‐DA model, and BPLS‐DA is a well‐performing pattern recognition technique superior to LDA. Copyright © 2012 John Wiley & Sons, Ltd.
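
The combination step described in this abstract, a weighted majority vote over the labels predicted by several models, can be sketched as follows. This is a minimal illustration only: the function name, the example labels, and the weights are hypothetical, and the abstract does not say how BPLS‐DA derives its model weights.

```python
from collections import defaultdict


def weighted_majority_vote(predictions, weights):
    """Combine class labels from several models by weighted majority vote.

    predictions: one predicted label per model (hypothetical input)
    weights: one non-negative weight per model, for example derived from
             each model's training error in a boosting scheme (assumption)
    """
    if len(predictions) != len(weights):
        raise ValueError("need one weight per prediction")
    totals = defaultdict(float)
    for label, weight in zip(predictions, weights):
        totals[label] += weight
    # The label with the largest accumulated weight wins.
    return max(totals, key=totals.get)


# Three hypothetical models vote on one tea sample.
votes = ["green", "black", "black"]
model_weights = [0.9, 0.3, 0.4]
winner = weighted_majority_vote(votes, model_weights)
```

Here a single heavily weighted model can outvote two lightly weighted ones, which is the difference between a weighted vote and a simple head count.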

6.
The Poisson‐Boltzmann equation is an important tool for modeling the solvent in biomolecular systems. In this article, we focus on numerical approximations to the electrostatic potential expressed in the regularized linear Poisson‐Boltzmann equation. We expose the flux directly through a first‐order system form of the equation. Using this formulation, we propose a system that yields a tractable least‐squares finite element formulation and establish theory to support this approach. The least‐squares finite element approximation naturally provides an a posteriori error estimator, and we present numerical evidence in support of the method. The computational results highlight optimality in the case of adaptive mesh refinement for a variety of molecular configurations. In particular, we show promising performance for the Born ion, Fasciculin 1, methanol, and a dipole, which highlights the robustness of our approach. © 2009 Wiley Periodicals, Inc. J Comput Chem, 2010

7.
Rotation ambiguity (RA) in multivariate curve resolution (MCR) is an undesirable situation in which the physicochemical constraints are not strong enough to provide a unique resolution of the data matrix of the mixtures into spectra and concentration profiles of the individual chemical components. RA is often encountered in MCR of overlapped chromatographic peaks, kinetic and equilibrium data, and two‐dimensional fluorescence spectra. In the case of RA, a single candidate solution has little practical value, so the whole set of feasible solutions should be characterized. In the general case, this is quite an intricate task. In the present paper, a method is proposed to estimate RA with charged particle swarm optimization (cPSO), a population‐based algorithm. The criteria for updating the particles were modified so that the swarm converges to a steady state that spans the set of feasible solutions. The performance of cPSO‐MCR was demonstrated on test functions, simulated datasets, and real‐world data. Good agreement of the cPSO‐MCR results with the analytical solutions (Borgen plots) was observed. cPSO‐MCR was also shown to be capable of estimating the strength of the constraints and of revealing RA in noisy data. Compared with analytical methods, cPSO‐MCR is simpler to implement, extends to more than three chemical compounds, is immune to noise, and can be easily adapted to virtually all types of constraints and objective functions (constraint based or residue based). cPSO‐MCR also provides natural visual information about the level of RA in spectra and concentration profiles, similar to methods based on two extreme solutions (e.g., MCR‐BANDS). Copyright © 2014 John Wiley & Sons, Ltd.

8.
Gastrodia elata from different geographical origins varies in quality and pharmacological activity. This study focused on the classification and identification of Gastrodia elata from six producing areas using high‐performance liquid chromatography fingerprints combined with boosting partial least‐squares discriminant analysis. Before recognition analysis, principal component analysis was applied to ascertain whether discrimination was possible with the high‐performance liquid chromatography fingerprints. Boosting partial least‐squares discriminant analysis and conventional partial least‐squares discriminant analysis were then applied. Experimental results indicated that the adaptive iteratively reweighted penalized least‐squares algorithm could effectively eliminate the baseline drift of the chromatograms. Compared with partial least‐squares discriminant analysis, the total recognition rates obtained with boosting partial least‐squares discriminant analysis improved from 94 to 100% for the calibration sets and from 86 to 97% for the prediction sets. In conclusion, high‐performance liquid chromatography combined with boosting partial least‐squares discriminant analysis is effective, specific, accurate, and non‐polluting, which gives it an edge in discriminating traditional Chinese medicines from different geographical origins. The proposed methodology is a useful tool for classifying and identifying Gastrodia elata from different geographical origins.

9.
A fast chromatographic methodology is presented for the analysis of three synthetic dyes in non-alcoholic beverages: amaranth (E123), sunset yellow FCF (E110) and tartrazine (E102). Seven soft drinks (purchased from a local supermarket) were homogenized, filtered and injected into the chromatographic system. Second order data were obtained by a rapid LC separation and DAD detection. A comparative study of the performance of two second order algorithms (MCR-ALS and U-PLS/RBL) applied to model the data is presented. Interestingly, the data present time shifts between different chromatograms that cannot be conveniently corrected to determine the above-mentioned dyes in beverage samples. This causes a lack of trilinearity that cannot be conveniently removed by pre-processing and can hardly be modelled with the U-PLS/RBL algorithm. In contrast, MCR-ALS proved to be an excellent tool for modelling this kind of data, reaching acceptable figures of merit. Recovery values between 97% and 105% for artificial and real samples indicated the good performance of the method. Whereas the complete separation consumes 10 mL of methanol and 3 mL of 0.08 mol L−1 ammonium acetate, the proposed fast chromatography method requires only 0.46 mL of methanol and 1.54 mL of 0.08 mol L−1 ammonium acetate. Consequently, analysis time could be reduced to as little as 14.2% of the time needed for the complete separation, saving both solvents and time and thereby reducing both the cost per analysis and the environmental impact.

10.
The partial least squares class model (PLSCM) was recently proposed for multivariate quality control based on a partial least squares (PLS) regression procedure. This paper presents a case study of quality control of peanut oils based on mid‐infrared (MIR) spectroscopy and class models, with three aims: (i) to explain the meaning of PLSCM components and compare PLSCM with soft independent modeling of class analogy (SIMCA); (ii) to correct the estimation of the original PLSCM confidence interval by considering a nonzero intercept term for center estimation; (iii) to investigate the potential of MIR spectroscopy combined with class models for identifying peanut oils with low doping concentrations of other edible oils. It is demonstrated that PLSCM differs from the ordinary PLS procedure in that it estimates the class center and class dispersion in the framework of a latent variable projection model. While SIMCA projects the original variables onto a few dimensions explaining most of the data variance, PLSCM components simultaneously consider the explained variance and the compactness of samples belonging to the same class. The results indicate that PLSCM is an intuitive and easy‐to‐use tool for tackling one‐class problems and has performance comparable to SIMCA. The advantages of PLSCM might be attributed to the great success and well‐established foundations of PLS. For PLSCM, the optimization of model complexity and the estimation of the decision region can be performed as in multivariate calibration routines. Copyright © 2011 John Wiley & Sons, Ltd.

11.
The determination of rate constants for consecutive irreversible reactions is a difficult and time‐consuming problem, especially when the study extends to many subsequent products. The derivation of proper mathematical expressions would therefore greatly facilitate the determination of these rate constants when only the rate constant of the first consecutive reaction is known. Many authors have dealt with this problem in the past, but judging from recent publications the issue is still of interest to the scientific community. This paper aims to extend the available mathematical expressions for rate constant ratios of consecutive reactions beyond the current limit of three reactions, and to offer a simple graphical estimation of the rate constant ratios that exploits the maximum of each intermediate product. Furthermore, the method extends to the derivation of a generic formula for estimating the rate constant ratios based on this graphical approach. The approach was validated against experimental data found in the literature.
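
The idea of reading a rate constant ratio off the maximum of an intermediate can be illustrated with the textbook two‐step case A → B → C with first‐order steps, where the peak fraction of B depends only on κ = k2/k1 through [B]max/[A]0 = κ^(κ/(1−κ)). The sketch below inverts that relation numerically. It is not the paper's generic formula, which the abstract does not give; the function names and the bisection bounds are assumptions made for illustration.

```python
import math


def b_max_fraction(kappa):
    """Peak [B]/[A]0 for A -> B -> C with kappa = k2/k1 (first-order steps)."""
    if abs(kappa - 1.0) < 1e-12:
        return math.exp(-1.0)  # limiting value when k1 == k2
    return kappa ** (kappa / (1.0 - kappa))


def kappa_from_peak(fraction, lo=1e-6, hi=1e6, iterations=200):
    """Invert b_max_fraction by bisection; the peak falls as kappa grows."""
    if not 0.0 < fraction < 1.0:
        raise ValueError("peak fraction must lie strictly between 0 and 1")
    for _ in range(iterations):
        mid = math.sqrt(lo * hi)  # bisect on a logarithmic scale
        if b_max_fraction(mid) > fraction:
            lo = mid  # peak too high, so kappa must be larger
        else:
            hi = mid
    return math.sqrt(lo * hi)


# A hypothetical intermediate that peaks at 25% of the initial reactant.
ratio = kappa_from_peak(0.25)
```

With k1 known, as the abstract assumes, k2 would follow as `ratio * k1`.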

12.
This paper presents a modified version of the NIPALS algorithm for PLS regression with a single response variable. This version, denoted CF‐PLS, provides significant advantages over standard PLS. First, it strongly reduces the over‐fit of the regression. Second, R2 for the null hypothesis follows a Beta distribution that is a function only of the number of observations, which allows a probabilistic framework to be used to test the validity of a component. Third, the models generated with CF‐PLS have comparable if not better prediction ability than the models fitted with NIPALS. Finally, the scores and loadings of CF‐PLS are directly related to R2, which makes the model and its interpretation more reliable. Copyright © 2011 John Wiley & Sons, Ltd.

13.
Well‐established, linear multivariate calibration methods such as multivariate least‐squares regression (MLR), principal component regression (PCR), or partial least squares (PLS) have two limitations: (i) measured data must be linearly related to the response variables and (ii) predictor variables x_n (n = 1, …, N) cannot be coupled to each other. To evaluate nonlinear data, these restrictions need to be overcome, and polynomial multivariate least‐squares regression (PMLR or “response surfaces”) is therefore introduced here. PMLR is based on multivariate least squares but incorporates all combinations of predictor variables up to a user‐selected polynomial order (e.g., including u or v = 0). Because such coupled terms and their powers are included, PMLR models are better suited to nonlinear data and can help to enhance the accuracy and precision of the prediction step. PMLR is based on MLR because, unlike PCR or PLS, MLR facilitates a physical and chemical interpretation of the predictors. Hence, the origins and the relevance of nonlinear and/or coupled predictors can be investigated. The details of the PMLR algorithm and its implementation are presented along with a method for model optimization utilizing gradients of response surfaces. Newly developed PMLR models up to quintic order have been applied to predict a chromatograph's peak resolution as a function of six instrument parameters. It has been demonstrated that PMLR is better able than MLR and PCR to describe these nonlinear and coupled instrument parameters. In addition, the novel software tool has been used for model optimization to determine the instrument parameters that result in the best chromatographic resolution. Copyright © 2011 John Wiley & Sons, Ltd.
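
One way to picture the "all combinations of predictor variables up to a user‐selected polynomial order" described in this abstract is as an expanded design row that contains every monomial of the predictors up to that total order. The sketch below builds such a row. It is an assumption about how the terms might be enumerated, not the authors' implementation, and `polynomial_terms` is a hypothetical name.

```python
from itertools import combinations_with_replacement
from math import prod


def polynomial_terms(x, order):
    """Return all products of the predictors in x up to the given total order.

    The constant term (empty product) comes first, followed by linear terms,
    then coupled and power terms such as x1*x2 and x1**2.
    """
    terms = []
    for degree in range(order + 1):
        for combo in combinations_with_replacement(range(len(x)), degree):
            terms.append(prod(x[i] for i in combo))
    return terms


# Two hypothetical predictors expanded to second order:
# constant, x1, x2, x1*x1, x1*x2, x2*x2
row = polynomial_terms([2.0, 3.0], order=2)
```

Under this reading, six predictors expanded to quintic order would give 462 terms per sample, which suggests why an ordinary least‐squares fit on such terms would need a generous number of calibration samples.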

14.
In developing partial least squares calibration models, selecting the number of latent variables so as to minimize both model bias and model variance remains a challenge. Several metrics exist for incorporating these trade‐offs, but the cost of model parsimony and the effect of potential underfitting on achievable prediction errors are difficult to anticipate. We propose a metric that penalizes growing model variance against decreasing bias as additional latent variables are added. The magnitude of the penalty is scaled by a user‐defined parameter that is formulated to constrain the fractional increase in root mean square error of cross‐validation (RMSECV) when a parsimonious model is selected over the conventional minimum‐RMSECV solution. We evaluate this approach for quantification of four organic functional groups using 238 laboratory standards and 750 complex atmospheric organic aerosol mixtures with mid‐infrared spectroscopy. Parametric variation of this penalty demonstrates that the increase in prediction errors due to underfitting is bounded by the magnitude of the penalty for samples similar to the laboratory standards used for model training and validation. Imposing an ensemble of penalties corresponding to a 0–30% allowable increase in RMSECV through sum of ranking differences leads to the selection of a model that increases the actual RMSECV by up to 20% for laboratory standards but achieves an 85% reduction in the mean error in predicted concentrations for environmental mixtures. Partial least squares models developed with laboratory mixtures can provide useful predictions in complex environmental samples, but may benefit from protection against overfitting. © 2015 The Authors. Journal of Chemometrics published by John Wiley & Sons Ltd.
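
The constraint described in this abstract, a bounded fractional increase in RMSECV in exchange for a more parsimonious model, can be illustrated with a simple threshold rule: take the fewest latent variables whose RMSECV stays within the allowance above the minimum. This is a simplified stand‐in and not the authors' penalty metric, whose form the abstract does not give; the function name and the example curve are hypothetical.

```python
def most_parsimonious(rmsecv, allowed_increase):
    """Pick the smallest number of latent variables within the allowance.

    rmsecv: RMSECV values, where rmsecv[i] belongs to a model with i + 1
            latent variables (hypothetical input)
    allowed_increase: permitted fractional rise above the minimum RMSECV,
                      for example 0.10 for 10%
    """
    if not rmsecv:
        raise ValueError("rmsecv must not be empty")
    limit = min(rmsecv) * (1.0 + allowed_increase)
    for index, value in enumerate(rmsecv):
        if value <= limit:
            return index + 1  # number of latent variables
    raise AssertionError("unreachable: the minimum always satisfies the limit")


# Hypothetical cross-validation curve for models with 1 to 6 latent variables.
curve = [0.90, 0.52, 0.43, 0.41, 0.40, 0.42]
chosen = most_parsimonious(curve, allowed_increase=0.10)
```

Setting `allowed_increase=0.0` recovers the conventional minimum‐RMSECV choice, so the allowance acts as a single knob between the two selection rules.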

15.
16.
The present study demonstrated the possibility of using the ytterbium (Yb)‐based internal standard near‐infrared (NIR) spectroscopic measurement technique coupled with multivariate calibration for the quantitative analysis of tea, including total free amino acids and total polyphenols. Yb, a rare earth element, is intended to compensate for the spectral variation induced by changes in sample quantity during the spectral measurement of the powdered samples. Boosting was combined with least‐squares support vector regression (LS‐SVR), forming boosting least‐squares support vector regression (BLS‐SVR), for the multivariate calibration task. The results showed that tea quality could be accurately and rapidly determined via Yb‐based internal standard NIR spectroscopy combined with the BLS‐SVR method. Moreover, the introduction of boosting drastically enhanced the performance of individual LS‐SVR models, and BLS‐SVR compared favorably with partial least‐squares regression. Copyright © 2013 John Wiley & Sons, Ltd.

17.
Comprehensive two‐dimensional gas chromatography with flame ionization detection combined with unfolded partial least squares is proposed as a simple, fast and reliable method to assess the quality of gasoline and to detect potential adulterants. Blends of gasoline with kerosene, white spirit and paint thinner, which are frequently used adulterants, serve as calibration samples. The data for the calibration set are first baseline corrected using a two‐dimensional asymmetric least squares algorithm. The number of significant partial least squares components used to build the model, four, is determined from the minimum root‐mean‐square error of leave‐one‐out cross‐validation. Regression coefficients of 0.996–0.998, root‐mean‐square errors of prediction of 0.005–0.010 and relative errors of prediction of 1.54–3.82% for the calibration set show the reliability of the developed method. In addition, the method is externally validated with three samples in a validation set (with a relative error of prediction below 10.0%). Finally, to test the applicability of the proposed strategy to real samples, five gasoline samples collected from gas stations were analyzed, and their gasoline proportions were in the range of 70–85%. The relative standard deviations were below 8.5% for the different samples in the prediction set.
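
The figures of merit quoted in this abstract can be computed along the following lines. Definitions of the relative error of prediction vary between papers, and the abstract does not state which one was used; the sketch assumes the root‐mean‐square error divided by the mean reference value, expressed as a percentage. The function names and sample values are hypothetical.

```python
import math


def rmse(reference, predicted):
    """Root-mean-square error between reference and predicted values."""
    if len(reference) != len(predicted) or not reference:
        raise ValueError("need two non-empty sequences of equal length")
    squared = [(p - r) ** 2 for r, p in zip(reference, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def relative_error_percent(reference, predicted):
    """RMSE as a percentage of the mean reference value (assumed definition)."""
    mean_reference = sum(reference) / len(reference)
    return 100.0 * rmse(reference, predicted) / mean_reference


# Hypothetical reference and predicted values for two validation samples.
known = [10.0, 10.0]
estimated = [11.0, 9.0]
error = rmse(known, estimated)
rep = relative_error_percent(known, estimated)
```

The same `rmse` function gives a cross‐validation error when `predicted` holds the left‐out predictions rather than predictions for an external set.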

18.
Isotropic and anisotropic magnetizabilities for noble gas atoms and a series of singlet and triplet molecules were calculated using the second‐order Douglas‐Kroll‐Hess (DKH2) Hamiltonian containing the vector potential A and, in part, second‐order generalized unrestricted Møller‐Plesset (GUMP2) theory. The DKH2 Hamiltonian was resolved into three parts (spin‐free terms, spin‐dependent terms, and magnetic perturbation terms), and the magnetizabilities were decomposed into diamagnetic and paramagnetic terms to investigate the relativistic and electron‐correlation effects in detail. For Ne, Kr, and Xe, the calculated magnetizabilities approached the experimental values once relativistic and electron‐correlation effects were included. For the IF molecule, the magnetizability was strongly affected by the spin‐orbit interaction, and the total relativistic contribution amounted to 22%. For group 17, 16, 15, and 14 hydrides, the calculated relativistic effects were small (less than 3%), and trends were observed in relativistic and electron‐correlation effects across groups and periods. The magnetizability anisotropies of triplet molecules were generally larger than those of similar singlet molecules. The so‐called relativistic‐correlation interference for the magnetizabilities computed using the relativistic GUMP2 method can be neglected for the molecules evaluated, with the exception of triplet SbH. © 2009 Wiley Periodicals, Inc. J Comput Chem, 2009

19.
In this paper, fault detection and identification methods based on semi‐supervised Laplacian regularization kernel partial least squares (LRKPLS) are proposed. In the Laplacian regularization learning framework, unlabeled and labeled samples are used together to improve the estimate of the data manifold so that a more robust data model can be established. We show that LRKPLS can avoid the over‐fitting problem that may be caused by insufficient samples and the presence of outliers. Moreover, the proposed LRKPLS approach places no special restriction on the data distribution; in other words, it can be used with nonlinear or non‐Gaussian data. On the basis of LRKPLS, corresponding fault detection and identification methods are proposed. These methods are used to monitor a numerical example and the Hot Galvanizing Pickling Waste Liquor Treatment Process (HGPWLTP), and the case studies show the effectiveness of the proposed approaches. Copyright © 2016 John Wiley & Sons, Ltd.

20.
To separate a high‐performance liquid chromatography with diode array detector (HPLC‐DAD) data set into chromatogram peaks and spectra for all compounds, this paper proposes a separation method based on the generalized Gaussian reference curve measurement (GGRCM) model and the multi‐target intermittent particle swarm optimization (MIPSO) algorithm. A parameter θ is constructed to generate a reference curve r(θ) for a chromatogram peak based on its physical principle. The GGRCM model is proposed to calculate the fitness ε(θ) for every θ, which indicates how likely the HPLC‐DAD data set is to contain a chromatogram peak similar to r(θ). The smaller the fitness, the higher the likelihood. The MIPSO algorithm is then introduced to calculate the optimal parameters by minimizing this fitness. Finally, chromatogram peaks are constructed from these optimal parameters, and the spectra are calculated by an estimator. The simulations and experiments lead to the following conclusions: (i) the GGRCM‐MIPSO method can extract chromatogram peaks from a simulated data set without knowing the number of compounds in advance, even when severe overlap and white noise are present, and (ii) the GGRCM‐MIPSO method can be applied to real HPLC‐DAD data sets. Copyright © 2014 John Wiley & Sons, Ltd.
