Similar literature
 Found 20 similar documents; search time: 15 ms
1.
In multivariate calibration of spectral datasets, variable selection is often applied to identify a relevant subset of variables, leading to improved prediction accuracy and easier interpretation of the selected fingerprint regions. Numerous variable selection methods have been proposed, but choosing among them is not trivial. Furthermore, in many cases the set of variables found by these methods may not be robust because of irreproducibility and uncertainty, which poses a great challenge for improving the reliability of variable selection. In this study, the reproducibility of five variable selection methods was investigated quantitatively to evaluate their performance. Reproducibility was quantified using Monte-Carlo sub-sampling (MCS) together with a quantitative similarity measure designed for highly collinear spectral datasets. The investigation of the reproducibility and prediction accuracy of these algorithms on two near-infrared (NIR) datasets showed that the methods varied widely in performance, especially in their ability to identify a consistent subset of variables from the spectral datasets. A thorough assessment of reproducibility together with the predictive accuracy of the identified variables therefore improves the statistical validity of, and confidence in, the selection outcome, which conventional evaluation schemes cannot address.
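
The sub-sampling idea in this abstract can be illustrated with a minimal sketch. It is not the authors' procedure: the abstract does not specify the five selection methods or the similarity measure for collinear spectra, so the sketch assumes a toy correlation-based selector (`top_k_by_correlation`) and plain Jaccard similarity as stand-ins, and it runs on made-up data.

```python
import random
from itertools import combinations


def jaccard(a, b):
    """Plain Jaccard similarity of two index sets (stand-in similarity measure)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def top_k_by_correlation(X, y, k):
    """Toy selector (hypothetical): keep the k columns most correlated with y."""
    n = len(y)
    mean_y = sum(y) / n
    syy = sum((v - mean_y) ** 2 for v in y)
    scored = []
    for j in range(len(X[0])):
        col = [row[j] for row in X]
        mean_x = sum(col) / n
        sxx = sum((v - mean_x) ** 2 for v in col)
        sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(col, y))
        r = sxy / (sxx * syy) ** 0.5 if sxx > 0 and syy > 0 else 0.0
        scored.append((abs(r), j))
    scored.sort(reverse=True)
    return frozenset(j for _, j in scored[:k])


def mcs_reproducibility(X, y, selector, n_runs=30, fraction=0.8, seed=0):
    """Mean pairwise similarity of the subsets selected on random sub-samples."""
    rng = random.Random(seed)
    n = len(y)
    size = max(2, int(fraction * n))
    selected = []
    for _ in range(n_runs):
        idx = rng.sample(range(n), size)  # sub-sample without replacement
        selected.append(selector([X[i] for i in idx], [y[i] for i in idx]))
    pairs = list(combinations(selected, 2))
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


# Synthetic demo data (made up): y depends on columns 0 and 1 only.
rng = random.Random(1)
X = [[rng.gauss(0, 1) for _ in range(6)] for _ in range(40)]
y = [3 * row[0] - 2 * row[1] + rng.gauss(0, 0.1) for row in X]
score = mcs_reproducibility(X, y, lambda Xs, ys: top_k_by_correlation(Xs, ys, 2))
print(f"mean pairwise Jaccard similarity: {score:.2f}")
```

A score near 1 means the selector returns almost the same variables on every sub-sample; a low score signals the kind of irreproducibility the abstract describes. For real spectra, a similarity measure that tolerates neighbouring, collinear wavelengths would be more appropriate than plain Jaccard.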

2.
We present a novel algorithm for linear multivariate calibration that can generate good prediction results. It rests on the idea that test samples can be expressed as mixtures of the calibration samples in proper proportions. The algorithm is based on this mixed model of samples and is therefore called the MMS algorithm. With both theoretical support and analysis of two data sets, it is demonstrated that the MMS algorithm produces lower prediction errors than the partial least squares PLS2 model and has prediction performance similar to that of PLS1. In the background anti-interference test, the MMS algorithm performs better than PLS2. When some component information is missing, the MMS algorithm is more robust than PLS2.

3.
Halide and thiocyanate ions can be determined by precipitation titration with silver nitrate as the titrant, with the end-point evaluated potentiometrically, generally using a silver electrode as the indicator electrode and a double-junction Ag–AgCl electrode as the reference electrode. However, when mixtures of halide and thiocyanate are titrated, it is difficult to determine the components individually because the steps in the potentiometric titration curves overlap, especially when the concentrations of the components differ markedly. In this paper, a linear equation for the potentiometric precipitation titration of a mixture of halide and thiocyanate ions was developed and then used, with the aid of multivariate calibration methods, to determine the components of the mixtures simultaneously. Using this model, 27 synthetic mixtures with three- and four-component combinations of chloride, bromide, iodide and thiocyanate at low concentration levels, from 1.8×10⁻⁴ to 6.2×10⁻⁴ mol l⁻¹, were analyzed and acceptable results were obtained.

4.
This work describes a novel experimental design for building a calibration set made up of samples containing different numbers of components. The algorithm iterates to keep the number of samples as low as possible while ensuring a homogeneous presence of all the concentration levels. The mixture design was applied to a drug system composed of one to four components in different combinations. The system was resolved by three multivariate UV spectrophotometric methods using principal component regression (PCR) and partial least squares (PLS1 and PLS2) algorithms. The calibration set comprised 61 references at four concentration levels, including 15 samples for each of the quaternary, ternary and binary compositions and 16 one-component samples. The calibration models were optimized through careful selection of the number of factors and the wavelength zones, so as to remove interference from instrumental noise and from excipients present in the pharmaceutical formulations. The predictive power of the regression models was verified and compared by analysis of an external prediction set. The models were finally used to assay pharmaceutical specialities containing the studied drugs in one- to four-component formulations.

5.
Multivariate calibration problems often involve identifying a meaningful subset of variables from a vast number of variables for better prediction of the output variables. A new graph-theoretic method based on partial correlations, the variable interaction network (VIN), is proposed. Many well-studied representative calibration datasets spanning different application domains are selected for investigating its performance. Partial least squares (PLS) regression models combined with variable selection techniques are employed for benchmarking. Subsets with different numbers of variables are retained for the final analysis after VIN selection, and progressive prediction accuracies are used for comparison. VIN-PLS results show significant improvement in prediction efficiency and variable subset optimization. An improvement of up to 45% over existing methods, with significantly fewer variables, is achieved using the new method. The advantages of VIN-based variable selection are highlighted.

6.
The partial least squares regression method has been applied to the simultaneous spectrophotometric determination of harmine, harmane, harmalol and harmaline in Peganum harmala L. (Zygophyllaceae) seeds. The effect of pH was optimized using multivariate definitions of selectivity and sensitivity, and the best results were obtained in basic media (pH > 9). The number of latent variables in the calibration models was optimized by cross-validation. Determinations were made over the concentration range 0.15–10 μg mL⁻¹. The proposed method was validated by applying it to the analysis of the β-carbolines in synthetic quaternary mixtures in media at pH 9 and 11. The relative standard errors of prediction were less than 4% in most cases. Analysis of P. harmala seeds by the proposed models for their content of the β-carboline derivatives gave 1.84%, 0.16%, 0.25% and 3.90% for harmine, harmane, harmaline and harmalol, respectively. The results were validated against an existing HPLC method, and no significant differences were observed between the results of the two methods.

7.
Near-infrared (NIR) spectroscopy in conjunction with chemometric techniques allows on-line monitoring in real time, which can be of considerable use in industry. If it is to be correctly used in industrial applications, generally some basic considerations need to be taken into account, although this does not always apply. This study discusses some of the considerations that would help evaluate the possibility of applying multivariate calibration in combination with NIR to properties of industrial interest. Examples of these considerations are whether there is a relation between the NIR spectrum and the property of interest, what the calibration constraints are and how a sample-specific error of prediction can be quantified. Various strategies for maintaining a multivariate model after it has been installed are also presented and discussed.

8.
A novel alternative for the simultaneous determination of compounds with similar structures is described, using whole chemiluminescence-time profiles, acquired by the stopped-flow technique, in combination with multivariate calibration. The proposed method is based on the chemiluminescent oxidation of morphine and naloxone by potassium permanganate in an acidic medium, using formaldehyde as co-factor. The whole chemiluminescence-time profiles, acquired using the stopped-flow technique in a continuous-flow system, allowed the time-resolved chemiluminescence (CL) data to be used with multivariate calibration techniques such as partial least squares (PLS) for the quantitative determination of both opiate narcotics in binary mixtures. To ensure the additivity of the CL profiles and to obtain CL profiles for the two drugs that are as well separated in time as possible, the optimum chemical conditions for the CL emission were investigated. The effect of common emission enhancers on the CL emission obtained in the oxidation of these compounds in different acidic media was studied. The conditions selected were sulphuric acid 1.0 mol L⁻¹, permanganate 0.2 mmol L⁻¹ and formaldehyde 0.8 mol L⁻¹. A calibration set of standard samples was designed by combining a factorial design, with three levels for each factor, and a central composite design. Finally, to validate the proposed chemometric method, a prediction set of binary samples was prepared. Using the proposed multivariate calibration method, the analytes were determined in synthetic samples with recoveries of 97–109%.

9.
Most multivariate calibration methods, such as partial least squares (PLS) or the Tikhonov regularization variant ridge regression (RR), require selection of tuning parameters. Tuning parameter values determine the direction and magnitude of the respective model vectors, thereby setting the resultant prediction abilities of the model vectors. Simultaneously, tuning parameter values establish the corresponding bias/variance and the underlying selectivity/sensitivity tradeoffs. Selection of the final tuning parameter is often accomplished through some form of cross-validation, and the resultant root mean square error of cross-validation (RMSECV) values are evaluated. However, selection of a “good” tuning parameter with this one model evaluation merit is almost impossible. Including additional model merits assists tuning parameter selection to provide better balanced models and allows a reasonable comparison between calibration methods. Using multiple merits requires decisions on how to combine and weight the merits into an information criterion, and an abundance of options is possible. Presented in this paper is the sum of ranking differences (SRD) to ensemble a collection of model evaluation merits varying across tuning parameters. It is shown that the SRD consensus ranking of model tuning parameters allows automatic selection of the final model, or of a collection of models if so desired. Essentially, the user’s preference for the degree of balance between bias and variance ultimately decides the merits used in SRD and hence the tuning parameter values ranked lowest by SRD for automatic selection. The SRD process is also shown to allow simultaneous comparison of different calibration methods for a particular data set in conjunction with tuning parameter selection. Because SRD evaluates consistency across multiple merits, decisions on how to combine and weight merits are avoided. To demonstrate the utility of SRD, a near-infrared spectral data set and a quantitative structure-activity relationship (QSAR) data set are evaluated using PLS and RR.
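
The generic SRD calculation can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: rows are taken to be merits and columns candidate models (tuning parameter values), the merits are assumed to be already scaled to comparable ranges, and the row-wise mean is used as the reference when none is supplied. The abstract does not say how the paper scales merits or chooses its reference.

```python
def average_ranks(values):
    """Rank values ascending (1 = smallest); tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def srd(matrix, reference=None):
    """Sum of ranking differences for each column of matrix.

    matrix[i][j] is the (scaled) value of merit i for candidate model j.
    reference is one value per merit; the row mean is assumed if omitted.
    """
    n_rows, n_cols = len(matrix), len(matrix[0])
    if reference is None:
        reference = [sum(row) / n_cols for row in matrix]
    ref_ranks = average_ranks(reference)
    scores = []
    for j in range(n_cols):
        col_ranks = average_ranks([matrix[i][j] for i in range(n_rows)])
        scores.append(sum(abs(a - b) for a, b in zip(col_ranks, ref_ranks)))
    return scores


# Made-up scaled merit values: 4 merits (rows) for 3 candidate models (columns).
merits = [
    [0.10, 0.30, 0.90],
    [0.20, 0.25, 0.10],
    [0.60, 0.55, 0.20],
    [0.90, 0.80, 0.40],
]
scores = srd(merits)
best = min(range(len(scores)), key=lambda j: scores[j])
print(scores, "-> candidate with the lowest SRD:", best)
```

A column with a smaller SRD orders the merits more like the reference does, which is the sense in which the abstract speaks of the tuning parameter values "ranked lowest by SRD".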

10.
Two spectrophotometric methods for the determination of ethinylestradiol (ETE) and levonorgestrel (LEV) using the multivariate calibration techniques of partial least squares (PLS) and principal component regression (PCR) are presented. In this study, PLS and PCR are successfully applied to quantify both hormones using the information contained in the absorption spectra of appropriate solutions. To do this, a calibration set of standard samples composed of different mixtures of both compounds was designed. The results obtained by applying the PLS and PCR methods to the simultaneous determination of mixtures containing 4–11 μg ml⁻¹ of ETE and 2–23 μg ml⁻¹ of LEV are reported. Five different oral contraceptives were analyzed, and the results were very similar to those obtained by a reference liquid chromatographic method.

11.
A new cut-off criterion has been proposed for the selection of uninformative variables prior to chemometric partial least squares (PLS) modelling. After variable elimination, PLS regressions were built and assessed by comparing the results with those obtained by PLS models based on the full spectral range. To assess the prediction capabilities, uninformative variable elimination (UVE)-PLS and PLS were applied to diffuse reflectance near-infrared spectra of heroin samples. The proposed cut-off criterion, based on Student's t-distribution, gave PLS models with predictive capabilities similar to those obtained using the original criterion based on a quantile value. However, the repeatability of the number of selected variables was improved significantly.

12.
This paper describes a new procedure for the determination of the quinolones ciprofloxacin and sarafloxacin in chicken muscle samples. It is based on a previously developed capillary zone electrophoresis (CZE) separation, in which all the quinolones regulated by EU Council Regulation number 2377/90 could be separated. However, as ciprofloxacin and sarafloxacin coelute in the CZE run and have strongly overlapped spectra, separation between them is not possible. To overcome this problem, we used a multivariate calibration procedure, partial least squares regression (PLS-2), applied to the spectra obtained at the maximum of the electrophoretic peaks with a diode array detector. The method has been validated with a combination of pure standards and fortified blank chicken muscle extracts. The recoveries obtained in the validation set were 101±6% and 93±6% for sarafloxacin and ciprofloxacin, respectively. The method has also been applied to chicken muscle samples fortified at concentration levels between 100 and 350 μg kg⁻¹, corresponding to values near the maximum residue level (MRL) regulated by the European Community.

13.
Variable (wavelength or feature) selection has become a critical step in the analysis of datasets with a large number of variables and relatively few samples. In this study, a novel variable selection strategy, variable combination population analysis (VCPA), was proposed. The strategy consists of two crucial procedures. First, the exponentially decreasing function (EDF), a simple and effective embodiment of the ‘survival of the fittest’ principle from Darwin’s theory of natural evolution, is employed to determine the number of variables to keep and to shrink the variable space continuously. Second, in each EDF run, the binary matrix sampling (BMS) strategy, which gives each variable the same chance of being selected and generates different variable combinations, is used to produce a population of subsets from which a population of sub-models is constructed. Model population analysis (MPA) is then employed to find the variable subsets with lower root mean square error of cross-validation (RMSECV). The frequency with which each variable appears in the best 10% of sub-models is computed; the higher the frequency, the more important the variable. The performance of the proposed procedure was investigated using three real NIR datasets. The results indicate that VCPA is a good variable selection strategy when compared with four high-performing variable selection methods: genetic algorithm–partial least squares (GA–PLS), Monte Carlo uninformative variable elimination by PLS (MC-UVE-PLS), competitive adaptive reweighted sampling (CARS) and iteratively retaining informative variables (IRIV). The MATLAB source code of VCPA is available for academic research on the website: http://www.mathworks.com/matlabcentral/fileexchange/authors/498750.
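
The two building blocks named in this abstract, EDF and BMS, together with the frequency count over the best sub-models, can be sketched roughly as below. This is an illustration under assumptions rather than the published MATLAB code: the function names, the parameterisation of the decay (chosen here so that `n_left` variables remain after `n_runs` runs) and the demo numbers are all hypothetical, and the sub-model errors would in practice come from cross-validated PLS models, which are not implemented here.

```python
import math
import random


def edf_schedule(n_vars, n_runs, n_left):
    """Number of variables kept after each EDF run (exponential decay).

    Assumed form: kept_i = n_vars * exp(-theta * i), with theta chosen so that
    n_left variables remain after the final run.
    """
    theta = math.log(n_vars / n_left) / n_runs
    return [max(n_left, round(n_vars * math.exp(-theta * i)))
            for i in range(1, n_runs + 1)]


def binary_matrix_sampling(n_vars, n_subsets, ratio, rng):
    """Binary matrix with one row per subset; every variable (column) is
    included in the same number of subsets, so all have an equal chance."""
    ones = round(ratio * n_subsets)
    columns = []
    for _ in range(n_vars):
        col = [1] * ones + [0] * (n_subsets - ones)
        rng.shuffle(col)  # permute each column independently
        columns.append(col)
    # Transpose so that each row describes one variable subset (sub-model).
    return [[columns[j][i] for j in range(n_vars)] for i in range(n_subsets)]


def frequency_in_best(subsets, errors, top_fraction=0.1):
    """Fraction of the best sub-models (lowest error) that contain each variable."""
    n_best = max(1, int(top_fraction * len(subsets)))
    best = sorted(range(len(subsets)), key=lambda i: errors[i])[:n_best]
    n_vars = len(subsets[0])
    return [sum(subsets[i][j] for i in best) / n_best for j in range(n_vars)]


# Demo with made-up sizes; the errors are random placeholders, not RMSECV values.
rng = random.Random(0)
schedule = edf_schedule(n_vars=100, n_runs=50, n_left=14)
subsets = binary_matrix_sampling(n_vars=10, n_subsets=20, ratio=0.5, rng=rng)
placeholder_errors = [rng.random() for _ in subsets]
print(schedule[:5], schedule[-1])
print(frequency_in_best(subsets, placeholder_errors))
```

In a full run, variables with low frequency among the best sub-models would be dropped until only the number given by the schedule remains, and the sampling would then be repeated on the reduced variable space.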

14.
Flow injection analysis (FIA) with multiwavelength scanning of the FIA peaks using a diode array detector (DAD) has been combined with a multivariate calibration approach that applies the partial least squares (PLS) method for data evaluation. In this way, various side effects such as dilution of the reagent, a high blank, absorbance changes due to the pH gradient throughout the peak and other interferences can be accounted for. Thus, satisfactory multicomponent analysis results are obtained even with a simple FIA manifold. The method has been checked on the analysis of binary (Ca and Mg) and ternary (Ca, Mg and Cu) mixtures with pyridylazo resorcinol (PAR) as reagent and applied to the rapid determination of calcium and magnesium in dialysis liquids and waters.

15.
This study examined the feasibility of using near-infrared (NIR) spectroscopy as a rapid method for the qualitative and quantitative assessment of tea quality. NIR spectroscopy with the soft independent modeling of class analogy (SIMCA) method was proposed to identify tea varieties rapidly. In the experiment, four tea varieties, Longjing, Biluochun, Qihong and Tieguanyin, were studied. The best results were as follows: the identification rate was 90% for Longjing in the training set and 80% for Biluochun in the test set, while all the remaining rates were 100%. A partial least squares (PLS) algorithm was used to predict the content of caffeine and total polyphenols in tea. The models were calibrated by cross-validation, and the optimal number of PLS factors was chosen according to the lowest root mean square error of cross-validation (RMSECV). The correlation coefficient (R) and the root mean square error of prediction (RMSEP) in the test set were used to evaluate the models: R = 0.9688 and RMSEP = 0.0836% for caffeine; R = 0.9299 and RMSEP = 1.1138% for total polyphenols. The overall results demonstrate that NIR spectroscopy with multivariate calibration can be applied successfully as a rapid method not only to identify tea varieties but also to determine several chemical constituents of tea simultaneously.
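
The two evaluation figures quoted in this abstract follow standard definitions, which the short sketch below computes for a test set. The reference and predicted values are made up for illustration and are not data from the study.

```python
import math


def rmsep(y_true, y_pred):
    """Root mean square error of prediction over a test set."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def pearson_r(y_true, y_pred):
    """Pearson correlation coefficient between reference and predicted values."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)


# Hypothetical caffeine contents (%): reference values vs. model predictions.
reference = [2.1, 2.8, 3.4, 3.9, 4.5]
predicted = [2.2, 2.7, 3.5, 3.8, 4.6]
print(f"R = {pearson_r(reference, predicted):.4f}, "
      f"RMSEP = {rmsep(reference, predicted):.4f}%")
```

RMSEP carries the units of the predicted property, which is why the abstract reports it in percent, whereas R is dimensionless.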

16.
A robustness study of a sensitive stacking capillary electrophoresis method based on “acetonitrile stacking” was carried out. Ten variables (pH, acetonitrile and triethanolamine in the buffer, injection time, injection pressure, acetonitrile and NaCl in the sample, capillary and tray temperature, and separation voltage), whose levels were varied by 10% around the nominal level, were examined with a Plackett–Burman design (a two-level design). The effects on the responses, corrected peak area and resolution, were calculated and interpreted using three statistical approaches: dummy variables, distribution of effects (Dong’s algorithm) and calibration curve. Dong’s method was found to be the most suitable for evaluating robustness, since it considers qualitative (resolution) and quantitative (corrected peak area) responses and does not need a minimum number of dummy variables in the experimental design. From these studies, we deduced that the first four variables were significant at 10% around the nominal level, and a new design was therefore made with those four variables at 5% around the nominal level. Only two variables then proved significant for the resolution between some peaks, so system suitability test limits were defined for these resolutions.
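
How effects are obtained from a two-level Plackett–Burman design, and how Dong's algorithm estimates their standard error, can be sketched as follows. The abstract does not state the design size, so a 12-run design (ten real factors plus one dummy column) is assumed here; the response values are invented, and the final significance step, which multiplies the standard error by a tabulated t critical value, is left out.

```python
import math
from statistics import median

# Standard generator row for the 12-run Plackett-Burman design.
PB12_GENERATOR = [1, 1, -1, 1, 1, 1, -1, -1, -1, 1, -1]


def plackett_burman_12():
    """12 runs x 11 columns: 11 cyclic shifts of the generator plus a row of -1."""
    g = PB12_GENERATOR
    rows = [[g[(j - s) % 11] for j in range(11)] for s in range(11)]
    rows.append([-1] * 11)
    return rows


def effects(design, response):
    """Effect of each column: mean response at +1 minus mean response at -1."""
    half = len(design) / 2
    n_cols = len(design[0])
    return [sum(row[j] * y for row, y in zip(design, response)) / half
            for j in range(n_cols)]


def dong_standard_error(effect_values):
    """Dong's estimate of the standard error of an effect.

    s0 = 1.5 * median(|E|); effects with |E| > 2.5 * s0 are set aside as
    probably significant, and s1 is the root mean square of the rest.
    """
    s0 = 1.5 * median(abs(e) for e in effect_values)
    kept = [e for e in effect_values if abs(e) <= 2.5 * s0]
    return math.sqrt(sum(e * e for e in kept) / len(kept))


design = plackett_burman_12()
# Invented response: only the first factor matters (shifts the response by +/-2).
response = [10 + 2 * row[0] for row in design]
print(effects(design, response))
print(dong_standard_error([1, -1, 1, -1, 10]))
```

An effect would then be judged significant when its absolute value exceeds the margin of error, that is, the standard error multiplied by the t critical value for the number of effects kept.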

17.
This work concerns the validation of a previously described multivariate method for determining chlorophylls and their corresponding pheopigments. The meaning of the term validation is discussed, and the work is divided into two parts: model validation and method validation. The model validation showed that 40 standards are sufficient to ensure that the Y-domain is adequately spanned, and that differentiation of the data improves the models. The wavelength range was restricted to 510–770 nm, thus eliminating interfering signals from carotenoids that had not been included in the calibration solutions. This restriction does not affect the predictive ability for any analyte except pheophytin a. For accurate predictions of pheophytin a, the wavelength region between 350 and 415 nm was included in the model. All model evaluations were based on partial least squares regression for one y-variable (PLS1). One criterion used to quantify the performance of the model was the deviation, an estimate of the variance of sample predictions that takes into account the model’s predictive ability, the leverage and the x-residuals. In the method validation section, predictions of samples by the proposed method are compared with results obtained using an HPLC reference method. For chlorophyll a, the root mean square error of cross-validation (RMSECV) calculated from the model was found to be several times higher than the corresponding root mean square error of prediction (RMSEP) calculated from the HPLC analysis. A likely explanation is that the RMSECV is determined in the presence of severely interfering compounds, a desired consequence of spanning the Y-space. Samples were extracted (then measured and predicted) from algal cultures representing six different taxonomic divisions of phytoplankton. The pigment composition of these species is known, so the analyst knows in advance which chlorophylls are present. Predictions by the models are consistent with a priori knowledge of the pigment composition. To evaluate the potential of these models to deal with data recorded by different instruments, the absorption spectra of a set of samples were recorded with two instruments. The results show a minor, negligible bias between the predictions obtained using these instruments, probably due to a slight shift in the wavelengths recorded by them.

18.
A voltammetric method is proposed for the simultaneous determination of tryptophan, cysteine, and tyrosine using multivariate calibration techniques. Various electrodes and voltammetric techniques were explored to ascertain the optimum measurement strategy. Among them, differential pulse voltammetry (DPV) with a Pt electrode was selected as the analytical technique since it provided a suitable compromise between sensitivity and reproducibility while allowing the oxidation peaks of the three compounds to be reasonably discriminated. The sensitivity of DPV with a Pt electrode for Trp standards was 8.4×10⁻² A l mol⁻¹, the repeatability 3.7% and the detection limit below 10⁻⁷ M. The lack of full selectivity of the voltammetric data was overcome using multivariate calibration methods on the basis of the differences in the voltammetric waves of each compound. The accuracy of predictions was evaluated preliminarily from the analysis of three-component synthetic mixtures. Subsequently, this method was applied to the analysis of oxidizable amino acids in feed samples. Results obtained were in good concordance with those given by the standard method using an amino acid analyzer.

19.
In this paper, multivariate calibration of complicated process fluorescence data is presented. Two data sets related to the production of white sugar are investigated. The first data set comprises 106 observations and 571 spectral variables, and the second 268 observations and 3997 spectral variables. In both applications, a single response, ash content, is modelled and predicted as a function of the spectral variables. Both data sets contain features that make multivariate calibration non-trivial. The objective is to show how principal component analysis (PCA) and partial least squares (PLS) regression can be used to overview the data sets and to establish predictively sound regression models. It is shown how a recently developed technique for signal filtering, orthogonal signal correction (OSC), can be applied in multivariate calibration to enhance predictive power. In addition, signal compression is tested on the larger data set using wavelet analysis. It is demonstrated that compression down to 4% of the original matrix size (in the variable direction) is possible without loss of predictive power. It is concluded that the combination of OSC for pre-processing and wavelet analysis for compression of spectral data is promising for future use.

20.
The determination of the contents of therapeutic drugs, metabolites and other important biomedical analytes in biological samples is usually performed by high-performance liquid chromatography (HPLC). Modern multivariate calibration methods constitute an attractive alternative, even when they are applied to intrinsically unselective spectroscopic or electrochemical signals. First-order (i.e., vectorized) data are conveniently analyzed with classical chemometric tools such as partial least squares (PLS). Certain analytical problems require more sophisticated models, such as artificial neural networks (ANNs), which are especially able to cope with non-linearities in the data structure. Finally, models based on the acquisition and processing of second- or higher-order data (i.e., matrices or higher-dimensional data arrays) present the phenomenon known as the “second-order advantage”, which permits quantitation of calibrated analytes in the presence of interferents. The latter models show immense potential in the field of biomedical analysis. Pertinent literature examples are reviewed.
