Similar literature
20 similar records found.
1.
《Microchemical Journal》2008,88(2):119-127
An optimized model of multivariate classification for the monitoring of eighteen spring waters in the land of Serra St. Bruno, Calabria, Italy, has been developed. Thirty analytical parameters for each water source were investigated and reduced to eight by means of Principal Component Analysis (PCA). The springs were grouped into five distinct classes by cluster analysis (CA), and a model for their classification was built by a Partial Least Squares–Discriminant Analysis (PLS–DA) procedure. The model was optimized and validated and then applied to new data matrices containing the analytical parameters measured on the same sources during successive years. The model was able to detect deviations in the global analytical characteristics by revealing, over time, a different distribution of the samples within the classes. Variation in nitrate concentration was shown to be the main cause of the observed class shifts. The shifting sources were located in areas used as arable land, and the high variability of nitrate content was ascribed to the practice of crop rotation, which involves a varying use of nitrogenous chemical fertilizers.

2.
In this study, chemometric techniques such as cluster analysis (CA), discriminant analysis (DA), principal component analysis (PCA) and partial least squares (PLS) were used to analyse the wastewater dataset to identify the factors which affect the composition of sewage of domestic origin, spatial and temporal variations, similarity/dissimilarity among the wastewater characteristics of cis- and trans-drains and discriminating variables. Samples collected from 24 wastewater drains in Lucknow city and from three sites on Gomti river in the month of January/February, May, August and November during the period of 5 years (1994-1999) were characterized for 32 parameters. The multivariate techniques successfully described the similarities/dissimilarities among the sewage drains on the basis of their wastewater characteristics and sources signifying the effect of routine domestic/commercial activities in respective drainage areas. Spatial and seasonal variations in wastewater composition were also determined successfully. CA generated six groups of drains on the basis of similar wastewater characteristic. PCA provided information on seasonal influence and compositional differences in sewage generated by domestic and industrial waste dominated drains and showed that drains influenced by mixed industrial effluents have high organic pollution load. DA rendered six variables (TDS, alkalinity, F, TKN, Cd and Cr) discriminating between cis- and trans-drains. PLS-DA showed dominance of Cd, Cr, NO3, PO4 and F in cis-drains wastewater. The results suggest that biological-process based STPs could treat wastewater both from the cis- as well as trans-drains, however, prior removal of toxic metals will be required from the cis-drains sewage. Further, seasonal variations in wastewater composition and pollution load could be the guiding factor for determining the STPs design parameters. 
The information generated would be useful in selection of process type and in designing of the proposed sewage treatment plants (STPs) for safe disposal of wastewater.  相似文献   

3.
The aim of this work was to obtain the correct classification of a set of two-dimensional polyacrylamide gel electrophoresis map images using the Zernike moments as discriminant variables. For each 2D-PAGE image, the Zernike moments were computed up to a maximum p order of 100. Partial least squares discriminant analysis with variable selection, based on a backward elimination algorithm, was applied to the moments calculated in order to select those that provided the lowest error in cross-validation. The new method was tested on four datasets: (1) samples belonging to neuroblastoma; (2) samples of human lymphoma; (3) samples from pancreatic cancer cells (two cell lines of control and drug-treated cancer cells); (4) samples from colon cancer cells (total lysates and nuclei treated or untreated with a histone deacetylase inhibitor). The results demonstrate that the Zernike moments can be successfully applied for fast classification purposes. The final aim is to build models that can be used to achieve rapid diagnosis of these illnesses.  相似文献   

4.
The use of near infrared (NIR) hyperspectral imaging and hyperspectral image analysis for distinguishing between hard, intermediate and soft maize kernels from inbred lines was evaluated. NIR hyperspectral images of two sets (12 and 24 kernels) of whole maize kernels were acquired using a Spectral Dimensions MatrixNIR camera with a spectral range of 960-1662 nm and a sisuChema SWIR (short wave infrared) hyperspectral pushbroom imaging system with a spectral range of 1000-2498 nm. Exploratory principal component analysis (PCA) was used on absorbance images to remove background, bad pixels and shading. On the cleaned images, PCA could be used effectively to find histological classes including glassy (hard) and floury (soft) endosperm. PCA illustrated a distinct difference between glassy and floury endosperm along principal component (PC) three on the MatrixNIR and PC two on the sisuChema with two distinguishable clusters. Subsequently partial least squares discriminant analysis (PLS-DA) was applied to build a classification model. The PLS-DA model from the MatrixNIR image (12 kernels) resulted in root mean square error of prediction (RMSEP) value of 0.18. This was repeated on the MatrixNIR image of the 24 kernels which resulted in RMSEP of 0.18. The sisuChema image yielded RMSEP value of 0.29. The reproducible results obtained with the different data sets indicate that the method proposed in this paper has a real potential for future classification uses.  相似文献   

5.
In this paper, multivariate calibration of complicated process fluorescence data is presented. Two data sets related to the production of white sugar are investigated. The first data set comprises 106 observations and 571 spectral variables, and the second data set 268 observations and 3997 spectral variables. In both applications, a single response, ash content, is modelled and predicted as a function of the spectral variables. Both data sets contain certain features making multivariate calibration efforts non-trivial. The objective is to show how principal component analysis (PCA) and partial least squares (PLS) regression can be used to overview the data sets and to establish predictively sound regression models. It is shown how a recently developed technique for signal filtering, orthogonal signal correction (OSC), can be applied in multivariate calibration to enhance predictive power. In addition, signal compression is tested on the larger data set using wavelet analysis. It is demonstrated that a compression down to 4% of the original matrix size — in the variable direction — is possible without loss of predictive power. It is concluded that the combination of OSC for pre-processing and wavelet analysis for compression of spectral data is promising for future use.  相似文献   

6.
Using a series of thirteen organic materials that includes novel high-nitrogen energetic materials, conventional organic military explosives, and benign organic materials, we have demonstrated the importance of variable selection for maximizing residue discrimination with partial least squares discriminant analysis (PLS-DA). We built several PLS-DA models using different variable sets based on laser induced breakdown spectroscopy (LIBS) spectra of the organic residues on an aluminum substrate under an argon atmosphere. The model classification results for each sample are presented and the influence of the variables on these results is discussed. We found that using the whole spectra as the data input for the PLS-DA model gave the best results. However, variables due to the surrounding atmosphere and the substrate contribute to discrimination when the whole spectra are used, indicating this may not be the most robust model. Further iterative testing with additional validation data sets is necessary to determine the most robust model.  相似文献   

7.
This work describes a novel experimental design aimed at building a calibration set made up of samples containing different numbers of components. The algorithm iterates to keep the number of samples as low as possible and to ensure a homogeneous presence of all the concentration levels. The mixture design was applied to a drug system composed of one to four components in different combinations. The system was resolved by three multivariate UV spectrophotometric methods using principal component regression (PCR) and partial least squares (PLS1 and PLS2) algorithms. The calibration set was composed of 61 references on four concentration levels, including 15 samples for each quaternary, ternary and binary composition and 16 one-component samples. The calibration models were optimized through a careful selection of the number of factors and wavelength zones, so as to remove interferences from instrumental noise and from excipients present in the pharmaceutical formulations. The predictive power of the regression models was verified and compared by analysis of an external prediction set. The models were finally used to assay pharmaceutical specialities containing the studied drugs in formulations of one to four components.

8.
Most multivariate calibration methods, such as partial least squares (PLS) or the Tikhonov regularization variant ridge regression (RR), require selection of tuning parameters. Tuning parameter values determine the direction and magnitude of the respective model vectors, thereby setting the resultant prediction abilities of the model vectors. Simultaneously, tuning parameter values establish the corresponding bias/variance and the underlying selectivity/sensitivity tradeoffs. Selection of the final tuning parameter is often accomplished through some form of cross-validation, and the resultant root mean square error of cross-validation (RMSECV) values are evaluated. However, selection of a “good” tuning parameter with this one model evaluation merit is almost impossible. Including additional model merits assists tuning parameter selection to provide better balanced models and allows a reasonable comparison between calibration methods. Using multiple merits requires decisions to be made on how to combine and weight the merits into an information criterion. Many options are possible. Presented in this paper is the sum of ranking differences (SRD) to ensemble a collection of model evaluation merits varying across tuning parameters. It is shown that the SRD consensus ranking of model tuning parameters allows automatic selection of the final model, or a collection of models if so desired. Essentially, the user’s preference for the degree of balance between bias and variance ultimately decides the merits used in SRD and hence the tuning parameter values ranked lowest by SRD for automatic selection. The SRD process is also shown to allow simultaneous comparison of different calibration methods for a particular data set in conjunction with tuning parameter selection. Because SRD evaluates consistency across multiple merits, decisions on how to combine and weight merits are avoided. To demonstrate the utility of SRD, a near infrared spectral data set and a quantitative structure activity relationship (QSAR) data set are evaluated using PLS and RR.
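The ranking idea in the abstract above can be illustrated with a minimal sketch. It assumes a table whose rows are evaluation merits and whose columns are candidate tuning parameter values, and it uses either a supplied reference or the row mean as the consensus; the function names, the table layout and the simple tie handling are all assumptions made for illustration, not details taken from the paper.

```python
import numpy as np

def ranks(values):
    """Rank values from 1 (smallest) upward; ties are broken by position (a simplification)."""
    order = np.argsort(values, kind="stable")
    r = np.empty(len(values), dtype=int)
    r[order] = np.arange(1, len(values) + 1)
    return r

def sum_of_ranking_differences(table, reference=None):
    """Return one SRD score per column of `table`.

    table: rows = merits, columns = candidates (assumed layout).
    reference: consensus values per row; defaults to the row mean (assumption).
    """
    table = np.asarray(table, dtype=float)
    if reference is None:
        reference = table.mean(axis=1)
    ref_ranks = ranks(reference)
    # SRD = sum over rows of |rank in this column - rank in the reference|
    return np.array([np.abs(ranks(col) - ref_ranks).sum() for col in table.T])

# Toy table: 3 merits (rows) x 3 candidate tuning values (columns)
table = np.array([[1.0, 3.0, 1.0],
                  [2.0, 2.0, 3.0],
                  [3.0, 1.0, 2.0]])
scores = sum_of_ranking_differences(table, reference=[1.0, 2.0, 3.0])
print(scores)  # the candidate with the smallest score agrees best with the reference
```

In this toy case the first column reproduces the reference ordering exactly, so it receives the lowest score.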

9.
Variable scaling alters the covariance structure of data, affecting the outcome of multivariate analysis and calibration. Here we present a new method, variable stability (VAST) scaling, which weights each variable according to a metric of its stability. The beneficial effect of VAST scaling is demonstrated for a data set of 1H NMR spectra of urine acquired as part of a metabonomic study into the effects of unilateral nephrectomy in an animal model. The application of VAST scaling improved the class distinction and predictive power of partial least squares discriminant analysis (PLS-DA) models. The effects of other data scaling and pre-processing methods, such as orthogonal signal correction (OSC), were also tested. VAST scaling produced the most robust models in terms of class prediction, outperforming OSC in this aspect. As a result the subtle, but consistent, metabolic perturbation caused by unilateral nephrectomy could be accurately characterised despite the presence of much greater biological differences caused by normal physiological variation. VAST scaling presents itself as an interpretable, robust and easily implemented data treatment for the enhancement of multivariate data analysis.  相似文献   
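As a rough illustration of the weighting described above, the sketch below autoscales each variable and then multiplies it by a stability weight. The abstract does not state the metric; the weight used here (mean divided by standard deviation, the inverse coefficient of variation) is an assumption based on how VAST scaling is commonly described, and `vast_scale` is a hypothetical name.

```python
import numpy as np

def vast_scale(X):
    """Autoscale each column, then weight it by an assumed stability metric (mean / std)."""
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    std = X.std(axis=0, ddof=1)      # sample standard deviation per variable
    autoscaled = (X - mean) / std    # zero mean, unit variance
    stability = mean / std           # assumption: inverse coefficient of variation
    return autoscaled * stability

# Toy data: the second variable is far more stable (small spread relative to its mean)
X = np.array([[1.0, 10.0],
              [2.0, 10.1],
              [3.0, 9.9]])
X_vast = vast_scale(X)
print(X_vast.std(axis=0, ddof=1))    # stable variables end up with larger spread
```

After scaling, each variable's spread equals its stability weight, so stable variables carry more influence in a subsequent PLS-DA or PCA model than they would under plain autoscaling.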

10.
This paper describes a new procedure for the determination of the quinolones ciprofloxacin and sarafloxacin in chicken muscle samples. It is based on a previously developed capillary zone electrophoresis (CZE) separation, in which all the quinolones regulated by EU Council Regulation number 2377/90 could be separated. However, as ciprofloxacin and sarafloxacin coelute in the CZE run and have strongly overlapped spectra, separation between them is not possible. To overcome this problem, we have used a multivariate calibration procedure (partial least squares regression (PLS-2)), applied to the spectra obtained at the maximum of the electrophoretic peaks by using a diode array detector. The method has been validated by a combination of pure standards and fortified blank chicken muscle extracts. The recoveries obtained in the validation set were 101±6 and 93±6% for sarafloxacin and ciprofloxacin, respectively. The method has also been applied to chicken muscle samples fortified at concentration levels between 100 and 350 μg kg⁻¹, corresponding to values near the maximum residue level (MRL) regulated by the European Community.

11.
Variable (wavelength or feature) selection techniques have become a critical step for the analysis of datasets with high number of variables and relatively few samples. In this study, a novel variable selection strategy, variable combination population analysis (VCPA), was proposed. This strategy consists of two crucial procedures. First, the exponentially decreasing function (EDF), which is the simple and effective principle of ‘survival of the fittest’ from Darwin’s natural evolution theory, is employed to determine the number of variables to keep and continuously shrink the variable space. Second, in each EDF run, binary matrix sampling (BMS) strategy that gives each variable the same chance to be selected and generates different variable combinations, is used to produce a population of subsets to construct a population of sub-models. Then, model population analysis (MPA) is employed to find the variable subsets with the lower root mean squares error of cross validation (RMSECV). The frequency of each variable appearing in the best 10% sub-models is computed. The higher the frequency is, the more important the variable is. The performance of the proposed procedure was investigated using three real NIR datasets. The results indicate that VCPA is a good variable selection strategy when compared with four high performing variable selection methods: genetic algorithm–partial least squares (GA–PLS), Monte Carlo uninformative variable elimination by PLS (MC-UVE-PLS), competitive adaptive reweighted sampling (CARS) and iteratively retains informative variables (IRIV). The MATLAB source code of VCPA is available for academic research on the website: http://www.mathworks.com/matlabcentral/fileexchange/authors/498750.  相似文献   
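The two sampling ingredients named in the abstract above, EDF and BMS, can be sketched as follows. This is an illustrative reading only: the exponential form `exp(-theta * i)` with `theta` chosen so that a target number of variables remains after the last run, and all function and parameter names, are assumptions rather than the authors' MATLAB code.

```python
import numpy as np

def binary_matrix_sampling(n_subsets, n_vars, ratio, rng):
    """Build a 0/1 matrix (rows = subsets, columns = variables).

    Every column holds the same number of ones, shuffled independently,
    so each variable is selected equally often across the subsets.
    """
    n_ones = int(round(n_subsets * ratio))
    M = np.zeros((n_subsets, n_vars), dtype=int)
    M[:n_ones, :] = 1
    for j in range(n_vars):
        M[:, j] = rng.permutation(M[:, j])
    return M

def edf_fraction(i, n_runs, n_vars, n_final):
    """Fraction of variables kept after run i under an assumed exponential decay."""
    theta = np.log(n_vars / n_final) / n_runs
    return np.exp(-theta * i)

rng = np.random.default_rng(0)
M = binary_matrix_sampling(n_subsets=10, n_vars=5, ratio=0.3, rng=rng)
print(M.sum(axis=0))                      # same count for every variable
print(100 * edf_fraction(50, 50, 100, 14))  # variables left after the last run
```

Each row of `M` would define one variable subset for a sub-model; ranking the sub-models by RMSECV and counting how often each variable appears in the best ones is the step the abstract attributes to model population analysis.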

12.
O. Divya 《Talanta》2007,72(1):43-48
Synchronous fluorescence spectroscopy (SFS) is a rapid, sensitive and nondestructive method suitable for the analysis of multifluorophoric mixtures. The present study demonstrates the use of SFS and multivariate methods for the analysis of petroleum products, which are complex mixtures of multiple fluorophores. Two multivariate techniques, principal component regression (PCR) and partial least squares regression (PLSR), have been successfully applied for the classification of petrol-kerosene mixtures. Calibration models were constructed using 35 samples, and their validation was carried out with varying compositions of petrol and kerosene in the calibration range. The results showed that the method could be used for the estimation of kerosene in kerosene-mixed petrol. The model was found to be sensitive, detecting even 1% contamination of kerosene in petrol.

13.
We present a novel algorithm for linear multivariate calibration that can generate good prediction results. It rests on the idea that each test sample can be treated as a mixture of the calibration samples in proper proportion. The algorithm is based on the mixed model of samples and is therefore called the MMS algorithm. With both theoretical support and analysis of two data sets, it is demonstrated that the MMS algorithm produces lower prediction errors than the partial least squares (PLS2) model and has prediction performance similar to that of PLS1. In the background anti-interference test, the MMS algorithm performs better than PLS2. When some component information is lacking, the MMS algorithm shows better robustness than PLS2.

14.
Halide and thiocyanate ions can be determined by a precipitation titration with silver nitrate as the titrant, and the end-point can be evaluated by a potentiometric method, in which a silver electrode is generally used as the indicator electrode and a double-junction Ag–AgCl electrode as the reference electrode. However, when mixtures of halide and thiocyanate are titrated, it is difficult to determine these components individually because there are overlapping steps in the potentiometric titration curves, especially when the concentrations of the components differ markedly. In this paper, the linear equation for the potentiometric precipitation titration of a mixture of halide and thiocyanate ions was developed and then used to determine the components in the mixtures simultaneously with the aid of multivariate calibration methods. By application of this model, 27 synthetic mixtures with three- and four-component combinations of chloride, bromide, iodide and thiocyanate at low concentration levels from 1.8×10⁻⁴ to 6.2×10⁻⁴ mol l⁻¹ were analyzed, and acceptable results were obtained.

15.
Quantitative analysis with laser-induced breakdown spectroscopy traditionally employs calibration curves that are complicated by chemical matrix effects. These chemical matrix effects influence the laser-induced breakdown spectroscopy plasma and the ratio of elemental composition to elemental emission line intensity. Consequently, laser-induced breakdown spectroscopy calibration typically requires a priori knowledge of the unknown, in order for a series of calibration standards similar to the unknown to be employed. In this paper, three new Multivariate Analysis techniques are employed to analyze the laser-induced breakdown spectroscopy spectra of 18 disparate igneous and highly-metamorphosed rock samples. Partial Least Squares analysis is used to generate a calibration model from which unknown samples can be analyzed. Principal Components Analysis and Soft Independent Modeling of Class Analogy are employed to generate a model and predict the rock type of the samples. These Multivariate Analysis techniques appear to exploit the matrix effects associated with the chemistries of these 18 samples.  相似文献   

16.
Two spectrophotometric methods for the determination of ethinylestradiol (ETE) and levonorgestrel (LEV) using the multivariate calibration techniques of partial least squares (PLS) and principal component regression (PCR) are presented. In this study, PLS and PCR are successfully applied to quantify both hormones using the information contained in the absorption spectra of appropriate solutions. To do this, a calibration set of standard samples composed of different mixtures of both compounds was designed. The results found by application of the PLS and PCR methods to the simultaneous determination of mixtures containing 4–11 μg ml⁻¹ of ETE and 2–23 μg ml⁻¹ of LEV are reported. Five different oral contraceptives were analyzed, and the results were very similar to those obtained by a reference liquid chromatographic method.

17.
Electron paramagnetic resonance (EPR) spectroscopy is a powerful technique that is able to characterize radicals formed in kinetic reactions. However, spectral characterization of individual chemical species is often limited or even unmanageable due to the severe kinetic and spectral overlap among species in kinetic processes. Therefore, we applied, for the first time, multivariate curve resolution-alternating least squares (MCR-ALS) method to EPR time evolving data sets to model and characterize the different constituents in a kinetic reaction. Here we demonstrate the advantage of multivariate analysis in the investigation of radicals formed along the kinetic process of hydroxycoumarin in alkaline medium. Multiset analysis of several EPR-monitored kinetic experiments performed in different conditions revealed the individual paramagnetic centres as well as their kinetic profiles. The results obtained by MCR-ALS method demonstrate its prominent potential in analysis of EPR time evolved spectra.  相似文献   
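For readers unfamiliar with the decomposition behind MCR-ALS, the sketch below shows the bare alternating least squares loop for a bilinear model D ≈ C·S, where C holds concentration (kinetic) profiles and S holds pure spectra. It is a simplified illustration under stated assumptions: non-negativity is imposed by clipping rather than by a constrained solver, the initial guess `C0` is supplied by the user, and no closure, unimodality or multiset constraints are included. All names and the toy data are hypothetical.

```python
import numpy as np

def mcr_als(D, C0, n_iter=50):
    """Alternate least-squares updates of S and C for D ~ C @ S (clipped to be non-negative)."""
    C = np.array(C0, dtype=float)
    for _ in range(n_iter):
        # Spectra given concentrations: solve C @ S = D for S
        S = np.linalg.lstsq(C, D, rcond=None)[0]
        S = np.clip(S, 0.0, None)
        # Concentrations given spectra: solve S.T @ C.T = D.T for C.T
        C = np.linalg.lstsq(S.T, D.T, rcond=None)[0].T
        C = np.clip(C, 0.0, None)
    return C, S

# Noise-free toy data: 4 time points, 3 spectral channels, 2 species
C_true = np.array([[1.0, 0.2], [0.8, 0.5], [0.5, 0.9], [0.2, 1.0]])
S_true = np.array([[1.0, 0.5, 0.2], [0.3, 0.8, 1.0]])
D = C_true @ S_true

C, S = mcr_als(D, C0=C_true + 0.02)  # slightly perturbed initial guess
rel_residual = np.linalg.norm(D - C @ S) / np.linalg.norm(D)
print(rel_residual)
```

A small residual only shows that the data are reproduced; the resolved profiles can still differ from the true ones by rotation or scaling, which is why practical MCR-ALS work relies on additional constraints.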

18.
This paper describes an approach for the colour-based classification of RGB images, taken with a common digital CCD camera on inhomogeneous food matrices. The aim was that of elaborating a feature selection/classification method independent of the specific food matrix that is analysed, in the sense that the variables that are the most relevant ones for the classification of the analysed samples are selected in a blind way, with no a priori assumptions on the basis of the nature of the considered food matrix. A one-dimensional signal describing the colour content of each acquired digital image, which we have called colourgram, is created as the contiguous sequence of the frequency distribution curves of the three red, green and blue colours values, of related parameters (also including hue, saturation and intensity) and of the scores values deriving from the PCA analysis of the unfolded 3D image array, together with the corresponding loadings values and eigenvalues. Once a sufficient number of digital images has been acquired, the corresponding colourgrams are then analysed by means of a feature selection/classification algorithm based on the wavelet transform, wavelet packet transform for efficient pattern recognition (WPTER). This approach was tested on a series of samples of “pesto”, a typical Italian vegetable pasta sauce, which presents high colour variability, mainly due to technological variables (raw materials, processes) and to the degradation of chlorophylls during storage. Good classification results (100% of correctly classified objects with very parsimonious models) have been obtained, also in comparison with the visual evaluation results of a panel test.  相似文献   

19.
The partial least squares regression method has been applied for the simultaneous spectrophotometric determination of harmine, harmane, harmalol and harmaline in Peganum harmala L. (Zygophyllaceae) seeds. The effect of pH was optimized employing a multivariate definition of selectivity and sensitivity, and the best results were obtained in basic media (pH > 9). The calibration models were optimized for the number of latent variables by the cross-validation procedure. Determinations were made over the concentration range of 0.15-10 μg mL⁻¹. The proposed method was validated by applying it to the analysis of the β-carbolines in synthetic quaternary mixtures in media at pH 9 and 11. The relative standard errors of prediction were less than 4% in most cases. Analysis of P. harmala seeds by the proposed models for contents of the β-carboline derivatives resulted in 1.84%, 0.16%, 0.25% and 3.90% for harmine, harmane, harmaline and harmalol, respectively. The results were validated against an existing HPLC method, and no significant differences were observed between the results of the two methods.

20.
In this work, two different maximum likelihood approaches for multivariate curve resolution based on maximum likelihood principal component analysis (MLPCA) and on weighted alternating least squares (WALS) are compared with the standard multivariate curve resolution alternating least squares (MCR‐ALS) method. To illustrate this comparison, three different experimental data sets are used: the first one is an environmental aerosol source apportionment; the second is a time‐course DNA microarray, and the third one is an ultrafast absorption spectroscopy. Error structures of the first two data sets were heteroscedastic and uncorrelated, and the difference between them was in the existence of missing values in the second case. In the third data set about ultrafast spectroscopy, error correlation between the values at different wavelengths is present. The obtained results confirmed that the resolved component profiles obtained by MLPCA‐MCR‐ALS are practically identical to those obtained by MCR‐WALS and that they can differ from those resolved by ordinary MCR‐ALS, especially in the case of high noise. It is shown that methods that incorporate uncertainty estimations (such as MLPCA‐ALS and MCR‐WALS) can provide more reliable results and better estimated parameters than unweighted approaches (such as MCR‐ALS) in the case of the presence of high amounts of noise. The possible advantage of using MLPCA‐MCR‐ALS over MCR‐WALS is then that the former does not require changing the traditional MCR‐ALS algorithm because MLPCA is only used as a preliminary data pretreatment before MCR analysis. Copyright © 2013 John Wiley & Sons, Ltd.  相似文献   
