Similar literature
 Found 20 similar articles; search time 15 ms
1.
Median absolute deviation (MAD) is a well‐established statistical method for determining outliers. This simple statistic can be used to determine the number of principal factors responsible for a data matrix by direct application to the residual standard deviation (RSD) obtained from principal component analysis (PCA). Unlike many other popular methods, the proposed method, called determination of rank by MAD (DRMAD), does not involve pseudo degrees of freedom, pseudo F‐tests, extensive calibration tables, time‐consuming iterations, or empirical procedures. The method does not require strict adherence to normal distributions of experimental uncertainties. The computations are direct, simple to use and extremely fast, making them well suited to online data processing. The results obtained using various sets of chemical data previously reported in the chemical literature agree with the earlier work. Limitations of the method, determined from model data, are discussed. An algorithm, written in MATLAB format, is presented in the Appendix. Copyright © 2008 John Wiley & Sons, Ltd.
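The MAD statistic mentioned in this abstract can be illustrated with a generic outlier test. The sketch below is not the DRMAD algorithm from the paper's Appendix: the function name `mad_outliers`, the 0.6745 scaling constant and the 3.5 cutoff are assumptions taken from the commonly used modified z-score rule, and applying it to a vector of RSD values is only one plausible way to use it.

```python
import numpy as np

def mad_outliers(values, threshold=3.5):
    """Flag entries whose modified z-score exceeds `threshold` (hypothetical helper)."""
    x = np.asarray(values, dtype=float)
    med = np.median(x)
    # Median absolute deviation from the median
    mad = np.median(np.abs(x - med))
    if mad == 0:
        # No spread around the median: nothing can be flagged by this rule
        return np.zeros(x.shape, dtype=bool)
    # 0.6745 makes the MAD comparable to a standard deviation for normal data
    z = 0.6745 * (x - med) / mad
    return np.abs(z) > threshold

# Example: one value stands well apart from the rest
flags = mad_outliers([1.0, 1.1, 0.9, 1.05, 0.95, 10.0])
```

Because the median and MAD are barely affected by a few extreme values, the rule does not need the data to be normally distributed, which is in keeping with the claim made in the abstract.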

2.
In this work, two different maximum likelihood approaches for multivariate curve resolution based on maximum likelihood principal component analysis (MLPCA) and on weighted alternating least squares (WALS) are compared with the standard multivariate curve resolution alternating least squares (MCR‐ALS) method. To illustrate this comparison, three different experimental data sets are used: the first is an environmental aerosol source apportionment data set, the second is a time‐course DNA microarray, and the third is an ultrafast absorption spectroscopy data set. Error structures of the first two data sets were heteroscedastic and uncorrelated, and the difference between them was the existence of missing values in the second case. In the third data set, from ultrafast spectroscopy, error correlation between the values at different wavelengths is present. The results confirmed that the resolved component profiles obtained by MLPCA‐MCR‐ALS are practically identical to those obtained by MCR‐WALS and that they can differ from those resolved by ordinary MCR‐ALS, especially in the case of high noise. It is shown that methods that incorporate uncertainty estimations (such as MLPCA‐ALS and MCR‐WALS) can provide more reliable results and better estimated parameters than unweighted approaches (such as MCR‐ALS) when noise levels are high. The possible advantage of using MLPCA‐MCR‐ALS over MCR‐WALS is that the former does not require changing the traditional MCR‐ALS algorithm, because MLPCA is only used as a preliminary data pretreatment before MCR analysis. Copyright © 2013 John Wiley & Sons, Ltd.

3.
A procedure for the selection of wavelength pairs as part of the ternary H-point standard addition method (t-HPSAM) is presented. The t-HPSAM is employed for simultaneous determination of Cu(II) and Zn(II) using murexide as chromogenic reagent. The procedure is based on principal component analysis (PCA), and its advantage is the ability to isolate the analyte signal, even in the presence of apparent nonlinearity. In the present system, non-quantitative complex formation of one of the metal ions with the ligand was the cause of the apparent nonlinearity. In this nonlinear system, selection of the most appropriate pair of variables became possible by means of proper rotation of score and loading plots from PCA. The reliability of the proposed procedure was evaluated using model data. The total relative standard error (RSE) for applying t-HPSAM (coupled with the proposed selection method) to 15 samples in the ranges of 0.00–40.00 µM Cu(II) and 0.00–16.00 µM Zn(II) was 2.56% for Cu and 6.54% for Zn.

4.
An algorithm called automatic window factor analysis (AUTOWFA) is developed for the purpose of determining, efficiently and automatically, the concentration profiles of the spectroscopically active components present in evolutionary processes, such as chemical titration, chromatography and kinetics. The method not only yields windows and profiles in agreement with those reported in the literature, but also reveals components not detected by precursor techniques. The method, however, has not been optimized and may require user interaction to fine-tune the windows.

5.
The nature and number of species in Yb(III)-daunomycin solutions have been investigated by graphical and computational methods, both of which are based on matrix rank analyses. Determination of the rank of an absorbance matrix, which is the order of the largest non-zero determinant of the matrix, permits the determination of the number of species in solution. The multiple regression analyses performed on the Yb(III)-daunomycin system enabled the determination of the nature of the species that exist at different metal to ligand ratios: ML2 at ratios less than 1.5, M2L at ratios greater than 15, and probably predominantly ML in between the two ratios. The matrix rank analysis approach was found to be suited to bracketing the range of concentration ratios where a particular species predominates. While this is true by and large, it appears that the adequacy of the methods depends on the degree of complexity of the system.
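The link between matrix rank and the number of absorbing species can be shown with a small numerical sketch. It assumes a noise-free Beer-Lambert model (absorbance = concentrations × molar absorptivities) with made-up values; the function name `count_species` and the relative tolerance are hypothetical, and the rank is estimated from singular values rather than from determinants as in the abstract.

```python
import numpy as np

def count_species(absorbance, rel_tol=1e-8):
    """Estimate the rank as the number of singular values above rel_tol * largest."""
    s = np.linalg.svd(np.asarray(absorbance, dtype=float), compute_uv=False)
    return int(np.sum(s > rel_tol * s[0]))

# Made-up concentration profiles: 5 solutions, 2 species
conc = np.array([[1.0, 0.0],
                 [0.8, 0.2],
                 [0.5, 0.5],
                 [0.2, 0.8],
                 [0.0, 1.0]])
# Made-up molar absorptivities: 2 species, 4 wavelengths
eps = np.array([[0.1, 0.5, 0.9, 0.3],
                [0.7, 0.2, 0.1, 0.6]])

absorbance = conc @ eps          # 5 x 4 matrix built from 2 species
n_species = count_species(absorbance)
```

With real spectra the tolerance has to reflect the measurement noise, since noise makes every singular value non-zero.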

6.
Multivariate methods, such as principal component analysis (PCA) and multivariate curve resolution (MCR), are often employed to aid the analysis of large complex data sets such as time‐of‐flight secondary ion mass spectrometry (ToF‐SIMS) images. There is, however, much confusion over the most appropriate choice of method for any given application and the effects of data preprocessing, which is exacerbated by the confusing terminologies and the use of jargon in this field. In the present study, a simple model system consisting of a ToF‐SIMS image of an immiscible polymer blend is used to evaluate PCA and MCR in the accurate identification, localisation and quantification of the phase‐separated polymer domains, using four data preprocessing methods (no scaling, normalisation, variance scaling and Poisson scaling). This highlights significant issues and challenges in the quantitative multivariate analysis of mixed organic systems, including the discrimination of chemically significant features from experimental noise, the resolution of weak chemical contributions and potential bias introduced by data preprocessing. Multivariate analysis using Poisson scaling, identified as the most suitable data preprocessing method for both PCA and MCR, demonstrates a marked improvement upon traditional (manual) analysis and provides valuable additional information that is difficult to detect using traditional analysis. Using these results, we present recommendations for the optimum use of multivariate analysis by analysts and provide guidance on selecting the most appropriate methods. Confusing terminology is also clarified. © Crown copyright 2008. Reproduced with the permission of Her Majesty's Stationery Office. Published by John Wiley & Sons, Ltd.

7.
The present state of the minimum assumption multivariate component resolution theory is outlined. Some new developments are presented: limiting function domains; the analytical expression for the limiting function; efficient algorithms for defining the FIRPOL and INNPOL hyperpolyhedrons. A very low resolution data set is analyzed.

8.
Sârbu C, Pop HF. Talanta, 2005, 65(5): 1215-1220
Principal component analysis (PCA) is a favorite tool in environmetrics for data compression and information extraction. PCA finds linear combinations of the original measurement variables that describe the significant variations in the data. However, it is well known that PCA, as with any other multivariate statistical method, is sensitive to outliers, missing data, and poor linear correlation between variables due to poorly distributed variables. As a result, data transformations have a large impact upon PCA. In this regard, one of the most powerful approaches to improving PCA appears to be the fuzzification of the matrix data, thus diminishing the influence of the outliers. In this paper we discuss and apply a robust fuzzy PCA algorithm (FPCA). The efficiency of the new algorithm is illustrated on a data set concerning the water quality of the Danube River for a period of 11 consecutive years. Considering, for example, a two-component model, FPCA accounts for 91.7% of the total variance and PCA accounts for only 39.8%. Moreover, PCA showed only a partial separation of the variables and no separation of scores (samples) onto the plane described by the first two principal components, whereas a much sharper differentiation of the variables and scores is observed when FPCA is applied.

9.
Maximum likelihood principal component analysis (MLPCA) was originally proposed to incorporate measurement error variance information in principal component analysis (PCA) models. MLPCA can be used to fit PCA models in the presence of missing data, simply by assigning very large variances to the non‐measured values. An assessment of maximum likelihood missing data imputation is performed in this paper, analysing the algorithm of MLPCA and adapting several methods for PCA model building with missing data to its maximum likelihood version. In this way, known data regression (KDR), KDR with principal component regression (PCR), KDR with partial least squares regression (PLS) and trimmed scores regression (TSR) methods are implemented within the MLPCA method to work as different imputation steps. Six data sets are analysed using several percentages of missing data, comparing the performance of the original algorithm, and its adapted regression‐based methods, with other state‐of‐the‐art methods. Copyright © 2016 John Wiley & Sons, Ltd.

10.
Elements found in the edible parts of plants are considered to be the main source of nutrients for humans and animals. However, there is insufficient information on the relationship between heavy metal pollution in the growing soil of most edible plants. In this study, the distribution of elements in the edible forest nettle (Laportea alatipes) was evaluated as a function of geographical location. Forest land soils had higher concentrations of minor elements (Cu, Cr, Ni, and Zn) compared to soils from rural and suburban areas. Translocation factors for Cd and Pb showed effective translocation from the roots to the leaves; however, these heavy metals in leaves were still above South African maximum permissible levels for vegetables. Atmospheric depositions may play a significant role in higher Cd and Pb concentrations in the leaves. Bioaccumulation factors showed the plant to accumulate Cu, Mn, and Zn to meet physiological requirement levels. Geoaccumulation indices and enrichment factors showed no soil contamination or minimal enrichment by trace metals. Principal component analysis showed Co, Cr, Cu, Fe, Ni, Pb, and Zn in soil to originate from a common source which may be soil silicates and other minerals.

11.
Edible oils are valuable sources of nutrients, and their classification is necessary to ensure high quality, which is essential to food safety. This study reports the establishment of a rapid and straightforward SALDI-TOF MS platform used to detect triacylglycerol (TAG) in various edible oils. Silver nanoplates (AgNPts) were used to optimize the SALDI samples for high sensitivity and reproducibility of TAG signals. TAG fingerprints were combined with multivariate statistics to identify the critical features for edible oil discrimination. Eleven different edible oils were discriminated using principal component analysis (PCA). The results suggest a robust platform that can be used to examine food adulteration and food fraud, potentially ensuring high-quality foods and agricultural products.

12.
A data‐based monitoring scheme is proposed to detect decomposition in low density polyethylene reactors by combining principal component analysis with a priori information on the heat balance equations around the reactor. During normal operating conditions, the heat balance equation should close at all times within reasonable limits. If excess heat is generated in the reactor, the heat balance closure error will exceed a user‐specified threshold limit to indicate the possible onset of decomposition. However, since the precise information required to formulate the exact energy balance equations was not available, principal component analysis (PCA) was used as a model identification tool. Results from a number of decomposition case studies from an industrial low density polyethylene/ethylene vinyl acetate autoclave reactor indicate that the method was able to detect the onset of decomposition with reasonable lead time.


13.
Computational and experimental approaches were adopted to utilize a chromophore diglycolic functionalized fluorescein derivative as a Ca2+ receptor. Fluorescein diglycolic acid (Fl-DGA, 1) was synthesized and used in multivariate determination of Ca2+ and K+. Full-structure computation shows that the complexes of 1 and Ca2+ have comparable energies regardless of additional interaction with the lactone moiety. The initial formation of a diglycolic-Ca2+ complex followed by macrocyclization is thermodynamically disfavored. A U-shaped pre-organized 1 allows Ca2+ to interact simultaneously with diglycolic and lactone motifs. Both motifs actively participate in Ca2+ recognition, and the eleven methylene units in the undecyl arm provide excellent flexibility for reorganization and optimum interaction. Principal component analysis (PCA) of computational molecular properties reveals a simple method for evaluating motifs for cation recognition. Fragment models support full-structure results that negative charge causes significant structural changes, but do not reproduce the full extent of C-O bond breaking observed in the latter. Experimental optical responses show that 1 is selective towards Ca2+ and discriminates against K+ and Mg2+. PCA of emission intensities affords distinct clusters of 0.01, 0.1 and 1 mM Ca2+ and K+, and suggests applicability of this technique for simultaneous determination of cationic plant macronutrients in precision agriculture and a wide variety of other applications.

14.
Cistus is a plant that has been used in natural medicine for hundreds of years; it works primarily as an antioxidant and cleansing agent. Cistus × incanus leaves or herb can be an attractive source of polyphenols and flavonoids. The official protocols for active compound analysis rely on the extraction of compounds of interest from plant matter, which makes their determination long and costly. An analysis of plant material in its native state can be performed using vibrational spectroscopy. This paper presents a comparison of Raman spectroscopy, attenuated total reflection in the mid-infrared and the diffuse reflectance technique in the near-infrared region for the simultaneous quantification of total polyphenols (TPC) and flavonoids (TF) content, as well as the determination of FRAP antioxidant activity of C. incanus material. Utilizing vibrational spectra and the partial least squares algorithm, TPC and TF were quantified with RSEPVAL errors in the 2.7–5.4% range, while FRAP antioxidant activity for validation sets was determined with relative errors ranging from 5.2 to 9.3%. For the analyzed parameters, the lowest prediction errors were computed for models constructed using Raman data. The developed models allow for fast and precise quantification of the studied active compounds in C. incanus material without any chemical sample treatment.

15.
Pulsed laser‐induced autofluorescence spectra of pathologically certified normal and malignant colonic mucosal tissues were recorded at 325 nm excitation. The spectra were analysed using three different methods for discrimination purposes. First, all the spectra were subjected to principal component analysis (PCA), and discrimination between normal and malignant cases was achieved using parameters such as spectral residuals, Mahalanobis distance and scores of factors. Second, to understand the changes in tissue composition between the two classes (normal and malignant), a difference spectrum was constructed by subtracting the mean spectrum of the calibration set samples from the simulated mean of all spectra of any one class (normal/malignant). Third, artificial neural network (ANN) analysis was carried out on the same set of spectral data by training the network with spectral features such as mean, median, spectral residual, energy, standard deviation and number of peaks for different thresholds (100, 250 and 500) after 1st‐order differentiation of the training set samples, and discrimination between normal and malignant conditions was achieved. The specificity and sensitivity were determined in the PCA and ANN analyses and were found to be 100 and 91.3% in PCA, and 100 and 93.47% in ANN, respectively. Copyright © 2008 John Wiley & Sons, Ltd.

16.
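Thirty structurally diverse caM inhibitor molecules were selected as the data set, and multiple linear regression (MLR) and principal component regression (PCA) were applied to 194 molecular parameters of each compound to build the respective optimal predictive models. Comparison of the MLR model with the principal component regression model showed that stepwise selection was the best modelling approach. The model built with this approach gave good statistics (R2 = 0.952, SEE = 0.289) and satisfactory results on the test set (R2 = 0.941, SEP = 0.295), indicating strong reliability and predictive ability.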

17.
Recent years have seen the introduction of many surface characterization instruments and other spectral imaging systems that are capable of generating data in truly prodigious quantities. The challenge faced by the analyst, then, is to extract the essential chemical information from this overwhelming volume of spectral data. Multivariate statistical techniques such as principal component analysis (PCA) and other forms of factor analysis promise to be among the most important and powerful tools for accomplishing this task. In order to benefit fully from multivariate methods, the nature of the noise specific to each measurement technique must be taken into account. For spectroscopic techniques that rely upon counting particles (photons, electrons, etc.), the observed noise is typically dominated by ‘counting statistics’ and is Poisson in nature. This implies that the absolute uncertainty in any given data point is not constant; rather, it increases with the number of counts represented by that point. Performing PCA, for instance, directly on the raw data leads to less than satisfactory results in such cases. This paper will present a simple method for weighting the data to account for Poisson noise. Using a simple time‐of‐flight secondary ion mass spectrometry spectrum image as an example, it will be demonstrated that PCA, when applied to the weighted data, leads to results that are more interpretable, provide greater noise rejection and are more robust than standard PCA. The weighting presented here is also shown to be an optimal approach to scaling data as a pretreatment prior to multivariate statistical analysis. Published in 2004 by John Wiley & Sons, Ltd.
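The abstract does not spell out the weighting, so the following is only a sketch of one commonly used form of Poisson scaling, assumed here: each element of a pixels × channels count matrix is divided by the square root of its row mean (mean image) and of its column mean (mean spectrum). The function name `poisson_scale` and the small floor `eps` are hypothetical.

```python
import numpy as np

def poisson_scale(counts, eps=1e-12):
    """Scale a (pixels x channels) count matrix by sqrt of row and column means."""
    d = np.asarray(counts, dtype=float)
    # Square root of the mean image (one value per pixel)
    g = np.sqrt(np.maximum(d.mean(axis=1, keepdims=True), eps))
    # Square root of the mean spectrum (one value per channel)
    h = np.sqrt(np.maximum(d.mean(axis=0, keepdims=True), eps))
    return d / (g * h), g, h

# Tiny made-up example: 2 pixels, 2 channels
raw = np.array([[4.0, 0.0],
                [0.0, 4.0]])
scaled, g, h = poisson_scale(raw)
```

PCA would then be run on `scaled`, and the resulting scores and loadings multiplied back by `g` and `h` respectively to return to count units. The floor `eps` only guards against division by zero for empty rows or columns.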

18.
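In this work, high-performance liquid chromatography was used to separate 16 rare earth elements on a C18 reversed-phase column, with 0.25 mol/L lactic acid as the mobile phase and a pH gradient. Except for the two pairs Dy-Y and Gd-Eu, which were not completely resolved, all rare earths were eluted and separated within 55 min. The fully resolved rare earth elements were determined simultaneously by the internal standard method, while the two pairs Dy-Y and Gd-Eu were quantified by rank annihilation factor analysis, with satisfactory results.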

19.
400‐MHz 1H nuclear magnetic resonance (NMR) spectroscopy and multivariate data analysis were used in the context of food surveillance to discriminate 46 authentic rice samples according to type. It was found that the optimal sample preparation consists of preparing aqueous rice extracts at pH 1.9. For the first time, the chemometric method independent component analysis (ICA) was applied to differentiate clusters of rice of the same type (Basmati, non‐Basmati long‐grain rice, and round‐grain rice) and, to a certain extent, their geographical origin. ICA was found to be superior to classical principal component analysis (PCA) regarding the verification of rice authenticity. The chemical shifts of the principal saccharides and acetic acid were found to be mostly responsible for the observed clustering. Among classification methods (linear discriminant analysis, factorial discriminant analysis, partial least squares discriminant analysis (PLS‐DA), soft independent modeling of class analogy, and ICA), PLS‐DA and ICA gave the best values of specificity (0.96 for both methods) and sensitivity (0.94 for PLS‐DA and 1.0 for ICA). Hence, NMR spectroscopy combined with chemometrics could be used as a screening method in the official control of rice samples. Copyright © 2013 John Wiley & Sons, Ltd.

20.
Principal component analysis (PCA) and other multivariate analysis methods have been used increasingly to analyse and understand depth profiles in X‐ray photoelectron spectroscopy (XPS), Auger electron spectroscopy (AES) and secondary ion mass spectrometry (SIMS). These methods have proved as useful in fundamental studies as in applied work, where speed of interpretation is very valuable. Until now these methods have been difficult to apply to very large datasets such as spectra associated with 2D images or 3D depth‐profiles. Existing algorithms for computing PCA matrices have either been too slow or demanded more memory than is available on desktop PCs. This often forces analysts to ‘bin’ spectra on a much coarser grid than they would like, perhaps even to unity mass bins even though much higher resolution is available, or to select only part of an image for PCA, even though PCA of the full data would be preferred. We apply the new ‘random vectors’ method of singular value decomposition proposed by Halko and co‐authors to time‐of‐flight (ToF)SIMS data for the first time. This increases the speed of calculation by a factor of several hundred, making PCA of these datasets practical on desktop PCs for the first time. For large images or 3D depth profiles we have implemented a version of this algorithm which minimises memory needs, so that even datasets too large to store in memory can be processed into PCA results on an ordinary PC with a few gigabytes of memory in a few hours. We present results from ToFSIMS imaging of a citrate crystal and a basalt rock sample, the largest of which is 134 GB in file size, corresponding to 67 111 mass values at each of 512 × 512 pixels. This was processed into 100 PCA components in six hours on a conventional Windows desktop PC. © 2015 The Authors. Surface and Interface Analysis published by John Wiley & Sons Ltd.
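The ‘random vectors’ idea can be sketched as a basic in-memory randomized SVD: project the data onto a few random vectors, orthonormalise the result, and take an exact SVD of the much smaller projected matrix. This is a generic textbook-style sketch, not the authors' memory-minimising implementation; the function name `randomized_svd` and the parameters `oversample`, `n_iter` and `seed` are assumptions.

```python
import numpy as np

def randomized_svd(a, k, oversample=10, n_iter=2, seed=0):
    """Approximate the top-k singular triplets of `a` using random projections."""
    a = np.asarray(a, dtype=float)
    rng = np.random.default_rng(seed)
    n_cols = a.shape[1]
    n_vec = min(n_cols, k + oversample)
    # Sample the column space of `a` with Gaussian random vectors
    y = a @ rng.standard_normal((n_cols, n_vec))
    for _ in range(n_iter):
        # Power iterations sharpen the subspace; re-orthonormalise for stability
        q = np.linalg.qr(y)[0]
        y = a @ (a.T @ q)
    q = np.linalg.qr(y)[0]
    # Exact SVD of the small projected matrix
    b = q.T @ a
    u_small, s, vt = np.linalg.svd(b, full_matrices=False)
    return (q @ u_small)[:, :k], s[:k], vt[:k]

# Made-up rank-3 test matrix (50 spectra x 40 channels)
rng = np.random.default_rng(1)
data = rng.standard_normal((50, 3)) @ rng.standard_normal((3, 40))
u, s, vt = randomized_svd(data, k=3)
```

For PCA the columns of the data would normally be mean-centred first; the scores are then `u * s` and the loadings are the rows of `vt`. The speed-up comes from replacing one decomposition of the full matrix with a few matrix products and a decomposition of a matrix with only `k + oversample` rows.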
