Similar literature
1.
The present paper deals with the application of classical and fuzzy principal components analysis (PCA) to a large data set from coastal sediment analysis. Altogether 126 sampling sites on the Atlantic Coast of the USA are considered, and 16 chemical parameters are measured at each site. Four latent factors are found to be responsible for the data structure (“natural”, “anthropogenic”, “bioorganic”, and “organic anthropogenic”). Additionally, the scatter plots of the factor scores reveal the similarity between the sampling sites. Geographical and urban factors are found to contribute to the chemical composition of the sediment. Fuzzy PCA is shown to improve data interpretation, especially in the presence of outliers.

2.
This study deals with the application of chemometric approaches (cluster analysis and principal components analysis) to potable water monitoring, demonstrated on a data set from the region of Kavala, Greece, analysed according to the standard instructions and directives of the European Union. Data classification by cluster analysis and data structure modeling by principal components analysis give similar results: four different patterns of water source sites are identified, depending on the geographical location of the site (near the Nestos river, near the Strimon river, elevated sites and near-to-coast sites). Three latent factors, explaining over 85% of the total variance, are responsible for the data structure: “water acidity (anthropogenic)”, “water hardness (natural)” and the “marine factor”. Their importance for the different sites is related to the site location. Finally, it is recommended that environmetric data treatment be adopted as a standard procedure in assessing the quality of water intended for human consumption. Received October 18, 2001; accepted June 24, 2002

3.
The environmetric analysis indicates that a short-term water quality survey can give important information on the latent factors influencing the water quality of the Yantra river basin. Principal components analysis reveals that at least four principal components are necessary for multivariate statistical modeling of the water quality: a “mixed” factor that combines natural and anthropogenic influences and reflects parameters such as water hardness, marine influence and organic pollution; an “anthropogenic” factor that explains the metal contamination of the river water; an “N-containing wastes” factor formed by everyday wastes, usually N-containing pollutants such as nitrates, nitrites or ammonia; and a “temperature” factor formed by typical physical parameters such as water and air temperature. These features of the Yantra basin waters are confirmed by cluster analysis of the variables, in which the content of the significant clusters is the same as the content of the principal components, which model over 75% of the total variance of the system. Additionally, cluster analysis of the objects shows that the water quality during both sampling traverses is very stable and reproducible. A few exceptions are observed, due to momentary local pollution in an industrial area along the river. Comparison with standard requirements for water quality indicates that the Yantra river waters are of high quality and could be used as potable water sources after minor pretreatment. The environmetric approaches applied reveal specific information about the river water quality. The problem treated is therefore not only of local importance but also suggests a strategy for assessing similar ecosystems elsewhere. Received July 30, 1998. Revision June 1, 1999.

4.
Multivariate statistical analysis of sediment data (information matrix 123 × 16) from the Gulf of Mexico, USA shows that the data structure is defined by four latent factors, conditionally called “inorganic natural”, “inorganic anthropogenic”, “bioorganic” and “organic anthropogenic”, explaining 39.24%, 23.17%, 10.77% and 10.67% of the total variance of the data system, respectively. The receptor model obtained by applying the PCR approach makes it possible to apportion the contribution of each chemical component to the formation of each latent factor. The contribution of each chemical parameter is separated into the “natural” and “anthropogenic” origin of the respective heavy metal or organic matter in the sediment formation process. This is a new approach compared with the traditional “one dimensional” search based on a limited number of preselected tracer components. The suggested model separates natural from anthropogenic influences and thus allows each participant in the sediment formation process to be used as a marker of either natural or anthropogenic effects. Received: 20 March 1999 / Revised: 1 June 1999 / Accepted: 3 June 1999
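The four variance shares quoted above can be added up to see how much of the data system the four-factor model covers. The sketch below does only that arithmetic on the numbers from the abstract; the function name `cumulative_shares` and the variable names are illustrative and do not come from the paper.

```python
# Explained-variance shares quoted in the abstract (percent of total variance).
explained = {
    "inorganic natural": 39.24,
    "inorganic anthropogenic": 23.17,
    "bioorganic": 10.77,
    "organic anthropogenic": 10.67,
}


def cumulative_shares(shares):
    """Return (factor name, share, running total) rows in the given order."""
    running = 0.0
    rows = []
    for name, share in shares.items():
        running += share
        rows.append((name, share, round(running, 2)))
    return rows


rows = cumulative_shares(explained)
total = rows[-1][2]                      # variance covered by all four factors
unexplained = round(100.0 - total, 2)    # variance left outside the model

for name, share, running in rows:
    print(f"{name:<26}{share:>7.2f}{running:>8.2f}")
print(f"unexplained: {unexplained:.2f}")
```

Under these figures the four factors together cover a little under 84% of the total variance, leaving roughly 16% outside the model.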

6.
The implementation of the sustainable development rule is tested by applying chemometrics in the field of environmental pollution. A data set consisting of the Cd, Pb, Cr, Zn, Cu, Mn, Ni, and Fe content of bottom sediment samples collected in the Odra River (Germany/Poland) is treated using cluster analysis (CA), principal component analysis (PCA), and source apportionment techniques. Cluster analysis clearly shows that pollution on the German bank is higher than on the Polish bank. Two latent factors extracted by PCA explain over 88% of the total variance of the system, allowing identification of the dominant “semi-natural” and “anthropogenic” pollution sources in the river ecosystem. The complexity of the system is demonstrated by multiple linear regression (MLR) analysis of the absolute principal component scores (APCS). The apportioning clearly shows that Cd, Pb, Cr, Zn and Cu participate in an “anthropogenic” source profile, whereas Fe and Mn are “semi-natural”. Multiple regression analysis indicates that the amounts of particular elements not described by the model vary from 4.2% (Mn) to 13.1% (Cr). The element Ni participates to some extent in each source and is therefore neither purely “semi-natural” nor purely “anthropogenic”. Apportioning indicates that the total heavy metal pollution in the investigated river reach is 12510.45 mg·kg⁻¹. The contribution of pollutants originating from “anthropogenic” sources is 9.04% and from “semi-natural” sources is 86.53%.
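The closing figures can be turned into per-source amounts with a few lines of arithmetic. The sketch below assumes that both percentages are shares of the reported total of 12510.45 mg·kg⁻¹ and that whatever remains is the part the model does not describe; the abstract does not state this explicitly, and the names `apportion`, `amounts` and `unexplained_percent` are illustrative only.

```python
# Figures quoted in the abstract.
TOTAL_MG_PER_KG = 12510.45
SHARES_PERCENT = {"anthropogenic": 9.04, "semi-natural": 86.53}


def apportion(total, shares_percent):
    """Split a total into per-source amounts plus the unexplained remainder.

    Assumption: each percentage is a share of the same total.
    """
    amounts = {name: total * pct / 100.0 for name, pct in shares_percent.items()}
    amounts["unexplained"] = total - sum(amounts.values())
    return amounts


amounts = apportion(TOTAL_MG_PER_KG, SHARES_PERCENT)
unexplained_percent = 100.0 - sum(SHARES_PERCENT.values())

for source, value in amounts.items():
    print(f"{source:<14}{value:>12.2f} mg/kg")
print(f"unexplained share: {unexplained_percent:.2f} %")
```

Read this way, the two sources account for about 95.6% of the total, which is in line with the abstract's remark that a few percent of each element is not described by the model.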

7.
Multivariate statistical assessment of polluted soils (total citations: 9; self-citations: 0; citations by others: 9)
This study deals with the application of several multivariate statistical methods (cluster analysis, principal components analysis, multiple regression on absolute principal components scores) to the assessment of soil pollution by heavy metals. The sampling was performed in a heavily polluted region, and the chemometric analysis revealed four latent factors responsible for the data structure, which describe 84.5% of the total variance of the system. These factors, whose identity was also confirmed by cluster analysis, were conditionally named the “ore specific”, “metal industrial”, “cement industrial”, and “steel production” factors. Further, the contribution of each identified factor to the total pollution of the soil by each metal pollutant under consideration was determined.

8.
The ability of multivariate analysis methods such as hierarchical cluster analysis, principal component analysis and partial least squares-discriminant analysis (PLS-DA) to classify olive oils by olive fruit variety from their triacylglycerol profiles has been investigated. The variations in the raw chromatographic data sets of 56 olive oil samples were studied by high-temperature gas chromatography with (ion trap) mass spectrometry detection. The olive oil samples belonged to four different categories (“extra-virgin olive oil”, “virgin olive oil”, “olive oil” and “olive-pomace oil”), and the “extra-virgin” category comprised six well-identified olive varieties (“hojiblanca”, “manzanilla”, “picual”, “cornicabra”, “arbequina” and “frantoio”) and some blends of unidentified varieties. Moreover, chemometric pre-processing methods (used to linearise the response of the variables) such as peak-shifting, baseline correction (weighted least squares) and mean centering made it possible to improve the model and the grouping of the different olive oil varieties. The first three principal components accounted for 79.50% of the information in the original data. The fitted PLS-DA model succeeded in classifying the samples. Correct classification rates were assessed by cross-validation.

9.
The present paper deals with the chemometric interpretation of soil analysis data collected from 31 sampling sites in the region of Kavala and Drama, Northern Greece. The determination of 16 different chemical and physicochemical characteristics is needed principally for planning land treatment and fertilizing. The study indicates that multivariate statistical approaches can reveal new and specific information about the sampling sites. The sites can be divided into four general patterns: pattern 1 contains predominantly inorganic and alkaline soil samples from semi-mountainous regions in close proximity to the seacoast; pattern 2 has the same soil type and regional location as pattern 1 but lies far from the coastline; pattern 3 includes samples from sites on the plains with organic and alkaline soils in close proximity to the coast; pattern 4 resembles pattern 3 in soil type but involves samples from sites far from the shore. Further, six latent factors were identified, conditionally named “structural”, “acidic”, “nutritional”, “salt”, “microcomponents” and “organic”. Finally, an apportioning procedure was carried out to find the source contributions to the measured analytical values. In this way the routine estimation of soil quality could be improved.

10.
This work explores a novel method for rearranging first-order (one-way) infrared (IR) and/or near-infrared (NIR) spectra into a representation suitable for multi-way modelling and analysis. The method is based on the fact that the fundamental IR absorption and the first, second, and consecutive overtones of NIR absorptions represent identical chemical information. It is therefore possible to rearrange the overtone regions of the vectors comprising an IR and NIR spectrum into a matrix in which the fundamental, first, second, and consecutive overtones of the spectrum are arranged as either rows or columns, resulting in a true three-way tensor of data for several samples. This tensorization facilitates explorative analysis and modelling with multi-way methods, for example parallel factor analysis (PARAFAC), N-way partial least squares (N-PLS), and Tucker models. The vibrational overtone combination spectroscopy (VOCSY) arrangement is shown to benefit from the “order advantage”, producing more robust, stable, and interpretable models than, for example, the traditional PLS modelling method. The proposed method also opens the field of NIR to true peak decomposition, a feature unique to the method because the latent factors acquired using PARAFAC can represent pure spectral components, whereas latent factors in principal component analysis (PCA) and PLS usually do not.

11.
Multivariate Statistical Assessment of Air Quality: A Case Study (total citations: 1; self-citations: 0; citations by others: 1)
The present paper deals with the application of several chemometric methods (cluster and principal components analysis, source apportioning on absolute principal components scores) to an aerosol data collection from Unterloibach, Austria. Seven latent factors, explaining almost 80% of the total variance, are shown to be responsible for the data structure and are conditionally identified as secondary aerosol, mineral dust, oil burning, lead smelter, coal burning, salt and fertilizer emission sources. Furthermore, the contribution of each identified source to the total particle mass and to the total concentration of each chemical compound is calculated. Thus, a reliable assessment of the air quality in the region is performed. The requirements of the sustainability concept for ecological indicators are in this case easily transformed into a multivariate statistical problem that takes into account not separate indicators but the specific multivariate nature of aerosol pollution.

12.
Multivariate statistical analysis of sediment data (input matrix 122 × 15) collected from 122 sampling sites on the western coastline of the USA and analyzed for 15 analytes indicates that the data structure can be explained by four latent factors. These factors are conditionally named “anthropogenic”, “organic”, “natural”, and “hot spots”. They explain over 85% of the total variance of the data system, which is an acceptable value for the PCA model. The receptor models obtained by regressing the mass on the absolute principal components scores ensure reliable estimation of the contribution of each possible natural or anthropogenic source to the mass of each chemical component. It can be concluded that the region of interest shows a different pattern of pollution from that of the eastern coastline, which was treated statistically in a previous study.

14.
Recently, a basis-set-superposition-error-free second-order perturbation theory was introduced, based on the “chemical Hamiltonian approach”, which provides the full antisymmetry of all wave functions by using second quantization. Subsequently, the “Heitler–London” interaction energy, corresponding to the sum of the zero- and first-order perturbational energy terms, was decomposed into different physically meaningful components, such as electrostatic, exchange and overlap effects. The first-order wave function obtained in the framework of this perturbation theory also consists of terms with clear physical significance: intramolecular correlation, polarization, charge transfer, dispersion and combined polarization–charge transfer excitations. The second-order energy, however, is not a simple sum of the respective contributions, owing to the intermolecular overlap. Here we propose an approximate energy decomposition scheme by defining “partial Hylleraas functionals” corresponding to the different physically meaningful terms of the first-order wave function. The sample calculations show that at large and intermediate intermolecular distances the total second-order intermolecular interaction energy contribution is practically equal to the sum of these “physical” terms, while at shorter distances the overlap-caused interferences become increasingly important. Received: 18 June 2001 / Accepted: 28 August 2001 / Published online: 16 November 2001

15.
The estimation of uncertainty in organic elemental analysis for C, H, N and S is reported. Both “bottom up” and “top down” strategies are used for the uncertainty calculations. The “bottom up” approach used the C, H, N, and S results obtained from the homogeneity study of two pure chemicals (toluene-4-sulfonamide and 4(6)-methyl-2-thiouracil). Two calibration systems, K factor and calibration curve, were applied in this study, and no significant differences were obtained. For the “top down” approach, we used the data obtained from a proficiency test on both pure chemicals among 45 Spanish laboratories. Both approaches are compared and discussed below.

16.
Lateral flow (immuno)assays are currently used for qualitative, semiquantitative and to some extent quantitative monitoring in resource-poor or non-laboratory environments. Applications include tests on pathogens, drugs, hormones and metabolites in biomedical, phytosanitary, veterinary, feed/food and environmental settings. We describe principles of current formats, applications, limitations and perspectives for quantitative monitoring. We illustrate the potentials and limitations of analysis with lateral flow (immuno)assays using a literature survey and a SWOT analysis (acronym for “strengths, weaknesses, opportunities, threats”). Articles referred to in this survey were searched for on MEDLINE, Scopus and in references of reviewed papers. Search terms included “immunochromatography”, “sol particle immunoassay”, “lateral flow immunoassay” and “dipstick assay”.

17.
It has long been realized that connected graphs have some sort of geometric structure, in that there is a natural distance function (or metric), namely, the shortest-path distance function. In fact, there are several other natural yet intrinsic distance functions, including the resistance distance, corresponding “square-rooted” distance functions, and a so-called “quasi-Euclidean” distance function. Some of these distance functions are introduced here, and some are noted to satisfy not only the usual triangle inequality but also other relations such as the “tetrahedron inequality”. Given some (intrinsic) distance function, different graph invariants follow. Here attention is directed to a sequence of graph invariants which may be interpreted as: the sum of a power of the distances between pairs of vertices of G, the sum of a power of the “areas” between triples of vertices of G, the sum of a power of the “volumes” between quartets of vertices of G, etc. The Cayley–Menger formula for n-volumes in Euclidean space is taken as the defining relation for so-called “n-volumina” in terms of graph distances, and several theorems are established here for the volumina-sum invariants (when the mentioned power is 2). This revised version was published online in July 2006 with corrections to the Cover Date.
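The "sum of squared areas over triples" invariant can be illustrated on small graphs. The sketch below is only an illustration under stated assumptions: it uses the shortest-path distance (the abstract also lists other distance functions and does not say which one the theorems use), it covers only the three-vertex case, and it evaluates the three-point Cayley–Menger determinant in its expanded form, 16·A² = 2(a²b² + b²c² + c²a²) − a⁴ − b⁴ − c⁴. All function names are hypothetical.

```python
from collections import deque
from fractions import Fraction
from itertools import combinations


def shortest_path_distances(adj):
    """All-pairs shortest-path distances of a connected, unweighted graph.

    `adj` maps each vertex to an iterable of its neighbours; a breadth-first
    search is run from every vertex.
    """
    dist = {}
    for source in adj:
        seen = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in seen:
                    seen[v] = seen[u] + 1
                    queue.append(v)
        dist[source] = seen
    return dist


def squared_area(a, b, c):
    """Squared 'area' spanned by three pairwise distances.

    Expanded three-point Cayley-Menger determinant (equivalent to Heron's
    formula), kept exact with Fraction.
    """
    a2, b2, c2 = a * a, b * b, c * c
    return Fraction(2 * (a2 * b2 + b2 * c2 + c2 * a2) - a2 * a2 - b2 * b2 - c2 * c2, 16)


def squared_area_sum(adj):
    """Sum of squared 'areas' over all vertex triples (power 2 in the abstract)."""
    dist = shortest_path_distances(adj)
    return sum(
        squared_area(dist[i][j], dist[j][k], dist[i][k])
        for i, j, k in combinations(sorted(adj), 3)
    )


triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
path3 = {0: [1], 1: [0, 2], 2: [1]}
k4 = {v: [w for w in range(4) if w != v] for v in range(4)}

print(squared_area_sum(triangle), squared_area_sum(path3), squared_area_sum(k4))
```

With shortest-path distances, three vertices on a path give a degenerate "triangle" (distances 1, 1, 2), so the path contributes zero area, while every triple of a complete graph behaves like an equilateral triangle of side 1.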

18.
Hierarchical cluster analysis (HCA) and principal components analysis (PCA) were applied to find groups of similar depth profiles in thin layers investigated by Rutherford backscattering spectrometry (RBS). HCA yields, in one run, an objective hierarchy of similarity for several profiles. Among the similarity coefficients examined, the linear measure, the Euclidean distance and the exponential measure respond with different sensitivity to overall shifts along the concentration axis, whereas the correlation measure relates to the parallelism of the profiles. For agglomerative HCA with Euclidean distance, a lowest significant linkage level has been defined using Fisher's F-test. For divisive HCA, also based on Euclidean distance, the maximum of a separating function marks the most separating cluster step. The outcomes of both proposals agree for the data set investigated. PCA is useful for verifying the results of HCA and yields additional information about the data structure. In the actual example, quite different positions of features (concentrations at definite depths) in the space of the first two principal components hint at peculiarities during the metallurgical coating process.
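The contrast between a shift-sensitive measure and the correlation measure can be seen on a toy example. The sketch below compares the Euclidean distance with the Pearson correlation for a made-up depth profile, a copy shifted along the concentration axis, and a reversed copy. The profile values are invented for illustration, and Pearson correlation is assumed to be what the abstract calls the correlation measure; the paper's linear and exponential measures are not reproduced here.

```python
import math


def euclidean(p, q):
    """Euclidean distance between two profiles sampled at the same depths."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def pearson(p, q):
    """Pearson correlation coefficient between two profiles."""
    n = len(p)
    mean_p, mean_q = sum(p) / n, sum(q) / n
    cov = sum((a - mean_p) * (b - mean_q) for a, b in zip(p, q))
    var_p = sum((a - mean_p) ** 2 for a in p)
    var_q = sum((b - mean_q) ** 2 for b in q)
    return cov / math.sqrt(var_p * var_q)


# Invented concentrations at four depths (illustrative values only).
profile = [1.0, 2.0, 4.0, 3.0]
shifted = [c + 0.5 for c in profile]   # same shape, moved along the concentration axis
reversed_profile = profile[::-1]       # same values, different shape

for name, other in [("shifted", shifted), ("reversed", reversed_profile)]:
    print(name, round(euclidean(profile, other), 3), round(pearson(profile, other), 3))
```

The shifted copy stays perfectly correlated with the original even though its Euclidean distance is non-zero, which matches the abstract's observation that the correlation measure tracks parallelism rather than overall shifts.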

19.
Cross-validation has become one of the principal methods for adjusting the meta-parameters of predictive models. Extensions of the cross-validation idea have been proposed to select the number of components in principal components analysis (PCA). Element-wise k-fold (ekf) cross-validation is among the most widely used algorithms for PCA cross-validation. It is the method programmed in the PLS_Toolbox, and it has been stated to outperform other methods under most circumstances in a numerical experiment. The ekf algorithm is based on missing data imputation, and it can be programmed using any method for this purpose. In this paper, the ekf algorithm with the simplest missing data imputation method, trimmed score imputation, is analyzed. A theoretical study is conducted to identify the situations in which the application of ekf is adequate and, more importantly, those in which it is not. The results show that the ekf method may be unable to assess the extent to which a model represents a test set and may lead to discarding principal components that carry important information. In a second paper of this series, other imputation methods are studied within the ekf algorithm. Copyright © 2012 John Wiley & Sons, Ltd.

20.
Statistical techniques, when applied to data obtained by chemical investigations of ancient artworks, are usually expected to recognize groups of objects in order to classify archeological finds, to attribute the provenance of items by comparison with previously investigated ones, or to determine whether an archeological attribution is possible. The statistical technique most frequently used in archeometry is principal component analysis (PCA), because of its simplicity in theory and implementation. However, the application of PCA to archeometric data has shown severe limitations because of its linear nature. Indeed, PCA is inadequate for classifying data whose behavior describes a curve or a curved subspace of the original data space. As a consequence, some information is lost when the multi-dimensional data space is compressed into the lower-dimensional subspace spanned by the principal components. The aim of this work is therefore to test a novel statistical technique for archeometry. We propose a nonlinear PCA method to extract the maximum chemical information by plotting data on the smallest number of principal components and to answer archeological questions. The higher accuracy and effectiveness of the nonlinear PCA approach compared with standard PCA for the analysis of archeometric data are shown through the study of Apulian red-figured pottery (fifth–fourth century BC) from some of the most relevant archeological sites of ancient Apulia (Monte Sannace (Gioia del Colle), Egnatia (Fasano), Canosa, Altamura, Conversano, and Arpi (Foggia)). Copyright © 2016 John Wiley & Sons, Ltd.
