Similar literature
 Found 20 similar articles (search time: 15 ms)
1.
An optimized multivariate classification model has been developed for monitoring eighteen spring waters in the area of Serra St. Bruno, Calabria, Italy. Thirty analytical parameters were investigated for each water source and reduced to eight by means of Principal Component Analysis (PCA). The springs were grouped into five distinct classes by cluster analysis (CA), and a classification model was built by a Partial Least Squares-Discriminant Analysis (PLS-DA) procedure. The model was optimized and validated and then applied to new data matrices containing the analytical parameters measured on the same sources in successive years. The model proved able to detect deviations in the global analytical characteristics by revealing, over time, a different distribution of the samples among the classes. Variation in nitrate concentration was shown to be the main cause of the observed class shifts. The shifting sources were located in areas used as arable land, and the high variability of nitrate content was ascribed to crop rotation, which involves varying use of nitrogenous chemical fertilizers.

2.
Summary: The performance of neural networks in classifying mass spectral data is evaluated and compared with methods of multivariate data analysis and pattern recognition. Back-propagation networks are compared with linear discriminant analysis, and Kohonen feature maps with the k-nearest neighbour clustering algorithm. Eight classifiers were trained to discriminate mass spectra of steroids from eight distinct classes of chemical compounds. The results show slightly better performance of Kohonen networks compared with k-nearest neighbour clustering, and equal performance of multi-layer perceptrons and discriminant analysis.
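
As a rough illustration of the k-nearest neighbour baseline mentioned in the abstract above, the following minimal sketch classifies a query point by majority vote among its k closest training points. The two-dimensional feature vectors, the labels "steroid" and "other", and the function name `knn_predict` are all hypothetical; the abstract does not describe the features or the value of k that the authors used.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Return the majority label among the k training points nearest to query."""
    # Pair every training point's Euclidean distance to the query with its label.
    distances = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )
    # Count the labels of the k closest points and return the most frequent one.
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Hypothetical two-feature summaries of spectra (values are made up).
train_X = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3), (1.0, 1.1), (0.9, 1.0), (1.2, 0.8)]
train_y = ["steroid", "steroid", "steroid", "other", "other", "other"]

print(knn_predict(train_X, train_y, (0.1, 0.1)))
print(knn_predict(train_X, train_y, (1.0, 1.0)))
```

An odd k avoids tied votes in a two-class problem; with more classes a tie-breaking rule would have to be chosen explicitly.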

3.
Recently, NMR-based metabolomic analysis has been used to differentiate among biological samples. In the present study, we examined whether multivariate analysis could be applied to natural products and/or materials. Extracts of 24 leaf samples, taken from six positions from the tip of the stem in each of four strains, were analyzed by the pattern recognition methods Principal Component Analysis (PCA) and Soft Independent Modeling of Class Analogy (SIMCA). The 24 mulberry leaf extracts gave distinct 1H NMR spectra. In the PCA score plot of PC1 (86.1%) against PC3 (4.6%), the leaf extraction data were separated according to the six positions, with two loading plots suggesting classification by leaf position as an independent variable. Moreover, for the differences among the six positions, the SIMCA method identified the seven highest discrimination powers. Meanwhile, the PCA score plot gave a classification by mulberry strain with three loading plots, but the SIMCA method did not give a peak for this classification.

4.
Aruga R. Talanta, 1998, 47(4): 1053-1061
When it is not possible to analyze an exactly reproducible amount of sample (or whenever samples contain indefinite amounts of extraneous materials), it is customary to normalize the data by making, for example, the sum of the concentrations obtained for each sample equal to 100. Although data normalized (or 'closed') in this manner have been criticized, it is shown empirically that closure is appropriate for comparing and classifying samples of the type indicated above.
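
The closure operation described in this abstract can be sketched in a few lines. The example below rescales each sample so that its concentrations sum to 100. The function name `close_to_constant` and the numerical values are hypothetical and serve only to show why closure makes samples of different size comparable.

```python
def close_to_constant(sample, total=100.0):
    """Rescale one sample so that its components sum to `total` (closure)."""
    s = sum(sample)
    if s <= 0:
        raise ValueError("sample must have a positive sum")
    return [total * x / s for x in sample]


# Two hypothetical aliquots of the same material; the second contains half as
# much analysable sample (all values are made up for illustration).
raw_a = [12.0, 6.0, 2.0]
raw_b = [6.0, 3.0, 1.0]

closed_a = close_to_constant(raw_a)
closed_b = close_to_constant(raw_b)

print(closed_a)
print(closed_b)
```

After closure the two aliquots have the same profile, so the difference in sample amount no longer dominates the comparison. The price is that the closed variables are no longer independent, which is the usual criticism the abstract alludes to.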

5.
The application of artificial neural networks to identifying water samples from different springs and rivers of Kharkiv, based on metal ion concentration data, was studied. Using the river-water samples as an example, we demonstrated that artificial neural networks enabled the correct identification of water samples even when there were gaps in the initial data. A procedure for determining the optimal number of neurons when constructing the neural networks was proposed.

6.
Sixteen samples of three types (classes) of brain tissue were characterized by capillary gas chromatography (g.c.). Each sample is thus characterized by the peak heights of 105 peaks in each g.c. profile. SIMCA pattern recognition is used to analyze the 16 × 105 data matrix in order to differentiate between the three classes on the basis of the g.c. data only. The SIMCA method is therefore applicable even when the number of variables (105) exceeds the number of objects (16). The results indicate that g.c. profiles are useful for the identification of brain tissue type.

7.
8.
Variable scaling alters the covariance structure of data, affecting the outcome of multivariate analysis and calibration. Here we present a new method, variable stability (VAST) scaling, which weights each variable according to a metric of its stability. The beneficial effect of VAST scaling is demonstrated for a data set of 1H NMR spectra of urine acquired as part of a metabonomic study into the effects of unilateral nephrectomy in an animal model. The application of VAST scaling improved the class distinction and predictive power of partial least squares discriminant analysis (PLS-DA) models. The effects of other data scaling and pre-processing methods, such as orthogonal signal correction (OSC), were also tested. VAST scaling produced the most robust models in terms of class prediction, outperforming OSC in this respect. As a result, the subtle but consistent metabolic perturbation caused by unilateral nephrectomy could be accurately characterised despite the presence of much greater biological differences caused by normal physiological variation. VAST scaling presents itself as an interpretable, robust and easily implemented data treatment for the enhancement of multivariate data analysis.
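
The abstract states only that VAST scaling weights each variable by a stability metric. The sketch below assumes the form in which VAST scaling is commonly described: autoscaling (mean-centring and division by the standard deviation) followed by multiplication by the ratio mean/standard deviation, that is, the inverse of the coefficient of variation. This formula, the function name `vast_scale` and the example values are assumptions, not details taken from the abstract.

```python
from statistics import mean, stdev


def vast_scale(columns):
    """Scale each variable (a list of values over samples) by its stability.

    Assumed form: ((x - mean) / sd) * (mean / sd), so that variables with a
    small coefficient of variation receive a larger weight.
    """
    scaled = []
    for col in columns:
        m = mean(col)
        s = stdev(col)  # sample standard deviation (n - 1 denominator)
        if s == 0:
            raise ValueError("constant variable cannot be scaled")
        weight = m / s  # inverse coefficient of variation
        scaled.append([weight * (x - m) / s for x in col])
    return scaled


# Hypothetical intensities of two spectral variables over four samples.
stable = [10.0, 10.5, 9.5, 10.0]   # small relative spread
unstable = [1.0, 5.0, 0.5, 9.5]    # large relative spread

scaled_stable, scaled_unstable = vast_scale([stable, unstable])
print(stdev(scaled_stable), stdev(scaled_unstable))
```

Plain autoscaling would give both variables unit variance. Under the assumed VAST form, the stable variable ends up with the larger variance and therefore more influence on a subsequent PCA or PLS-DA model.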

9.
The determination of therapeutic drugs, metabolites and other important biomedical analytes in biological samples is usually performed by high-performance liquid chromatography (HPLC). Modern multivariate calibration methods constitute an attractive alternative, even when they are applied to intrinsically unselective spectroscopic or electrochemical signals. First-order (i.e., vectorized) data are conveniently analyzed with classical chemometric tools such as partial least-squares (PLS). Certain analytical problems require more sophisticated models, such as artificial neural networks (ANNs), which are especially able to cope with non-linearities in the data structure. Finally, models based on the acquisition and processing of second- or higher-order data (i.e., matrices or higher dimensional data arrays) present the phenomenon known as "second-order advantage", which permits quantitation of calibrated analytes in the presence of interferents. The latter models show great potential in the field of biomedical analysis. Pertinent literature examples are reviewed.

10.
The use of near infrared (NIR) hyperspectral imaging and hyperspectral image analysis for distinguishing between hard, intermediate and soft maize kernels from inbred lines was evaluated. NIR hyperspectral images of two sets (12 and 24 kernels) of whole maize kernels were acquired using a Spectral Dimensions MatrixNIR camera with a spectral range of 960-1662 nm and a sisuChema SWIR (short wave infrared) hyperspectral pushbroom imaging system with a spectral range of 1000-2498 nm. Exploratory principal component analysis (PCA) was used on absorbance images to remove background, bad pixels and shading. On the cleaned images, PCA could be used effectively to find histological classes including glassy (hard) and floury (soft) endosperm. PCA showed a distinct difference between glassy and floury endosperm, with two distinguishable clusters, along principal component (PC) three on the MatrixNIR and PC two on the sisuChema. Subsequently, partial least squares discriminant analysis (PLS-DA) was applied to build a classification model. The PLS-DA model from the MatrixNIR image (12 kernels) gave a root mean square error of prediction (RMSEP) of 0.18. This was repeated on the MatrixNIR image of the 24 kernels, which also gave an RMSEP of 0.18. The sisuChema image yielded an RMSEP of 0.29. The reproducible results obtained with the different data sets indicate that the method proposed in this paper has real potential for future classification uses.
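
For readers unfamiliar with the figure of merit quoted above, the following sketch computes a root mean square error of prediction from reference and predicted values. It assumes the usual definition, the square root of the mean squared difference over the prediction set, and a 0/1 dummy coding of the two classes as is common in PLS-DA. The numbers are invented and are not the data of the study.

```python
import math


def rmsep(y_true, y_pred):
    """Root mean square error of prediction over a prediction set."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(y_true))


# Hypothetical dummy-coded classes (1 = glassy, 0 = floury) and the
# continuous responses that a PLS-DA model might predict for them.
y_true = [1.0, 1.0, 0.0, 0.0]
y_pred = [0.8, 1.1, 0.2, -0.1]

print(round(rmsep(y_true, y_pred), 3))
```

Because the response is coded 0/1, an RMSEP well below 0.5 indicates that predicted values lie much closer to the correct class code than to the decision boundary.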

11.
We discuss and evaluate the current state of second-order and higher-order multivariate calibration methods devoted to the determination of compounds in non-multilinear data systems. We examine possible causes of multilinearity deviations:
(1) a non-linear relationship between signal and analyte concentration;
(2) a signal for a given sample that is non-multilinear; and,
(3) component profiles that are not constant across the different samples.
We discuss the advantages and the limitations of the algorithms available to cope with these different situations. The review covers relevant analytical problems found in samples of environmental and biological interest, highlighting some significant examples, and evaluating the advantages and the limitations of the different algorithms available.

12.
Eigenvectors reflect active processes (real principal components or factors). The concepts of inherent compatibility and global compatibility are introduced. The inherent compatibility coefficient is used to study individual processes, whereas the global compatibility coefficient is used to study the combined nature and effect of each of the processes. The inherent and global variable compatibility of triterpanes is derived from a database of 216 oil samples, and important pathways of the formation of oil from sedimentary kerogen are elucidated. The inefficiency of conventional data analysis is demonstrated.

13.
Three-level versions of Multilevel Simultaneous Component Analysis (MLSCA) and Multilevel Partial Least Squares (MLPLS) were developed, which are capable of separating between-plant, between-run and within-run process variation, and of modeling these three levels in a multivariate way. In comparison with the two-level versions, they make it possible to distinguish overall differences between plants from the variation between runs within a plant. It was shown that the three-level version of MLSCA has clear added value for the analysis of process runs from different plants. In MLPLS, other projections of the multivariate data onto latent variables and different views of the data are obtained when relevant Y information is available. This has clear added value for obtaining insight into the relation between process data and Y. A special use of MLPLS is to diagnose aberrations in first-principles models. In batch process monitoring, MLSCA at three levels allows simultaneous multivariate modelling of batch data from different manufacturing plants. By filtering out the between-plant and between-run sources of variation, and using only within-run variation, monitoring models can be improved. Using within-run data, it is possible to build monitoring models across manufacturing units and reduce the number of nuisance alarms, while improving abnormal situation detection and diagnosis. Model transfer is only possible if static between-plant differences exist, but not if there are dynamic differences.

14.
The present paper deals with the application of classical and fuzzy principal components analysis to a large data set from coastal sediment analysis. Altogether 126 sampling sites from the Atlantic Coast of the USA are considered, and at each site 16 chemical parameters are measured. It is found that four latent factors are responsible for the data structure ("natural", "anthropogenic", "bioorganic", and "organic anthropogenic"). Additionally, scatter plots of the factor scores revealed the similarity between the sampling sites. Geographical and urban factors are found to contribute to the sediment chemical composition. It is shown that the use of fuzzy PCA aids data interpretation, especially in the presence of outliers.

15.
Journal of Radioanalytical and Nuclear Chemistry - There are different standard procedures to measure 131I concentration in water samples. For environmental monitoring purposes the best...

16.
The applicability of two non-parametric extrapolation methods to FT-IR absorptance spectra is investigated. The first method minimizes those parts of the spectrum which do not satisfy a given constraint, while the second one changes them in each iteration through the true values.

17.
Accreditation and Quality Assurance - To analyze a drinking water dataset, various statistical methods have been applied, including discriminant analysis, logistic regression and cluster analysis, to...

18.
The spiroorthoester synthesis includes several competing reactions. A way of determining the reactions that are taking place, and their sequential order, is presented. The reaction between phenylglycidylether and gamma-butyrolactone to obtain a spiroorthoester has been monitored by near-infrared spectroscopy (NIR). In addition to the formation of the corresponding spiroorthoester, some parallel processes can occur. By means of two-dimensional correlation analysis, only one reaction is postulated, the one corresponding to the spiroorthoester formation. This was confirmed by recording the NMR spectra of the final product. By applying multivariate curve resolution-alternating least squares (MCR-ALS) to the NIR spectra obtained during the reaction, it has been possible to obtain the concentration values of the species involved in the reaction. The recovered spectra were compared with the experimentally recorded spectra for the reagents (phenylglycidylether, gamma-butyrolactone) and the final product (spiroorthoester), and the correlation coefficients were, in all cases, higher than 0.990. The maximum and minimum limits associated with the ALS solutions were calculated, making it possible to limit to a considerable extent the ambiguity that is characteristic of these curve resolution methods.

19.
A new way to represent and analyze DNA sequence data is described. This approach complements methods currently used, in that it allows the systematic part of the variation between different sequences to be modeled. This can prove as informative as absence of variation (homology), which is the most widely used criterion for comparing sequence data. A multivariate sequence-activity model (SAM) for DNA-promoter sequences is presented, by which the relative promoter strength is modeled in terms of the primary DNA sequence. The model is shown to have a good predictive capability. The coefficients from the model are interpreted and used to design new structures predicted to be strong promoters in the system investigated. The approach described is also applicable to other kinds of sequence data, e.g. RNAs, proteins or peptides.

20.
Analytica Chimica Acta, 2002, 455(2): 253-265
Human scalp hair samples of drug-free subjects and drug abusers (heroin and cocaine-heroin abusers) were analysed for trace metals by flame atomic absorption spectrometry (FAAS), flame atomic emission spectrometry (FAES) and electrothermal atomic absorption spectrometry (ETAAS). The classification of the drug-free and drug-abuser groups with four multivariate methods, using the metal contents in hair samples as discriminant variables, is discussed. Principal component analysis (PCA), cluster analysis (CA), linear discriminant analysis (LDA) and soft independent modelling of class analogy (SIMCA) allow the two groups to be distinguished correctly. However, predictions by SIMCA are less satisfactory. Thirteen elements (Ag, Al, Ca, Cd, Cr, Cu, K, Mg, Mn, Na, Ni, Pb, and Zn) were determined by FAAS/FAES/ETAAS in 53 hair samples (16 samples from drug-free people and 37 samples from drug abusers). As sample pre-treatment, the hair samples were prepared as aqueous slurries, and they were analysed using the slurry sampling technique. The half-range central value transformation was used as a novel data pre-treatment to homogenise the data. Grouping of the samples (drug-free people and drug abusers) was observed by using PCA and CA (squared Euclidean distance between objects and the Ward method as clustering procedure). The application of LDA gave correct recognition percentages of 91.7 and 100.0% for the drug-free people and drug abusers, respectively, at a significance level of 5%, while SIMCA gave recognition percentages of 83.3 and 91.3% for drug-free people and drug abusers, respectively, also at 5%. Finally, some studies were carried out to classify heroin abusers and polydrug abusers (cocaine-heroin abusers) by the cited multivariate statistical methods. Recognition percentages of 90.9 and 100.0% were reached for the heroin abuser and polydrug abuser groups, respectively, after LDA, while these percentages fell below 90.0% when SIMCA was applied.
