Similar literature
20 similar articles found
1.
With the aim of obtaining a monitoring tool to assess water quality, a multivariate statistical procedure is proposed that couples cluster analysis (CA) with the soft independent modelling of class analogy (SIMCA) algorithm to provide an effective classification method. The experimental data set, collected throughout 2004, comprised analytical parameters from 68 water sources in a large area southwest of Paris. Nine variables carrying the most useful information were selected and investigated (nitrate, sulphate, chloride, turbidity, conductivity, hardness, alkalinity, coliforms and Escherichia coli). Principal component analysis provided considerable data reduction: the first two principal components captured about 92.2% of the total variance. CA grouped samples belonging to different sites, distinctly correlating them with chemical variables, and a classification model was built by SIMCA. This model was optimised and validated and then applied to a new data matrix, consisting of the parameters measured during 2005 for the same objects, providing a fast and accurate classification of all the samples. Most of the examined sources appeared unchanged over the 2-year period, but five sources were assigned to different classes, owing to statistically significant changes in some characteristic analytical parameters.

2.
This paper describes a clustering method for three‐way arrays that makes use of an exploratory visualization approach. The aim of this study is to cluster samples in the object mode of a three‐way array, which is done using the scores (sample loadings) of a three‐way factor model, for example, a Tucker3 or a PARAFAC model. Further, tools are developed to explore and identify reasons for particular clusters by visually mining the data using the clustering results as guidance. We introduce a three‐way clustering tool and demonstrate our results on a metabolite profiling dataset. We explore how high performance liquid chromatography (HPLC) measurements of commercial extracts of St. John's wort (natural remedies for the treatment of mild to moderate depression) differ and which chemical compounds account for those differences. Using common distance measures, for example, Euclidean or Mahalanobis, on the scores of a three‐way model, we verify that we can capture the underlying clustering structure in the data. In addition, by making use of the visualization approach, we are able to identify the variables playing a significant role in the extracted cluster structure. The suggested approach generalizes straightforwardly to higher‐order data and also to two‐way data. Copyright © 2007 John Wiley & Sons, Ltd.

3.
A direct conformational clustering and mapping approach for peptide conformations based on backbone dihedral angles has been developed and applied to compare conformational sampling of Met-enkephalin using two molecular dynamics (MD) methods. Efficient clustering in dihedrals has been achieved by evaluating all combinations resulting from independent clustering of each dihedral angle distribution, thus resolving all conformational substates. In contrast, Cartesian clustering was unable to accurately distinguish between all substates. Projection of clusters onto dihedral principal component analysis (PCA) subspaces did not result in efficient separation of highly populated clusters. However, representation in a nonlinear metric by Sammon mapping clearly separated the 48 most highly populated clusters in just two dimensions. In addition, this approach also allowed us to visualize the transition frequencies between clusters efficiently. Significantly higher transition frequencies between more distinct conformational substates were found for a recently developed biasing-potential replica exchange MD simulation method, which allows faster sampling of possible substates than conventional MD simulations. Although the number of theoretically possible clusters grows exponentially with peptide length, in practice the number of clusters is limited only by the sampling size (typically much smaller), and therefore the method is also well suited for large systems. The approach could be useful to rapidly and accurately evaluate conformational sampling during MD simulations, to compare different sampling strategies and eventually to detect kinetic bottlenecks in folding pathways.

4.
Producing good low‐dimensional representations of high‐dimensional data is a common and important task in many data mining applications. Two methods that have been particularly useful in this regard are multidimensional scaling and nonlinear mapping. These methods attempt to visualize a set of objects described by means of a dissimilarity or distance matrix on a low‐dimensional display plane in a way that preserves the proximities of the objects to whatever extent is possible. Unfortunately, most known algorithms are of quadratic order, and their use has been limited to relatively small data sets. We recently demonstrated that nonlinear maps derived from a small random sample of a large data set exhibit the same structure and characteristics as those of the entire collection, and that this structure can be easily extracted by a neural network, making possible the scaling of data sets orders of magnitude larger than those accessible with conventional methodologies. Here, we present a variant of this algorithm based on local learning. The method employs a fuzzy clustering methodology to partition the data space into a set of Voronoi polyhedra, and uses a separate neural network to perform the nonlinear mapping within each cell. We find that this local approach offers a number of advantages, and produces maps that are virtually indistinguishable from those derived with conventional algorithms. These advantages are discussed using examples from the fields of combinatorial chemistry and optical character recognition. © 2001 John Wiley & Sons, Inc. J Comput Chem 22: 373–386, 2001

5.
Principal component analysis (PCA) is a widespread technique for data analysis that relies on the covariance/correlation matrix of the analyzed data. However, to work properly with high-dimensional data sets, PCA imposes severe mathematical constraints on the minimum number of different replicates, or samples, that must be included in the analysis. Generally, improper sampling is due to a small number of data points relative to the number of degrees of freedom that characterize the ensemble. In the life sciences it is often important to have an algorithm that can accept poorly dimensioned data sets, including degenerate ones. Here a new random projection algorithm is proposed, in which a random symmetric matrix substitutes for the covariance/correlation matrix of PCA while maintaining the data clustering capacity. We demonstrate that what is important for the clustering efficiency of PCA is not the exact form of the covariance/correlation matrix, but simply its symmetry.
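
The idea in this abstract can be illustrated with a minimal sketch. The abstract does not say how the random symmetric matrix is drawn or how its eigenvectors are ranked, so the choices below (Gaussian entries, symmetrization by averaging with the transpose, ranking by absolute eigenvalue) are assumptions made purely for illustration, and the function name `random_symmetric_projection` is hypothetical.

```python
import numpy as np

def random_symmetric_projection(X, n_components=2, seed=0):
    """Project the rows of X onto eigenvectors of a random symmetric matrix.

    Illustrative stand-in for PCA: the covariance matrix is replaced by a
    random symmetric matrix (an assumption about the method's details).
    """
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                    # centre each variable, as PCA would
    p = Xc.shape[1]
    rng = np.random.default_rng(seed)
    M = rng.standard_normal((p, p))            # assumption: Gaussian random entries
    S = (M + M.T) / 2.0                        # symmetric surrogate for the covariance matrix
    eigvals, eigvecs = np.linalg.eigh(S)       # eigh handles symmetric matrices
    order = np.argsort(np.abs(eigvals))[::-1]  # assumption: rank axes by |eigenvalue|
    V = eigvecs[:, order[:n_components]]       # orthonormal projection axes
    return Xc @ V, V
```

Because the projection axes never depend on the data, the sketch runs even when there are fewer samples than variables, which is the poorly dimensioned case the abstract has in mind.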

6.
In chemometrics, the supervised and unsupervised classification of high‐dimensional data has become a recurrent problem. Model‐based techniques for discriminant analysis and clustering are popular tools, which are renowned for their probabilistic foundations and their flexibility. However, classical model‐based techniques behave disappointingly in high‐dimensional spaces, which has so far limited their use within chemometrics. Recent developments in model‐based classification have overcome these drawbacks and enabled the efficient classification of high‐dimensional data, even in the ‘small n / large p’ condition. This work presents a comprehensive review of these recent approaches, including regularization‐based techniques, parsimonious modelling, subspace classification methods and classification methods based on variable selection. The use of these model‐based methods is also illustrated on real‐world classification problems in chemometrics using R packages. Copyright © 2013 John Wiley & Sons, Ltd.

7.
DART (Direct Analysis in Real Time) coupled with Time‐of‐Flight Mass Spectrometry (TOF/MS) has been used for analyses of ice‐teas. The article focuses on the quality and authenticity of ice‐teas as one of the most important tea‐based products on the market. Twenty‐one samples of ice‐teas (black and green) were analysed. The following were determined in the ice‐teas: theobromine, caffeine, total phenolic compounds, total soluble solids, total amino acid concentration, preservatives and saccharides. Fingerprints of DART‐TOF/MS spectra were used for comprehensive assessment of the ice‐tea samples. The DART‐TOF/MS method was used for monitoring the following compounds: citric acid, caffeine, saccharides, artificial sweeteners (saccharin, acesulphame K), preservatives (sorbic and benzoic acid), phosphoric acid and phenolic compounds. The measured data were subjected to principal component analysis. The HPLC and DART‐TOF/MS methods were compared in terms of the determination of selected compounds (caffeine, benzoic acid, sorbic acid and saccharides) in the ice‐teas. The DART‐TOF/MS technique seems to be a suitable method for fast screening and for testing the quality and authenticity of tea‐based products. Copyright © 2015 John Wiley & Sons, Ltd.

8.
Airborne particulate matter is an important component of atmospheric pollution, affecting human health, climate, and visibility. Modern instruments allow single particles to be analyzed one-by-one in real time, and offer the promise of determining the sources of individual particles based on their mass spectral signatures. The large number of particles to be apportioned makes clustering a necessary step. The goal of this study is to compare, using mass spectral data, the accuracy and speed of several clustering algorithms: ART-2a, several variants of hierarchical clustering, and K-means. Repeated simulations with various algorithms and different levels of data preprocessing suggest that hierarchical clustering methods using derivatives of Ward's algorithm discriminate sources with fewer errors than ART-2a, which itself discriminates much better than point-wise hierarchical clustering methods. In most cases, K-means algorithms do almost as well as the best hierarchical clustering. These efficient algorithms (clustering derived from Ward's algorithm, ART-2a and K-means) are most accurate when the relative peak areas have been pre-scaled by taking the square root. Analysis times vary within a factor of 30, and when accuracy above 95% is required, run times scale as the square of the number of particles. Algorithms derived from Ward's remain the most accurate under a wide range of conditions and, conversely, for an equal accuracy can deliver a shorter list of clusters, allowing faster and possibly on-the-fly classification.
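
The square-root pre-scaling of relative peak areas mentioned in this abstract can be sketched as follows. This is only an illustration: the spectra are invented, and the clustering step is a bare-bones K-means with a naive initialization (the first k points), not the ART-2a or Ward-derived algorithms that the study compares. All function names are hypothetical.

```python
import math

def sqrt_prescale(peak_areas):
    """Convert peak areas to relative areas, then take the square root."""
    total = sum(peak_areas)
    return [math.sqrt(a / total) for a in peak_areas]

def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def kmeans(points, k, iterations=20):
    """Minimal K-means; centroids start at the first k points (naive choice)."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assign every point to its nearest centroid.
        labels = [min(range(k), key=lambda j: squared_distance(p, centroids[j]))
                  for p in points]
        # Move each centroid to the mean of its members (skip empty clusters).
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [sum(c) / len(members) for c in zip(*members)]
    return labels, centroids

# Invented single-particle spectra: peak areas at three m/z values.
spectra = [[90, 5, 5], [5, 90, 5], [85, 10, 5], [10, 85, 5]]
scaled = [sqrt_prescale(s) for s in spectra]
labels, _ = kmeans(scaled, k=2)
```

One consequence of this scaling is that every pre-scaled spectrum has unit Euclidean length, since the squared entries are the relative areas and those sum to one. The square root also compresses large peaks relative to small ones, so minor peaks contribute more to the distances.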

9.
A simple and efficient method was developed for the chemical fingerprint analysis and simultaneous determination of four phenylnaphthalene‐type lignans in Vitex negundo seeds using high‐performance liquid chromatography with diode array detection. For fingerprint analysis, 13 V. negundo seed samples were collected from different regions in China, and the fingerprint chromatograms were matched by the computer‐aided Similarity Evaluation System for Chromatographic Fingerprint of TCM (Version 2004A). A total of 21 common peaks found in all the chromatograms were used for evaluating the similarity between these samples. Additionally, simultaneous quantification of four major bioactive ingredients was conducted to assess the quality of V. negundo seeds. Our results indicated that the contents of the four lignans in V. negundo seeds varied markedly among herbal samples collected from different regions. Moreover, hierarchical clustering analysis grouped these 13 samples into three categories, which was consistent with the chemotypes of those chromatograms. The method developed in this study provides a substantial foundation for the establishment of reasonable quality control standards for V. negundo seeds.

10.
Accurate clustering of cells from single-cell RNA sequencing (scRNA-seq) data is an essential step for biological analyses such as putative cell type identification. However, scRNA-seq data have high dimension and high sparsity, which makes traditional clustering methods less effective at reflecting the similarity between cells. Since the genetic network fundamentally defines the functions of a cell and deep learning shows strong advantages in network representation learning, we propose a novel scRNA-seq clustering framework, ScGSLC, based on graph similarity learning. ScGSLC effectively integrates scRNA-seq data and the protein-protein interaction network into a graph. ScGSLC then employs a graph convolutional network to embed the graphs and clusters the cells by the calculated similarity between graphs. Unsupervised clustering results on nine public data sets demonstrate that ScGSLC performs better than state-of-the-art methods.

11.
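Cao Wen  Hong Liang  Yang Ming  Li Shaoping  Zhao Jing 《色谱》(Chinese Journal of Chromatography) 2021,39(9):1006-1011
The quality standards for fermented Cordyceps powder products listed in the Chinese Pharmacopoeia specify the contents of guanosine, adenosine and uridine as the criteria for evaluating product quality. Beyond these, however, the influence of many other nucleoside components on the quality control of fermented Cordyceps powder has not yet been explored. To examine whether the choice of quality-control indicators for fermented Cordyceps powder and its products is reasonable, ultra-high-performance liquid chromatography with ultraviolet detection was applied to nine nucleoside components (uracil, cytidine, guanine, ...) in 19 batches of fermented Cordyceps powder and its products ...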

12.
Sârbu C  Moţ AC 《Talanta》2011,85(2):1112-1117
The fingerprinting capacity of thin layer chromatography (TLC) combined with image analysis has been investigated for propolis samples collected in different areas of Romania. A fuzzy divisive hierarchical clustering approach was used as a powerful tool for sample discrimination and fingerprinting according to geographical origin and local flora. The fuzzy partition and the patterns obtained from the plot of membership degrees were in very good agreement with the floral origin and geographic location of the Romanian propolis samples, and clearly illustrate the fuzziness of their similarities and differences. The results strongly support the conclusion that TLC with image analysis can be successfully employed in fingerprinting methodologies if it is combined with an appropriate fuzzy clustering method. The method developed in this paper might also be extended to the authenticity and origin control of fruits, herbs or derived products.

13.
Imidazolinium and benzimidazolium bromide salts with pentafluoro substituents on the N atom were synthesized. The structures of the salts obtained were confirmed by 1H, 13C and 19F NMR and elemental analysis. It was found that pyrolytic decomposition of the salts occurs together with melting. The salts were studied by TG-DTG and DTA from ambient temperature to 1000°C in a nitrogen atmosphere. The decomposition occurred mainly in one stage, and the values of the activation energy E, frequency factor A, reaction order n, enthalpy change ΔH#, entropy change ΔS# and Gibbs free energy change ΔG# of the thermal decomposition were calculated by means of the Coats-Redfern (CR), MacCallum-Tanner (MC) and van Krevelen (vK) methods. The activation energy values obtained by the CR and MC methods were in good agreement with each other, while those obtained by the vK method were found to be 10–12 kJ mol⁻¹ larger.
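
The Coats-Redfern evaluation named in this abstract can be sketched under its usual linearized form, ln[g(α)/T²] ≈ ln(AR/(βE)) − E/(RT), where g(α) = −ln(1 − α) for n = 1 and g(α) = [1 − (1 − α)^(1−n)]/(1 − n) otherwise. A straight-line fit of ln[g(α)/T²] against 1/T then has slope −E/R. The abstract gives no data or fitting details, so the function names and the idea of fitting one (T, α) series at a fixed n are assumptions for illustration only.

```python
import math

R = 8.314  # gas constant, J mol^-1 K^-1

def g_alpha(alpha, n):
    """Integral conversion function for an n-th order reaction."""
    if abs(n - 1.0) < 1e-12:
        return -math.log(1.0 - alpha)
    return (1.0 - (1.0 - alpha) ** (1.0 - n)) / (1.0 - n)

def linear_fit(xs, ys):
    """Ordinary least-squares line through (xs, ys); returns (slope, intercept)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def coats_redfern_activation_energy(temps_K, alphas, n=1.0):
    """Estimate E (J/mol) from conversion alpha measured at temperatures in K."""
    xs = [1.0 / T for T in temps_K]
    ys = [math.log(g_alpha(a, n) / T ** 2) for T, a in zip(temps_K, alphas)]
    slope, _ = linear_fit(xs, ys)
    return -slope * R  # slope of the Coats-Redfern plot is -E/R
```

In practice the reaction order n is often chosen as the value that gives the most linear plot; that search is omitted here to keep the sketch short.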

14.
Beer stability is a major concern for the brewing industry, as beer characteristics may be subject to significant changes during storage. This paper describes a novel non-targeted methodology for monitoring the chemical changes occurring in a lager beer exposed to accelerated aging (induced by thermal treatment: 18 days at 45 °C), using gas chromatography-mass spectrometry in tandem with multivariate analysis (GC-MS/MVA). The chromatographic run was optimized, achieving a threefold reduction of the chromatographic time. Although optimum resolution was lost, rapid GC runs showed similar chromatographic profiles and a similar semi-quantitative ability to characterize volatile compounds. To evaluate the variations in the global volatile signature (chromatographic profile and m/z fragmentation pattern in each scan) of beer during thermal deterioration, an unsupervised multivariate analysis method, Principal Component Analysis (PCA), was applied to the GC-MS data. This methodology allowed not only the rapid identification of the degree of deterioration affecting the beer, but also the identification of specific compounds relevant to the thermal deterioration process, both well-established markers such as 5-hydroxymethylfurfural (5-HMF), furfural and diethyl succinate, and other compounds that are, to our knowledge, newly correlated with beer aging.

15.
The complexity of the processes occurring during cobalt oxalate dihydrate (COD) decomposition indicates that an interpretation of the mechanism based only on the TG curve is of little value. Mass change alone does not allow deeper insight into all of the potential primary and secondary reactions that could occur. The observed mass changes (TG) and thermal effects (DTA/DSC) are a superposition of several phenomena and thus do not necessarily reflect COD decomposition alone. Investigation of the mechanism of decomposition requires the application of different simultaneous techniques that allow the qualitative and quantitative determination of the composition of the gaseous products. The composition of the solid and gaseous products of COD decomposition and the heats of dehydration and oxalate decomposition were determined for inert, oxidizing and hydrogen-containing atmospheres. Contrary to previous suggestions about the mechanism of cobalt oxalate decomposition, the solid product formed during decomposition in helium contains not only metallic cobalt (Co_met) but also a substantial amount of CoO (ca. 13 mol%). In all atmospheres, the composition of the primary solid and gaseous products changes as a result of secondary gas-solid and gas-gas reactions, catalyzed by freshly formed Co_met. The course of the following reactions has been investigated under steady-state and transient conditions characteristic of COD decomposition: water gas shift, Fischer-Tropsch, CO disproportionation, CoO reduction by CO and H2, and Co_met oxidation under rich and lean oxygen conditions. This revised version was published online in August 2006 with corrections to the Cover Date.

16.
Preclassification of raw infrared spectra has often been neglected in the scientific literature. Separating out spectra of low spectral quality, caused by a low signal-to-noise ratio, the presence of artifacts, or low analyte presence, is crucial for accurate model development. It is particularly important for sparse data, where it becomes challenging to visually inspect spectra of different natures. Hence, a preclassification approach to separate infrared spectra for sparse data is needed. In this study, we propose a preclassification approach based on Multiplicative Signal Correction (MSC). The MSC approach was applied to broadband Fourier Transform Infrared (FTIR) spectra of human and bovine knee cartilage and to a sparse data subset comprising only seven wavelengths. The goal of the preclassification was to separate spectra with analyte-rich signals (i.e., cartilage) from spectra with analyte-poor (and high-matrix) signals (i.e., water). The human datasets 1 and 2 contained 814 and 815 spectra, while the bovine dataset contained 396 spectra. A pure water spectrum was used as the reference spectrum in the MSC approach. A threshold on the root mean square error (RMSE) was used to separate cartilage from water spectra for both the broadband and the sparse spectral data. Additionally, a standard signal-to-noise ratio and principal component analysis were applied to the broadband spectra. The fully automated MSC preclassification approach, using water as the reference spectrum, performed as well as manual visual inspection. Moreover, it enabled separation of cartilage from water spectra not only in broadband spectral datasets, but also in sparse datasets where manual visual inspection cannot be applied.
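
The MSC-based preclassification described in this abstract can be sketched as follows. In multiplicative signal correction each spectrum x is fitted to a reference r as x ≈ a + b·r and corrected to (x − a)/b. The abstract does not define its RMSE precisely, so this sketch assumes it is the RMSE between the corrected spectrum and the water reference; the function names and any threshold value are hypothetical.

```python
import math

def msc_fit(spectrum, reference):
    """Least-squares fit spectrum ~ a + b * reference; returns (a, b)."""
    n = len(reference)
    mean_r = sum(reference) / n
    mean_x = sum(spectrum) / n
    s_rr = sum((r - mean_r) ** 2 for r in reference)
    s_rx = sum((r - mean_r) * (x - mean_x) for r, x in zip(reference, spectrum))
    b = s_rx / s_rr
    a = mean_x - b * mean_r
    return a, b

def msc_rmse(spectrum, reference):
    """RMSE between the MSC-corrected spectrum and the reference (assumed metric)."""
    a, b = msc_fit(spectrum, reference)
    if abs(b) < 1e-12:
        return math.inf  # spectrum carries no reference-like signal at all
    corrected = [(x - a) / b for x in spectrum]
    mse = sum((c - r) ** 2 for c, r in zip(corrected, reference)) / len(reference)
    return math.sqrt(mse)

def is_water_like(spectrum, reference, threshold):
    """Flag a spectrum as water-like when it matches the water reference closely."""
    return msc_rmse(spectrum, reference) < threshold
```

Because the fit needs only two parameters, it still works with seven wavelengths, which fits the abstract's point that the approach extends to sparse data where visual inspection is impractical.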

17.
Since the academic year 2001–2002, inter-laboratory trials for students of Analytical Chemistry in Spanish universities have been organised by the Department of Analytical Chemistry at the University of Barcelona in collaboration with the Complutense University of Madrid, the University of Cordoba and the University of Huelva. The aim of these exercises is to train students in the use of tools for the assessment and improvement of quality in analytical laboratories. As representative samples for environmental and food analysis, agricultural soils and a type of beer were selected. The ethanol content of the beer and the pH, conductivity, and extractable phosphorus and potassium content of the soil were the chosen analytical parameters. Sample preparation, homogeneity and stability studies, as well as the statistical treatment of data from participants, were carried out by the laboratory Mat Control of the Department of Analytical Chemistry of the University of Barcelona. The paper presented here gives the results obtained after two years of experience. Presented at BERM-9, the Ninth International Symposium on Biological and Environmental Reference Materials, June 15–19, 2003, Berlin, Germany.

18.
To develop thermally stable flavors, two glycosidically bound flavor precursors, geranyl-tetraacetyl-β-D-glucopyranoside (GLY-A) and geranyl-β-D-glucopyranoside (GLY-B), were synthesized by the modified Koenigs–Knorr reaction. The thermal decomposition process and pyrolysis products of the two glycosides were extensively investigated by thermogravimetry (TG), differential scanning calorimetry (DSC) and on-line pyrolysis-gas chromatography-mass spectrometry (Py-GC-MS). TG showed that the T_p values of GLY-A and GLY-B were 254.6 and 275.7°C. The T_peak values of GLY-A and GLY-B measured by DSC were 254.8 and 262.1°C, respectively. Py-GC-MS was used for a simple qualitative analysis of the pyrolysis products at 300 and 400°C. The results indicated that: 1) a large amount of geraniol and few by-products were produced at 300°C, while the by-products increased significantly at 400°C; 2) the characteristic pyrolysis product was geraniol; 3) the primary decomposition reaction was the cleavage of the O-glycosidic bond of the two glycoside flavor precursors. The study of the thermal behavior and pyrolysis products of the two glycosides showed that this kind of flavor precursor could be used to give foodstuffs a specific flavor during heating.

19.
20.
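Wang Yankai  Guo Changyan 《化学教育》(Chinese Journal of Chemical Education) 2022,43(18):87-91
The analytical chemistry laboratory is one of the core foundation courses for chemistry and chemical engineering majors, and data analysis and processing is both a key point and a difficulty of the course. On this basis, taking the data processing of the complexometric titration of iron and bismuth as an example, the paper discusses how to use Python to process analytical chemistry experimental data. Applying the finished Python program directly to the experimental data yields values such as analyte content and relative mean deviation quickly, accurately and efficiently, and the program can be extended to other experiments merely by modifying key lines of code. Applying Python programs to analytical chemistry data processing provides practical support for running experiments efficiently, is broadly applicable and easy to promote, and can also stimulate students' interest in inquiry and cultivate their capacity for innovation.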
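
The kind of calculation this abstract refers to can be sketched in a few lines. The program from the paper is not shown in the abstract, so everything below is an assumption: the function names, the 1:1 metal-to-EDTA stoichiometry, and the replicate volumes, concentration and sample mass are all invented for illustration.

```python
from statistics import mean

def analyte_mass_fraction(c_edta, v_edta_ml, molar_mass, sample_mass_g):
    """Mass fraction (%) of a metal titrated 1:1 with EDTA.

    c_edta is in mol/L, v_edta_ml in mL, molar_mass in g/mol.
    """
    moles = c_edta * v_edta_ml / 1000.0
    return 100.0 * moles * molar_mass / sample_mass_g

def relative_mean_deviation(values):
    """Relative mean deviation (%) of replicate results."""
    avg = mean(values)
    mean_abs_dev = mean(abs(v - avg) for v in values)
    return 100.0 * mean_abs_dev / avg

# Invented replicate titration volumes (mL) for an iron determination.
volumes = [20.10, 20.14, 20.12]
fractions = [analyte_mass_fraction(0.01000, v, 55.845, 0.1000) for v in volumes]
deviation = relative_mean_deviation(fractions)
```

Adapting the sketch to another titration would mean changing only the molar mass and, where the stoichiometry is not 1:1, the mole calculation, which is in the spirit of the abstract's remark about modifying key lines of code.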
