Found 20 similar articles (search time: 15 ms)
1.
Veras G Gomes Ade A da Silva AC de Brito AL de Almeida PB de Medeiros EP 《Talanta》2010,83(2):565-568
This article describes the classification of biodiesel samples using NIR spectroscopy and chemometric techniques. A total of 108 spectra of biodiesel samples were taken (three samples each of four types of oil: cottonseed, sunflower, soybean and canola) from nine manufacturers. The measurements for each of the three samples were in the spectral region between 12,500 and 4000 cm−1. The data were preprocessed by selecting a spectral range of 5000-4500 cm−1, and then a Savitzky-Golay filter with a second-order polynomial and 21 data points was used to obtain second derivative spectra. Characterization of the biodiesel was done using chemometric models based on hierarchical cluster analysis (HCA), principal component analysis (PCA) and soft independent modeling of class analogy (SIMCA) elaborated for each group of biodiesel samples (cotton, sunflower, soybean and canola). For the HCA and PCA, the formation of clusters for each group of biodiesel was observed, and SIMCA models were built using 18 spectral measurements for each type of biodiesel (training set), and nine spectral measurements to construct a classification set (except for the canola oil, which used eight spectra). The SIMCA classifications obtained 100% accurate identifications. Using this strategy, it was feasible to classify biodiesel quickly and nondestructively without the need for various analytical determinations.
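The Savitzky-Golay second-derivative step described in this abstract can be illustrated with a minimal sketch. It assumes uniformly spaced spectral points and simply drops the edge points that lack a full window; the abstract does not say how the authors handled edges or which software they used. The function name `savgol_second_derivative` and its parameters are illustrative, not taken from the paper.

```python
def savgol_second_derivative(y, window=21, spacing=1.0):
    """Second derivative from a local quadratic least-squares fit.

    Assumptions (not stated in the abstract): uniform point spacing,
    and edge points without a full window are dropped.
    """
    if window < 3 or window % 2 == 0:
        raise ValueError("window must be an odd integer >= 3")
    if len(y) < window:
        raise ValueError("signal is shorter than the window")

    half = window // 2
    offsets = range(-half, half + 1)
    s2 = sum(i * i for i in offsets)    # sum of i^2 over the window
    s4 = sum(i ** 4 for i in offsets)   # sum of i^4 over the window
    denom = window * s4 - s2 * s2

    # Least-squares quadratic coefficient is sum(w_i * y_i) with
    # w_i = (window * i^2 - s2) / denom; the second derivative is
    # twice that coefficient, scaled by the squared point spacing.
    weights = [2.0 * (window * i * i - s2) / (denom * spacing ** 2)
               for i in offsets]

    return [
        sum(w * y[center + i] for w, i in zip(weights, offsets))
        for center in range(half, len(y) - half)
    ]
```

With the default 21-point window, a spectrum of n points yields n - 20 derivative values. Where SciPy is available, `scipy.signal.savgol_filter` with `polyorder=2` and `deriv=2` provides the same kind of filter with several edge-handling modes.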
2.
The fingerprinting capacity of thin layer chromatography (TLC) and image analysis has been investigated for propolis samples collected in different areas of Romania. A fuzzy divisive hierarchical clustering approach was used as a powerful tool for sample discrimination and fingerprinting according to geographical origin and local flora. The fuzzy partition and the patterns obtained from the membership degrees plot were in very good agreement with the floral origin and geographic location of the Romanian propolis samples, and clearly illustrate the fuzziness of their similarities and differences. The results strongly support the conclusion that TLC with image analysis can be successfully employed in fingerprinting methodologies if it is combined with an appropriate fuzzy clustering method. The method developed in this paper might also be extended to the authenticity and origin control of fruits, herbs or derived products.
3.
Gas chromatographic profiles have been generated from different batches of a polypropylene/polyethylene copolymer. The profiles generated from polymer pellets have been obtained by dynamic headspace/capillary gas chromatography analysis. Initially 10 to 40 peaks were chosen at random from the quantitative reports and transferred to a data table. After appropriate scaling, the table was analyzed with a multivariate statistics program, SIMCA (Soft Independent Modeling of Class Analogy), a pattern recognition technique. The method has been used to differentiate batches according to sensory qualities of the final packaging product and changes in polymer pellet production.
4.
Metabonomic profiling using proton nuclear magnetic resonance (1H NMR) spectroscopy and multivariate data analysis of human serum samples was used to characterize metabolic profiles in renal cell carcinoma (RCC). We found distinct, easily detectable differences between (a) RCC patients and healthy humans, (b) RCC patients with metastases and without metastases, and (c) RCC patients before and after nephrectomy. Compared to healthy human serum, RCC serum had higher levels of lipid (mainly very low-density lipoproteins), isoleucine, leucine, lactate, alanine, N-acetylglycoproteins, pyruvate, glycerol, and unsaturated lipid, together with lower levels of acetoacetate, glutamine, phosphatidylcholine/choline, trimethylamine-N-oxide, and glucose. This pattern was somewhat reversed after nephrectomy. Altered metabolite concentrations are most likely the result of the cells switching to glycolysis to maintain energy homeostasis following the loss of ATP caused by an impaired TCA cycle in RCC. Serum NMR spectra combined with principal component analysis techniques offer an efficient, convenient way of depicting tumour biochemistry and stratifying tumours under different pathophysiological conditions. This approach may be able to assist early diagnosis and postoperative surveillance of human malignant diseases using single blood samples.
5.
6.
Principal component analysis (PCA) is a widespread technique for data analysis that relies on the covariance/correlation matrix of the analyzed data. However, to work properly with high-dimensional data sets, PCA imposes severe mathematical constraints on the minimum number of different replicates, or samples, that must be included in the analysis. Generally, improper sampling arises when the number of data points is small relative to the number of degrees of freedom that characterize the ensemble. In the life sciences it is often important to have an algorithm that can accept poorly dimensioned data sets, including degenerate ones. Here a new random projection algorithm is proposed, in which a random symmetric matrix substitutes for the covariance/correlation matrix of PCA while maintaining the data clustering capacity. We demonstrate that what is important for the clustering efficiency of PCA is not the exact form of the covariance/correlation matrix, but simply its symmetry.
7.
Dmitrii N. Rassokhin Victor S. Lobanov Dimitris K. Agrafiotis 《Journal of computational chemistry》2001,22(4):373-386
Producing good low‐dimensional representations of high‐dimensional data is a common and important task in many data mining applications. Two methods that have been particularly useful in this regard are multidimensional scaling and nonlinear mapping. These methods attempt to visualize a set of objects described by means of a dissimilarity or distance matrix on a low‐dimensional display plane in a way that preserves the proximities of the objects to whatever extent is possible. Unfortunately, most known algorithms are of quadratic order, and their use has been limited to relatively small data sets. We recently demonstrated that nonlinear maps derived from a small random sample of a large data set exhibit the same structure and characteristics as that of the entire collection, and that this structure can be easily extracted by a neural network, making possible the scaling of data sets orders of magnitude larger than those accessible with conventional methodologies. Here, we present a variant of this algorithm based on local learning. The method employs a fuzzy clustering methodology to partition the data space into a set of Voronoi polyhedra, and uses a separate neural network to perform the nonlinear mapping within each cell. We find that this local approach offers a number of advantages, and produces maps that are virtually indistinguishable from those derived with conventional algorithms. These advantages are discussed using examples from the fields of combinatorial chemistry and optical character recognition. © 2001 John Wiley & Sons, Inc. J Comput Chem 22: 373–386, 2001
8.
Summary: A pattern recognition methodology has been developed for analysis of chromatographic data. The method uses a new class of multidimensional orthogonal polynomials developed by Cohen in conjunction with a supervised learning technique. The method is applicable to any chromatographic data for which classification into two or more categories is desired. The algorithm analyzes both elution times and peak areas. An application is shown for the analysis of organic acids in ascitic fluid obtained from patients with liver disorders. Classification of these patients for presence or absence of bacterial infection shows over ninety percent correct classification.
9.
This paper develops a multi-parturition genetic algorithm (MPGA) to be used in geometrical bounding of the overlapped clusters in a data set for the classification of chemical data. Two new operators have been introduced to modify the conventional genetic algorithm, namely, multi-parturition and decimation and orientated creation, to improve the linear classification results and reduce the computational time. To circumvent the difficulty commonly encountered in the treatment of linearly inseparable chemical data sets, the optimized linear classifier is further modified to provide a complementary nonlinear classifier. For this purpose the space regions of the overlapped clusters have been bounded by erecting half-hyperellipsoids over the linearly misclassified patterns. The proposed MPGA was applied to classify a number of chemical and other data sets with dimensions from 4 to 14. Experimental results indicated that the proposed MPGA could classify seriously overlapped data sets with an acceptable error rate.
10.
Hyperspectral imaging (HSI) is a method for exploring spatial and spectral information associated with the distribution of the different compounds in a chemical or biological sample. Amongst the multivariate image analysis tools utilized to decompose the raw data into a bilinear model, multivariate curve resolution alternating least squares (MCR‐ALS) can be applied to obtain the distribution maps and pure spectra of the components of the sample image. However, a requirement is to have the data in a two‐way matrix. Thus, a preliminary step consists of unfolding the raw HSI data into a single‐pixel direction. Consequently, through this data manipulation, the information regarding pixel neighboring is lost, and spatial information cannot directly be constrained on the component profiles in the current MCR‐ALS algorithm. In this short communication, we propose an adaptation of the MCR‐ALS framework, enabling the potential implementation of any variation of spatial constraint. This can be achieved by adding, at each least‐squares step, refolding/unfolding of the distribution maps for the components. The implementation of segmentation, shape smoothness, and image modeling as spatial constraints is proposed as a proof of concept. Copyright © 2015 John Wiley & Sons, Ltd.
11.
A novel approach for CE data analysis based on pattern recognition techniques in the wavelet domain is presented. Low-resolution, denoised electropherograms are obtained by applying several preprocessing algorithms including denoising, baseline correction, and detection of the region of interest in the wavelet domain. The resultant signals are mapped into character sequences using first derivative information and multilevel peak height quantization. Next, a local alignment algorithm is applied to the coded sequences for peak pattern recognition. We also propose 2-D and 3-D representations of the patterns found, for fast visual evaluation of the variability of chemical substance concentrations in the analyzed samples. The proposed approach is tested on the analysis of intracerebral microdialysate data obtained by CE and LIF detection, achieving a correct detection rate of about 85% with a processing time of less than 0.3 s per 25,000-point electropherogram. Using a local alignment algorithm on low-resolution denoised electropherograms might have a great impact on high-throughput CE, since the proposed methodology will substitute fast automatic pattern recognition for slow, time-consuming, human-based visual pattern recognition.
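The local alignment step on coded sequences can be illustrated with a Smith-Waterman style scoring routine. This is only a sketch: the abstract does not name the specific alignment algorithm or its scoring scheme, so the function name `local_alignment_score` and the `match`, `mismatch` and `gap` values are assumptions, and the input strings stand in for electropherograms that have already been coded into characters.

```python
def local_alignment_score(a, b, match=2, mismatch=-1, gap=-1):
    """Best local alignment score between two coded sequences.

    Smith-Waterman dynamic programming with a linear gap penalty.
    The scoring values are illustrative assumptions.
    """
    previous = [0] * (len(b) + 1)
    best = 0
    for symbol_a in a:
        current = [0]
        for j, symbol_b in enumerate(b, start=1):
            diagonal = previous[j - 1] + (match if symbol_a == symbol_b else mismatch)
            score = max(
                0,                      # start a new local alignment
                diagonal,               # align the two symbols
                previous[j] + gap,      # gap in sequence b
                current[j - 1] + gap,   # gap in sequence a
            )
            current.append(score)
            best = max(best, score)
        previous = current
    return best
```

Because every cell is floored at zero, the score reflects the best-matching sub-pattern wherever it occurs in either sequence, which is what makes local (rather than global) alignment suitable for finding a peak pattern inside a longer trace.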
12.
Yuta Yokoyama Tomoko Kawashima Mayumi Ohkawa Hideo Iwai Satoka Aoyagi 《Surface and interface analysis : SIA》2015,47(4):439-446
Time‐of‐flight secondary ion mass spectrometry (ToF‐SIMS) is a powerful tool for determining surface information of complex systems such as polymers and biological materials. However, the interpretation of ToF‐SIMS raw data is often difficult. Multivariate analysis has become an effective approach for the interpretation of ToF‐SIMS data. Some multivariate analysis methods, such as principal component analysis and multivariate curve resolution, are useful for reducing ToF‐SIMS data consisting of many components to a description with a smaller number of components. In this study, the ToF‐SIMS data of four layers of three polymers were analyzed using these methods. The information acquired with each method was compared in terms of the spatial distribution and identification of the polymers. Moreover, in order to investigate the influence of surface contamination, the ToF‐SIMS data before and after Ar cluster ion beam sputtering were compared. As a result, materials in the multicomponent sample, including unknown contaminants, were distinguished. Copyright © 2014 John Wiley & Sons, Ltd.
13.
Agnieszka Smolinska Lionel Blanchet Lutgarde M.C. Buydens Sybren S. Wijmenga 《Analytica chimica acta》2012
Metabolomics is the discipline in which endogenous and exogenous metabolites are assessed, identified and quantified in different biological samples. Metabolites are crucial components of a biological system and are highly informative about its functional state, owing to their closeness to functional endpoints and to the organism's phenotypes. Nuclear Magnetic Resonance (NMR) spectroscopy, next to Mass Spectrometry (MS), is one of the main metabolomics analytical platforms. Technological developments in NMR spectroscopy have enabled the identification and quantitative measurement of many metabolites in a single biofluid sample in a non-targeted and non-destructive manner. The combination of NMR spectra of biofluids with pattern recognition methods has driven forward the application of metabolomics in biomarker discovery. The importance of metabolomics in diagnostics, e.g. in identifying biomarkers or defining pathological status, has been growing exponentially, as evidenced by the number of published papers. In this review, we describe the developments in data acquisition and multivariate analysis of NMR-based metabolomics data, with particular emphasis on the metabolomics of Cerebrospinal Fluid (CSF) and biomarker discovery in Multiple Sclerosis (MScl).
14.
G. Reich 《Chromatographia》1987,24(1):659-665
Summary: The application of a newly developed peak recognition algorithm is shown. This algorithm is based on the KNN method, one of the pattern recognition methods. It is shown that peaks with an S/N ratio down to one can be safely recognized. This is also possible if the baseline has not only detector noise but also other disturbances, e.g., noise signals generated by a reaction detector. The recognition ability of the algorithm is demonstrated on a standard chromatogram with three different concentrations and two different sampling rates. The improvement over the classical algorithm is demonstrated. Some properties of the algorithm are discussed.
15.
R. A. Shaw J. R. Mansfield S. P. Rempel S. Low-Ying V. V. Kupriyanov H. H. Mantsch 《Journal of Molecular Structure》2000,500(1-3):129-138
While it is now clear that both infrared spectroscopy and spectroscopic imaging can play roles in providing medically relevant information, the raw spectral or imaging measurement seldom reveals directly the property of clinical interest (e.g. Is this tissue cancerous? What is the blood glucose concentration? Is tissue perfusion adequate?). Instead, pattern recognition algorithms, clustering methods, regression, and other theoretical methods provide the means to distill diagnostic information from the original measurements. This article discusses the role of these approaches in the discovery of diagnostically relevant spectral and spatial patterns.
16.
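Study of the subacute biochemical effects of aristolochic acid, a constituent of traditional Chinese medicines, by NMR combined with pattern recognition (cited 3 times: 0 self-citations, 3 citations by others)
Nuclear magnetic resonance (NMR) combined with pattern recognition was used to study the subacute biochemical effects of aristolochic acid (AA). After rats received intraperitoneal injections of AA for 5 consecutive days, 1H NMR spectra of urine collected at different time periods showed significant changes in the concentrations of markers (NMR markers) associated with damage to the renal tubules and renal papillae. Principal component analysis was used to interpret and classify the urinary 1H NMR spectra of rats in the AA group, the control group, and liver and kidney injury model groups induced by sodium chromate, mercuric chloride, dibromoethylamine, hydrazine hydrochloride and α-naphthyl isothiocyanate. Both the changes in peak intensity of the various metabolites in the 1H NMR spectra and the principal component analysis results showed that the kidney injury caused by AA resembles the renal tubule and renal papilla injury models, and that as the administered dose accumulates, the kidney injury becomes more extensive and more severe, leading to irreversible kidney damage. The method can be used for toxicological studies of traditional Chinese medicines.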
17.
Recently we proposed a new variable selection algorithm for classification problems, based on the clustering of variables concept (CLoVA). Here the same concept is applied to a regression problem, and the results are compared with conventional variable selection strategies for PLS. The basic idea behind clustering of variables is that the instrument channels are grouped into different clusters by a clustering algorithm; the spectral data of each cluster are then subjected to PLS regression. Different real data sets (Cargill corn, Biscuit dough, ACE QSAR, Soy, and Tablet) were used to evaluate the influence of the clustering of variables on the prediction performance of PLS. In almost all cases, the statistical parameters, especially the prediction error, show the superiority of CLoVA-PLS over other variable selection strategies. Finally, synergy clustering of variables (sCLoVA-PLS), which uses a combination of clusters, is proposed as an efficient modification of the CLoVA algorithm. The statistical parameters obtained indicate that variable clustering can separate the useful part of the data from the redundant parts, so that a stable model can be built on the informative clusters.
18.
A novel strategy of data analysis for artificial taste and odour systems is presented in this work. It is demonstrated that also using a supervised method in the feature extraction phase enhances the fruit juice classification capability of a sensor array developed at Warsaw University of Technology. A comparison is presented of direct processing (raw data processed by an Artificial Neural Network (ANN), raw data processed by Partial Least Squares-Discriminant Analysis (PLS-DA)) and two-stage processing (Principal Components Analysis (PCA) outputs processed by ANN, PLS-DA outputs processed by ANN). It is shown that a considerable increase in classification capability occurred with the new method proposed by the authors.
19.
Analysis and modeling of spatial data are of considerable interest in many applications. However, the prediction of geographical features from a set of chemical measurements on a set of geographically distinct samples has never been explored. We report a new, tree‐structured hierarchical model for the estimation of geographical location of spatially distributed samples from their chemical measurements. The tree‐structured hierarchical modeling used in this study involves a set of geographic regions stored in a hierarchical tree structure, with each nonterminal node representing a classifier and each terminal node representing a regression model. Once the tree‐structured model is constructed, given a sample with only chemical measurements available, the predicted regional location of the sample is gradually restricted as it is passed through a series of classification steps. The geographic location of the sample can be predicted using a regression model within the terminal subregion. We show that the tree‐structured modeling approach provides reasonable estimates of geographical region and geographic location for surface water samples taken across the entire USA. Further, the location uncertainty, an estimate of a probability that a test sample could be located within a pre‐estimated, joint prediction interval that is much smaller than the terminal subregion, can also be assessed. Copyright © 2014 John Wiley & Sons, Ltd.
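The tree structure described in this abstract, with a classifier at each nonterminal node and a regression model at each terminal node, can be sketched as a small data structure. The class names `Branch` and `Leaf`, the `predict` function, and the use of plain callables for the classifier and regression model are all assumptions made for illustration; the abstract does not specify which classifiers or regression models the authors used.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence, Tuple, Union

Features = Sequence[float]


@dataclass
class Leaf:
    """Terminal subregion holding a regression model (hypothetical)."""
    region: str
    regress: Callable[[Features], Tuple[float, float]]  # -> (lat, lon)


@dataclass
class Branch:
    """Nonterminal region holding a classifier over its subregions."""
    region: str
    classify: Callable[[Features], str]  # -> key into `children`
    children: Dict[str, Union["Branch", Leaf]]


def predict(root: Union[Branch, Leaf],
            x: Features) -> Tuple[List[str], Tuple[float, float]]:
    """Narrow the region step by step, then regress within the leaf."""
    node = root
    path = [node.region]
    while isinstance(node, Branch):
        node = node.children[node.classify(x)]
        path.append(node.region)
    return path, node.regress(x)
```

The returned path records each classification step, so the predicted region at every level of the hierarchy is available alongside the final coordinate estimate.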
20.
Most methods of crystal structure prediction generate many trial structures. Because these may differ in choice of unit cell, it is not always immediately obvious whether or not two such structures are equivalent. A method to answer this question is described for the case where the asymmetric unit contains one molecule in a general position, defined by the rotation and translation of that molecule with respect to some reference geometry. In the comparison of two structures, the rotation needed to transform one orientation into the other is determined first. Then it is checked whether this rotation corresponds to a transformation that is compatible with the imposed space group symmetry. A final test compares the cell lengths, the cell angles, and the molecular centers of gravity after the transformation of one structure into the other. The method is implemented for triclinic, monoclinic, and orthorhombic systems and is found to be very fast in tests on hypothetical crystal structures of acetic acid. © 1997 John Wiley & Sons, Inc. J Comput Chem 18: 1036–1042, 1997