Similar literature
20 similar documents found
1.
2.
The application of supervised pattern recognition methodology is becoming important within chemistry. The aim of the study is to compare the accuracies of classification methods using McNemar’s statistical test. Three qualitative parameters of sugar beet are studied: disease resistance (DR), geographical origin and crop period. Samples are analyzed by near-infrared spectroscopy (NIRS) and by wet chemical analysis (WCA). Firstly, the performances of eight well-known classification methods on NIRS data are compared: Linear Discriminant Analysis (LDA), K-Nearest Neighbors (KNN), Soft Independent Modeling of Class Analogy (SIMCA), Discriminant Partial Least Squares (DPLS), Procrustes Discriminant Analysis (PDA), Classification And Regression Tree (CART), Probabilistic Neural Network (PNN) and Learning Vector Quantization (LVQ) neural network. Across the three data sets, SIMCA, DPLS and PDA have the highest classification accuracies. LDA and KNN are not significantly different. The non-linear neural methods give the least accurate results. The three most accurate methods are linear, non-parametric and based on modeling methods. Secondly, we want to emphasize the power of near-infrared reflectance data for sample discrimination. McNemar’s tests compare classifications developed with WCA or with NIRS data. For two of the three data sets, the classification results are significantly improved by the use of NIRS data.
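
As an illustration of how two classifiers evaluated on the same samples can be compared, the following is a minimal sketch of McNemar's test. It assumes the continuity-corrected chi-square form with one degree of freedom, which may differ from the variant used in the study; the function name, variable names and toy labels are hypothetical and are not data from the paper.

```python
import math


def mcnemar_test(y_true, pred_a, pred_b):
    """Compare two classifiers that were evaluated on the same samples.

    Returns (only_a, only_b, statistic, p_value), where only_a counts the
    samples that only classifier A got right and only_b counts the samples
    that only classifier B got right.
    """
    only_a = only_b = 0
    for truth, guess_a, guess_b in zip(y_true, pred_a, pred_b):
        a_ok = guess_a == truth
        b_ok = guess_b == truth
        if a_ok and not b_ok:
            only_a += 1
        elif b_ok and not a_ok:
            only_b += 1
    discordant = only_a + only_b
    if discordant == 0:
        # The classifiers never disagree in correctness: no evidence of a difference.
        return only_a, only_b, 0.0, 1.0
    # Continuity-corrected McNemar statistic (assumed variant).
    statistic = (abs(only_a - only_b) - 1) ** 2 / discordant
    # Survival function of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return only_a, only_b, statistic, p_value


# Toy labels (hypothetical): classifier A is right on all 10 samples,
# classifier B is wrong on the last 6.
toy_truth = [1] * 10
toy_pred_a = [1] * 10
toy_pred_b = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
toy_result = mcnemar_test(toy_truth, toy_pred_a, toy_pred_b)
print(toy_result)
```

Only the discordant samples (those on which exactly one classifier is correct) enter the statistic, which is why the test suits paired comparisons of this kind.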

3.
This work describes multi-classification based on binary probabilistic discriminant partial least squares (p-DPLS) models, developed with the one-against-one strategy and the winner-takes-all principle. The multi-classification problem is split into binary classification problems with p-DPLS models. The results of these models are combined to obtain the final classification result. The classification criterion uses the specific characteristics of an object (position in the multivariate space and prediction uncertainty) to estimate the reliability of the classification, so that the object is assigned to the class with the highest reliability. This new methodology is tested with the well-known Iris data set and a data set of Italian olive oils. When compared with CART and SIMCA, the proposed method has better average classification performance, besides giving a statistic that evaluates the reliability of classification. For the olive oil set the average percentage of correct classification for the training set was close to 84% with p-DPLS against 75% with CART and 100% with SIMCA, while for the test set the average was close to 94% with p-DPLS as against 50% with CART and 62% with SIMCA.
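
The one-against-one, winner-takes-all idea can be sketched as follows. This is not the p-DPLS reliability criterion from the paper: as a stand-in, each pairwise model is assumed to return a probability for the first class of its pair, and these probabilities are simply summed per class. All names and the one-dimensional toy models are hypothetical.

```python
from itertools import combinations


def one_vs_one_predict(x, classes, binary_models):
    """Combine pairwise binary models with a winner-takes-all rule.

    binary_models maps a pair (class_i, class_j) to a function that returns
    the probability (between 0 and 1) that x belongs to class_i rather than
    class_j.
    """
    scores = {label: 0.0 for label in classes}
    for class_i, class_j in combinations(classes, 2):
        prob_i = binary_models[(class_i, class_j)](x)
        scores[class_i] += prob_i
        scores[class_j] += 1.0 - prob_i
    # Winner takes all: the class with the highest accumulated score.
    winner = max(scores, key=scores.get)
    return winner, scores


# Toy setup (hypothetical): three classes with 1-D centroids; each pairwise
# model votes for whichever of its two centroids is closer to x.
toy_centroids = {"a": 0.0, "b": 5.0, "c": 10.0}
toy_classes = list(toy_centroids)
toy_models = {
    (ci, cj): (lambda x, ci=ci, cj=cj:
               1.0 if abs(x - toy_centroids[ci]) <= abs(x - toy_centroids[cj]) else 0.0)
    for ci, cj in combinations(toy_classes, 2)
}
print(one_vs_one_predict(4.0, toy_classes, toy_models))
```

With K classes this scheme needs K(K-1)/2 binary models, so three classes require three pairwise models.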

4.
A new, rapid analytical method using near-infrared spectroscopy (NIRS) was developed to differentiate two species of Radix puerariae (GG), Pueraria lobata (YG) and Pueraria thomsonii (FG), and to determine the contents of puerarin, daidzin and total isoflavonoid in the samples. Five isoflavonoids, puerarin, daidzin, daidzein, genistin and genistein, were analyzed simultaneously by high-performance liquid chromatography-diode array detection (HPLC-DAD). The total isoflavonoid content was exploited as a critical parameter for successful discrimination of the two species. Scattering effect and baseline shift in the NIR spectra were corrected and the spectral features were enhanced by several pre-processing methods. By using linear discriminant analysis (LDA) and soft independent modeling of class analogy (SIMCA), samples were separated successfully into two different clusters corresponding to the two GG species. Furthermore, sensitivity and specificity of the classification models were determined to evaluate the performance. Finally, partial least squares (PLS) regression was used to build the correlation models. The results showed that the correlation coefficients of the prediction models are R = 0.970 for puerarin, R = 0.939 for daidzin and R = 0.969 for total isoflavonoid. The outcome showed that NIRS can serve as a routine screening tool in the quality control of Chinese herbal medicine (CHM).

5.
This paper proposes a methodology for cigarette classification employing Near Infrared Reflectance spectrometry and variable selection. For this purpose, the Successive Projections Algorithm (SPA) is employed to choose an appropriate subset of wavenumbers for a Linear Discriminant Analysis (LDA) model. The proposed methodology is applied to a set of 210 cigarettes of four different brands. For comparison, Soft Independent Modelling of Class Analogy (SIMCA) is also employed for full-spectrum classification. The resulting SPA-LDA model successfully classified all test samples with respect to their brands using only two wavenumbers (5058 and 4903 cm⁻¹). In contrast, the SIMCA models were not able to achieve 100% classification accuracy, regardless of the significance level adopted for the F-test. The results obtained in this investigation suggest that the proposed methodology is a promising alternative for assessment of cigarette authenticity.

6.
Some Mallotus species are commonly used as traditional medicine (TM) ingredients in Vietnam and China, but only a few have been studied for their activities. In Part I, high-performance liquid chromatography (HPLC) fingerprints of 39 Mallotus samples (17 species) were developed and, because of the complexity of and the large differences between the samples, the unaligned fingerprints were analysed. The peaks potentially responsible for the antioxidant activity in given Mallotus species were indicated by the regression coefficients from an orthogonal projections to latent structures (O-PLS) model. In the present study, an in-depth discussion on the need for alignment of the Mallotus fingerprints for the indication of the potentially active compounds is made, as well as an experimental analysis and identification of the previously indicated peaks by HPLC–mass spectrometry (HPLC–MS). Additionally, to thoroughly study and discuss the alignment problem, the modelling and prediction of the antioxidant activity of green tea samples based on HPLC fingerprints were also considered.

7.
Nobuki Kato 《Tetrahedron》2006,62(31):7307-7318
We report the synthesis of fluorescence-labeled probes based on phyllanthurinolactone 1, which is a leaf-closing substance of Phyllanthus urinaria L. The fluorescence study using biologically active probe 2 and inactive probes (epi-2 and 31) revealed that the target cell for 1 is a motor cell and suggested that some receptors, which recognize the aglycon of 1, exist on the plasma membrane of the motor cell, as with leaf-opening substances. Moreover, binding of probe 2 was specific to the plant motor cell contained in the plants belonging to the genus Phyllanthus. These results showed that the binding of probe 2 with a motor cell is specific to the plant genus and suggested that the genus-specific receptor for the leaf-closing substance would be involved in nyctinasty.

8.
Knowledge of the lipid components of wheat provides valuable information for differentiating between Triticum durum (TD) and Triticum aestivum (TA). The determination of the percentages of methyl esters of the differently unsaturated fatty acids with 18 carbon atoms (C18), of the sterol fraction and of the other components is of particular importance. In this paper, the classification methods of linear discriminant analysis (LDA) and quadratic discriminant analysis (QDA) were applied in order to measure the classification and prediction abilities of the determined (percentages of the) components of the lipid fraction of wheat in differentiating among species, origins, varieties and crops. Using a univariate feature selection method (Fisher weights (FW)) and linear discriminant analysis, it was found that oleate alone is able to distinguish between the two species with a prediction rate of 100%. Within the species Triticum durum, a prediction rate of 83.9% was obtained when discriminating between the different origins, 82.2% when discriminating among varieties and 94.3% among crop years.
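
Univariate feature ranking by Fisher weights can be sketched as below. Definitions of the Fisher weight vary between texts; this sketch assumes the common two-class form, the squared difference of class means divided by the sum of the class sample variances, which may not match the paper's exact formula. The feature names and values are hypothetical toy numbers, not wheat data.

```python
from statistics import mean, variance


def fisher_weight(values_a, values_b):
    """Two-class Fisher weight of a single variable (assumed definition).

    Squared difference of the class means divided by the sum of the
    within-class sample variances; larger means better separation.
    """
    between = (mean(values_a) - mean(values_b)) ** 2
    within = variance(values_a) + variance(values_b)
    return between / within


def rank_features(class_a, class_b):
    """Return feature names sorted from most to least discriminating.

    class_a and class_b map each feature name to the list of values
    measured for the samples of that class.
    """
    weights = {name: fisher_weight(class_a[name], class_b[name]) for name in class_a}
    return sorted(weights, key=weights.get, reverse=True), weights


# Toy data (hypothetical): feature_1 separates the two classes well,
# feature_2 overlaps heavily.
toy_class_a = {"feature_1": [1.0, 1.1, 0.9], "feature_2": [1.0, 2.0, 3.0]}
toy_class_b = {"feature_1": [2.0, 2.1, 1.9], "feature_2": [1.5, 2.5, 3.5]}
toy_ranking, toy_weights = rank_features(toy_class_a, toy_class_b)
print(toy_ranking, toy_weights)
```

A variable ranked first this way would then be the natural single input to try in a discriminant model.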

9.
ASTM clustering for improving coal analysis by near-infrared spectroscopy   (Total citations: 1; self-citations: 0; citations by others: 1)
Andrés JM  Bona MT 《Talanta》2006,70(4):711-719
Multivariate analysis techniques have been applied to near-infrared (NIR) spectra of coals to investigate the relationship between nine coal properties (moisture (%), ash (%), volatile matter (%), fixed carbon (%), heating value (kcal/kg), carbon (%), hydrogen (%), nitrogen (%) and sulphur (%)) and the corresponding predictor variables. In this work, a whole set of coal samples was grouped into six more homogeneous clusters following the ASTM reference method for classification prior to the application of calibration methods to each coal set. The results obtained showed a considerable improvement in the determination error compared with the calibration for the whole sample set. For some groups, the established calibrations approached the quality required by the ASTM/ISO norms for laboratory analysis. To predict property values for a new coal sample, that sample must first be assigned to its respective group. Thus, the discrimination and classification ability of coal samples by Diffuse Reflectance Infrared Fourier Transform Spectroscopy (DRIFTS) in the NIR range was also studied by applying Soft Independent Modelling of Class Analogy (SIMCA) and Linear Discriminant Analysis (LDA) techniques. Modelling of the groups by SIMCA led to overlapping models that cannot discriminate for unique classification. On the other hand, the application of Linear Discriminant Analysis improved the classification of the samples but not enough to be satisfactory for every group considered.

10.
Rhodiola, especially Rhodiola crenulata and Rhodiola rosea, is an increasingly widely used traditional medicine or dietary supplement in Asian and western countries. Because of the phytochemical diversity and the differences in therapeutic efficacy among Rhodiola species, it is crucial to identify them accurately. In this study, a simple and efficient method for the classification of Rhodiola crenulata, Rhodiola rosea, and their confusable species (Rhodiola serrata, Rhodiola yunnanensis, Rhodiola kirilowii and Rhodiola fastigiata) was established by UHPLC fingerprints combined with chemical pattern recognition analysis. The results showed that similarity analysis and principal component analysis (PCA) could not achieve accurate classification among the six Rhodiola species. Linear discriminant analysis (LDA) combined with stepwise feature selection exhibited effective discrimination. Seven characteristic peaks that are responsible for accurate classification were selected, and their distinguishing ability was successfully verified by partial least-squares discriminant analysis (PLS-DA) and orthogonal partial least-squares discriminant analysis (OPLS-DA). Finally, the components of these seven characteristic peaks were identified as 1-(2-Hydroxy-2-methylbutanoate) β-D-glucopyranose, 4-O-glucosyl-p-coumaric acid, salidroside, epigallocatechin, 1,2,3,4,6-pentagalloylglucose, epigallocatechin gallate, and (+)-isolariciresinol-4′-O-β-D-glucopyranoside or (+)-isolariciresinol-4-O-β-D-glucopyranoside, respectively. The results obtained in our study provided useful information for authenticity identification and classification of Rhodiola species.

11.
In this work, a new approach is proposed to verify the differentiating characteristics of five bacteria (Escherichia coli, Enterococcus faecalis, Streptococcus salivarius, Streptococcus oralis, and Staphylococcus aureus) by using digital images obtained with a simple webcam and variable selection by the Successive Projections Algorithm associated with Linear Discriminant Analysis (SPA-LDA). Color histograms in the red–green–blue (RGB), hue-saturation-value (HSV), and grayscale channels and their combinations were used as input data, and statistically evaluated by using different multivariate classifiers (Soft Independent Modeling by Class Analogy (SIMCA), Principal Component Analysis-Linear Discriminant Analysis (PCA-LDA), Partial Least Squares Discriminant Analysis (PLS-DA) and Successive Projections Algorithm-Linear Discriminant Analysis (SPA-LDA)). The bacteria strains were cultivated in a nutritive blood agar base layer for 24 h by following the Brazilian Pharmacopoeia, maintaining the status of cell growth and the nature of nutrient solutions under the same conditions. The best classification result was obtained by using RGB and SPA-LDA, which reached 94% and 100% classification accuracy in the training and test sets, respectively. This result is extremely positive from the viewpoint of routine clinical analyses, because it avoids bacterial identification based on phenotypic identification of the causative organism using Gram staining, culture, and biochemical tests. Therefore, the proposed method presents inherent advantages, promoting a simpler, faster, and low-cost alternative for bacterial identification.
Figure: Summary of the new proposed methodology for bacteria classification by using color histograms and SPA-LDA

12.
Fourier transform infrared spectroscopy (FTIR) is a nondestructive, simple, rapid, and cheap measurement technique for analysis of many multicomponent chemical systems, e.g., detection of adulterants in food samples. In this respect, this study proposes combining FTIR spectroscopy with multivariate classification methods for classification and discrimination of different samples of infant formulas adulterated by melamine and/or cyanuric acid. Different parametric and non-parametric multivariate classification methods including the linear discriminant analysis (LDA), partial least squares-discriminant analysis (PLS-DA), soft independent modeling of class analogy (SIMCA), K-nearest neighbors (KNN), and classification and regression tree (CART) approaches were used to classify the recorded FTIR data. Assessing the performance of the multivariate methods according to their sensitivity, specificity and percentage of correct predictions demonstrated that coupling FTIR spectroscopy with multivariate classification can be applied as a rapid and powerful technique for the simultaneous detection of melamine and cyanuric acid in powdered infant formulas. This combined method is efficient for adulterant concentrations as low as 0.0001 w/w%.
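
The three performance measures named above (sensitivity, specificity and percentage of correct predictions) can be computed from true and predicted labels as in this minimal sketch. The function name, label strings and toy predictions are hypothetical and are not results from the study.

```python
def binary_performance(y_true, y_pred, positive):
    """Sensitivity, specificity and percent correct for one positive class.

    Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP).
    A rate whose denominator is zero is reported as None.
    """
    tp = fn = tn = fp = 0
    for truth, guess in zip(y_true, y_pred):
        if truth == positive:
            if guess == positive:
                tp += 1
            else:
                fn += 1
        else:
            if guess == positive:
                fp += 1
            else:
                tn += 1
    total = tp + fn + tn + fp
    sensitivity = tp / (tp + fn) if (tp + fn) else None
    specificity = tn / (tn + fp) if (tn + fp) else None
    percent_correct = 100.0 * (tp + tn) / total if total else None
    return sensitivity, specificity, percent_correct


# Toy labels (hypothetical): 4 adulterated and 4 pure samples,
# with one adulterated sample missed by the classifier.
toy_true = ["adulterated"] * 4 + ["pure"] * 4
toy_pred = ["adulterated", "adulterated", "adulterated", "pure"] + ["pure"] * 4
toy_metrics = binary_performance(toy_true, toy_pred, positive="adulterated")
print(toy_metrics)
```

Reporting sensitivity and specificity separately matters for adulteration screening, where a missed adulterated sample and a false alarm carry different costs.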

13.
This study compares results obtained with several chemometric methods: SIMCA, PLS2-DA, PLS2-DA with SIMCA, and PLS1-DA in two infrared spectroscopic applications. The results were optimized by selecting spectral ranges containing discriminant information. In the first application, mid-infrared spectra of crude petroleum oils were classified according to their geographical origins. In the second application, near-infrared spectra of French virgin olive oils were classified into five registered designations of origin (RDOs). The PLS-DA discrimination was better than SIMCA in classification performance for both applications. In both cases, the PLS1-DA classifications gave 100% correct results. The difficulties encountered with SIMCA analyses were explained by the criteria of spectral variance: when the ratio between inter-spectral variance and intra-spectral variance was close to the Fc (Fisher criterion) threshold, SIMCA analysis gave poor results. The discrimination power of the variable range selection procedure was estimated from the number of correctly classified samples.

14.
Some Mallotus species are used in traditional medicine in Vietnam and China. Some also show interesting activities, such as antioxidant and cytotoxic ones. Combining fingerprint technology with data-handling techniques allows the peaks potentially responsible for given activities to be indicated. This study aims to indicate, from chromatographic fingerprints, the peaks potentially responsible for the antioxidant activity of several Mallotus species. Relevant information was extracted using linear multivariate calibration techniques, both before and after alignment of the fingerprints with correlation optimized warping (COW). Of the studied techniques, Stepwise Multiple Linear Regression is least recommended as it made an inadequate variable selection. Principal Component Regression theoretically can take largely varying variables uncorrelated to the antioxidant activity into account. However, in the actual case study this problem was limited in practice. These problems in principle do not occur using Partial Least Squares (PLS) models. Of the tested PLS methods, Orthogonal Projections to Latent Structures was preferred because of its simplicity, reproducibility, reduced model complexity and improved interpretability of the regression coefficients, yielding a clearer view of the individual contribution of the compounds. Furthermore, reducing analysis times from 60 min to 35 and 22.5 min resulted in the same main compounds being indicated as responsible for the antioxidant activity. Models built after alignment by COW did not result in additional information.

15.
Inductively coupled plasma-mass spectrometry (ICP-MS) in combination with different supervised chemometric approaches has been used to classify cultivated mussels in Galicia (Northwest of Spain) under the European Protected Designation of Origin (PDO). 158 mussel samples, collected in the five rías on the basis of the production, along with minor and trace elements, including high field strength elements (HFSEs) and rare earth elements (REEs), were used with this aim. The classification of samples was achieved according to their origin: Galician vs. other regions (from Tarragona, Spain, and Étang de Thau, France) and between the Galician Rías. The ability of linear discriminant analysis (LDA), soft independent modelling of class analogy (SIMCA) and artificial neural network (ANN) to classify the samples was investigated. Correct assignations for Galician and non-Galician samples were obtained when LDA and SIMCA were used. ANNs were more effective when a classification according to the ría of origin was to be applied.

16.
In multivariate regression and classification problems, variable selection is an important procedure used to select an optimal subset of variables with the aim of producing more parsimonious and eventually more predictive models. Variable selection is often necessary when dealing with methodologies that produce thousands of variables, such as Quantitative Structure-Activity Relationships (QSARs) and highly dimensional analytical procedures. In this paper a novel method for variable selection for classification purposes is introduced. This method exploits the recently proposed Canonical Measure of Correlation between two sets of variables (CMC index). The CMC index is in this case calculated for two specific sets of variables, the former being comprised of the independent variables and the latter of the unfolded class matrix. The CMC values, calculated by considering one variable at a time, can be sorted to give a ranking of the variables on the basis of their class discrimination capabilities. Alternatively, the CMC index can be calculated for all the possible combinations of variables and the variable subset with the maximal CMC can be selected, but this procedure is computationally more demanding and the classification performance of the selected subset is not always the best one. The effectiveness of the CMC index in selecting variables with discriminative ability was compared with that of other well-known strategies for variable selection, such as the Wilks’ Lambda, the VIP index based on the Partial Least Squares-Discriminant Analysis, and the selection provided by classification trees. A variable Forward Selection based on the CMC index was finally used in conjunction with Linear Discriminant Analysis. This approach was tested on several chemical data sets. The results obtained were encouraging.

17.
We propose a very simple and fast method for detecting Sudan dyes (I, II, III and IV) in commercial spices, based on characterizing samples through their UV-visible spectra and using multivariate classification techniques to establish classification rules. We applied three classification techniques: K-Nearest Neighbour (KNN), Soft Independent Modelling of Class Analogy (SIMCA) and Partial Least Squares Discriminant Analysis (PLS-DA). A total of 27 commercial spice samples (turmeric, curry, hot paprika and mild paprika) were analysed by chromatography (HPLC-DAD) to check that they were free of Sudan dyes. These samples were then spiked with Sudan dyes (I, II, III and IV) up to a concentration of 5 mg L⁻¹. Our final data set consisted of 135 samples distributed in five classes: samples without Sudan dyes, samples spiked with Sudan I, samples spiked with Sudan II, samples spiked with Sudan III and samples spiked with Sudan IV. Classification results were satisfactory with all three techniques: 99.3%, 96.3% and 90.4% correct classification with PLS-DA, KNN and SIMCA, respectively. It should be pointed out that with SIMCA, there are no real classification errors as no samples were assigned to the wrong class: they were just not assigned to any of the pre-defined classes.

18.
Because of its high resolution potential and minimal solvent consumption, pressurized capillary electrochromatography (pCEC) may offer an interesting alternative to HPLC for screening applications that need to resolve complex samples. In this paper, its potential was assessed in a screening of plant extracts from Mallotus species to indicate compounds with possible antioxidant activities by means of a PLS model built from their pCEC fingerprints. The main aim of this research was to find out whether pCEC can have an added value for this application. To get a complete overview of the technique's potential for this application, it was also assessed whether the technique can meet the requirements in terms of precision, sensitivity and column robustness. Encountered benefits and downsides were reported. Fingerprints with satisfactory sensitivity and precision could be obtained by concentrating the sample 5-fold and using optimized rinsing procedures, respectively. From the generated pCEC fingerprints of 39 Mallotus samples and their respective DPPH radical scavenging activity test results, a three-component PLS model was built. The model showed good predictive abilities and easily allowed the indication of possible antioxidant compounds in the fingerprints. Despite its much higher peak capacity, the performance of pCEC in fingerprinting the majority of the Mallotus extracts did not surpass that of a custom HPLC method. This was also reflected in its comparable power to indicate possible antioxidant compounds in the fingerprints after modeling. Because of its low detection sensitivity and modest column robustness, the benefit of the lower solvent consumption was partly offset by the current need for more system maintenance, which also limits the sample throughput. For the considered screening application, pCEC may serve as a viable but not preferred alternative technique.

19.
The composition of volatile components of subcutaneous fat from Iberian pig has been studied using purge and trap gas chromatography−mass spectrometry. The composition of the volatile fraction of subcutaneous fat has been used to authenticate different types of Iberian pig fat. Three types of this product have been considered: montanera, extensive cebo and intensive cebo. For classification purposes, several pattern recognition techniques have been applied. In order to find out possible tendencies in the sample distribution as well as the discriminant power of the variables, principal component analysis was applied as a visualisation technique. Linear discriminant analysis (LDA) and soft independent modelling by class analogy (SIMCA) were used to obtain suitable classification models. LDA and SIMCA allowed the differentiation of three fattening diets by using the contents of 2,2,4,6,6-pentamethyl-heptane, m-xylene, 2,4-dimethyl-heptane, 6-methyl-tridecane, 1-methoxy-2-propanol, isopropyl alcohol, o-xylene, 3-ethyl-2,2-dimethyl-oxirane, 2,6-dimethyl-undecane, 3-methyl-3-pentanol and limonene.

20.
Taking into consideration the global analysis of complex samples proposed by the metabolomic approach, the chromatographic fingerprint offers an attractive chemical characterization of herbal medicines. Thus, it can be used as a tool in quality control analysis of phytomedicines. The generated multivariate data are better evaluated by chemometric analyses, and they can be modeled by classification methods. “Stone breaker” is a popular Brazilian plant of the Phyllanthus genus, used worldwide to treat renal calculus, hepatitis, and many other diseases. In this study, gradient elution under reversed-phase conditions with ultraviolet detection was used to obtain chemical profiles (fingerprints) of botanically identified samples of six Phyllanthus species. The obtained chromatograms, at 275 nm, were organized in data matrices, and the time shifts of peaks were adjusted using the Correlation Optimized Warping algorithm. Principal Component Analyses were performed to evaluate similarities among cultivated and uncultivated samples and the discrimination among the species and, after that, the samples were used to compose three classification models using Soft Independent Modeling of Class Analogy, K-Nearest Neighbor, and Partial Least Squares for Discriminant Analysis. The abilities of the classification models were discussed after their successful application for authenticity evaluation of 25 commercial samples of “stone breaker.”
