Similar literature
20 similar documents found; search time 15 ms
1.
High-throughput DNA microarrays provide an effective approach to monitoring the expression levels of thousands of genes in a sample simultaneously. One promising application of this technology is the molecular diagnostics of cancer, e.g. to distinguish normal tissue from tumor or to classify tumors into different types or subtypes. One problem arising from the use of microarray data is how to analyze the high-dimensional gene expression data, typically with thousands of variables (genes) and far fewer observations (samples). There is a need to develop reliable classification methods to make full use of microarray data and to evaluate accurately the predictive ability and reliability of such derived models. In this paper, discriminant partial least squares was used to classify different types of human tumors using four microarray datasets and showed good prediction performance. Four different cross-validation procedures (leave-one-out versus leave-half-out; incomplete versus full) were used to evaluate the classification model. Our results indicate that discriminant partial least squares using leave-half-out cross-validation provides a more realistic estimate of the predictive ability of a classification model, which may be overestimated by some of the cross-validation procedures, and that the information obtained from different cross-validation procedures can be used to evaluate the reliability of the classification model.
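The two resampling schemes compared in this abstract can be sketched as follows. This is a minimal illustration rather than the authors' code: the function names, the number of repeats and the random seed are assumptions. As the terms are commonly used, "full" cross-validation would repeat any gene selection inside each training split, whereas "incomplete" cross-validation selects genes once on all samples before splitting.

```python
import random


def leave_one_out_splits(n_samples):
    """Yield (train, test) index lists; each sample is the test set exactly once."""
    for i in range(n_samples):
        train = [j for j in range(n_samples) if j != i]
        yield train, [i]


def leave_half_out_splits(n_samples, n_repeats=10, seed=0):
    """Yield (train, test) index lists with a random half held out in each repeat.

    n_repeats and seed are illustrative defaults, not values from the abstract.
    """
    rng = random.Random(seed)
    indices = list(range(n_samples))
    half = n_samples // 2
    for _ in range(n_repeats):
        shuffled = indices[:]
        rng.shuffle(shuffled)
        # The first half of the shuffled indices is held out, the rest is used for training.
        yield sorted(shuffled[half:]), sorted(shuffled[:half])
```

Leave-half-out trains on far fewer samples per split than leave-one-out, which is one reason it tends to give the more conservative estimate the abstract describes.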

2.
Study of the action of flavonoids on xanthine-oxidase by molecular topology (cited by: 1; self-citations: 0; citations by others: 1)
A study was performed on xanthine-oxidase inhibition by 22 flavonoids, including flavones, flavonols, flavanones, and chalcones, using UV spectroscopy for experimental data and molecular topology to establish the structure-activity relationship (SAR) model. The flavonoids were classified into four groups according to their activity on xanthine-oxidase (inactive, low, significant, or high), and linear discriminant analysis was used to classify each compound within a group. The results led to a very good model, which correctly classified the flavonoids as xanthine-oxidase inhibitors, along with a test set of molecules including a variety of different compounds such as allopurinol, caffeic acid, esculetin, and alloxantin.

3.
European food legislation authorizes the use of certain health claims that rest on scientific evidence. This study aimed to evaluate the fatty acid, tocopherol, and polar phenol composition of virgin olive oil (VOO) from cv. Chondrolia Chalkidikis and Chalkidiki regarding the fulfillment of official requirements for the health claims of ‘oleic acid’, ‘vitamin E’, and ‘olive oil polyphenols’. The examination of representative industrial VOOs from 15 olive mills of the Chalkidiki regional unit showed that oils from the two cultivars contained the necessary concentrations of the responsible bioactive compounds. This evidence was further substantiated by a four-harvest study in which olives at different maturity stages were sampled from three olive groves. Oils were extracted at laboratory scale and examined for their content in the above-mentioned three categories of constituents. Oils produced at industrial scale from olives harvested at the ‘technological optimum’ stage, as judged by the olive grove proprietor, were also analyzed. Extra virgin olive oil of the studied cultivars can safely bear the generic claims for ‘oleic acid’ and ‘vitamin E’. The cultivars present great potential regarding the total hydroxytyrosol and tyrosol content of the extracted oil required to attain the third health claim, which may be influenced negatively by manufacturing practices.

4.
Classical multivariate analysis techniques such as factor analysis and stepwise linear discriminant analysis, and the artificial neural network (ANN) method, have been applied to the classification of Spanish denomination of origin (DO) rosé wines according to their geographical origin. Seventy commercial rosé wines from four different Spanish DOs (Ribera del Duero, Rioja, Valdepeñas and La Mancha) and two successive vintages were studied. Nineteen different variables were measured in these wines. The stepwise linear discriminant analysis (SLDA) model selected 10 variables, obtaining a global percentage of correct classification of 98.8% and of global prediction of 97.3%. The ANN model selected seven variables, five of which were also selected by the SLDA model, and it gave 100% correct classification for both training and prediction. Both models can therefore be considered satisfactory and acceptable, and the selected variables are useful to classify and differentiate these wines by their origin. Furthermore, the causal index analysis gave information that can be easily explained from an enological point of view.

5.
Near-infrared (NIR) transflectance spectra in the region of 1100-2500 nm were measured for 100 Thai fish sauces. Quantitative analyses of total nitrogen (TN) content, pH, refractive index, density and Brix in the Thai fish sauces and their qualitative analyses were carried out by multivariate analyses with the aid of a wavelength interval selection method named searching combination moving window partial least squares (SCMWPLS). The optimized informative region for TN selected by SCMWPLS was the region of 2264-2428 nm. A PLS calibration model that used this region yielded the lowest root mean square error of prediction (RMSEP) of 0.100% w/v with five PLS factors. This prediction result is significantly better than those obtained by using the whole spectral region or informative regions selected by moving window partial least squares regression (MWPLSR). As for pH, density, refractive index and Brix, the 1698-1722 and 2222-2258 nm regions, the 1358-1438 nm region, the 1774-1846 and 2078-2114 nm regions, and the 1322-1442 and 2000-2076 nm regions, respectively, were selected by SCMWPLS as the optimized regions. The best prediction results were always obtained by use of the optimized regions selected by SCMWPLS. The lowest RMSEP values for pH, density, refractive index and Brix were 0.170, 0.007 g cm⁻³, 0.0079 and 0.435 degrees Brix, respectively. Qualitative models were developed by using four supervised pattern recognition methods, linear discriminant analysis (LDA), factor analysis-linear discriminant analysis (FA-LDA), soft independent modeling of class analogy (SIMCA), and K-nearest neighbors (KNN), for the optimized combination of informative regions of the NIR spectra of fish sauces to classify fish sauces into three groups based on TN. All the developed models can potentially classify the fish sauces with a correct classification rate of more than 82%, and the KNN classification model has the highest correct classification rate (95%). The present study has demonstrated that NIR spectroscopy combined with SCMWPLS is powerful for both the quantitative and qualitative analyses of Thai fish sauces.
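The K-nearest neighbors classifier mentioned in this abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions: Euclidean distance, majority voting and k = 3 are choices made here, not details given in the abstract, and the function name is hypothetical.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Return the majority label among the k training points nearest to query.

    Euclidean distance and k=3 are illustrative assumptions.
    """
    # Pair every training point's distance to the query with its label.
    distances = [(math.dist(x, query), label) for x, label in zip(train_x, train_y)]
    distances.sort(key=lambda pair: pair[0])
    # Count the labels of the k closest points and return the most frequent one.
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]
```

In a setting like the one described, each training point would hold the absorbances in the selected wavelength regions and each label would be one of the three TN-based groups.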

6.
The use of chiral amino acid content and stepwise discriminant analysis to classify three types of commercial orange juices (i.e., nectars, orange juices reconstituted from concentrates, and pasteurized orange juices not from concentrates) is presented. Micellar electrokinetic chromatography with laser-induced fluorescence (MEKC-LIF) and beta-cyclodextrins are used to determine L- and D-amino acids previously derivatized with fluorescein isothiocyanate (FITC). This chiral MEKC-LIF procedure is easy to implement and provides information about the main amino acid content of orange juices (i.e., L-proline, L-aspartic acid, D-Asp, L-serine, L-asparagine, L-glutamic acid, D-Glu, L-alanine, L-arginine, D-Arg, and the non-chiral gamma-amino-n-butyric acid (GABA)). From these results, it is clearly demonstrated that some D-amino acids occur naturally in orange juices. Application of stepwise discriminant analysis to 26 standard samples showed that the amino acids L-Arg, L-Asp and GABA were the most important variables to differentiate the three groups of samples. With these three selected amino acids, a 100% correct classification of the samples was obtained by both standard and leave-one-out cross-validation procedures. These classification functions based on the content of L-Arg, L-Asp and GABA were also applied to nine test samples and provided an adequate classification and/or interesting information on these samples. It is concluded that chiral MEKC-LIF analysis of amino acids and stepwise discriminant analysis can be used as a consistent procedure to classify commercial orange juices, providing useful information about their quality and processing. To our knowledge, this is the first report about the combined use of chiral capillary electrophoresis and discriminant techniques to classify foods.

7.
The origin of medieval glass artefacts is studied by using a supervised learning technique, which is shown to be helpful when samples cannot be identified by typical design and appearance. A set of seventy pieces of glass was analyzed for ten trace elements by optical emission spectrography. The data matrix of 33 known objects from five origins was evaluated by multivariate variance and discriminant analysis in a training step. The extracted non-elementary discriminant functions were used to classify the 37 unidentified samples. The classification result is discussed in terms of its cultural/historical information content.

8.
Salvia miltiorrhiza, also known as Danshen, is a widely used traditional Chinese medicine for the treatment of cardiovascular diseases and hematological abnormalities. The root and rhizome of Salvia przewalskii and Salvia yunnanensis have been found as substitutes for Salvia miltiorrhiza in the market. In this study, the chemical information of 14 major compounds in Salvia miltiorrhiza and its substitutes was determined using a high-performance liquid chromatography method. Stepwise discriminant analysis was adopted to select the characteristic variables. Partial least squares discriminant analysis and hierarchical cluster analysis were performed to classify Salvia miltiorrhiza and its substitutes. The results showed that all of the samples were correctly classified by both partial least squares discriminant analysis and hierarchical cluster analysis based on four compounds (caffeic acid, rosmarinic acid, salvianolic acid B, and salvianolic acid A). This method can not only distinguish Salvia miltiorrhiza from its substitutes, but also distinguish Salvia przewalskii from Salvia yunnanensis. The method can be applied to the quality assessment of Salvia miltiorrhiza and the identification of unknown samples.

9.
Raman spectroscopy is recognized as a tool for chemometric analysis of biological materials due to the high information content relating to specific physical and chemical qualities of the sample. Thirty cells belonging to two different prostatic cell lines, PNT1A (immortalized normal prostate cell line) and LNCaP (malignant cell line derived from prostate metastases), were mapped using Raman microscopy. A range of spectral preprocessing methods (partial least-squares discriminant analyses (PLSDAs), principal component analyses (PCAs), and adjacent band ratios (ABRs)) were compared for input into linear discriminant analysis to model and classify the two cell lines. PLSDA and ABR were able to correctly classify 100% of cells into benign and malignant groups, while PLSDA correctly classified a greater proportion of individual spectra. PCA was used to image the distribution of various biochemicals inside each cell and confirm differences in composition/distribution between benign and malignant cell lines. This study has demonstrated that PLSDAs and ABRs of Raman data can identify subtle differences between benign and malignant prostatic cells in vitro.

10.
Fourier transform (FT) Raman spectrometry in combination with partial least squares (PLS) regression was used for direct, reagent-free determination of free fatty acid (FFA) content in olive oils and olives. Oils were directly investigated in a simple flow cell. Milled olives were measured in a dedicated sample cup, which was rotated eccentrically to the horizontal laser beam during spectrum acquisition in order to compensate for sample heterogeneity. Both external and internal (leave-one-out) validation were used to assess the predictive ability of the PLS calibration models for FFA content (in terms of oleic acid) in oil and olives in the range 0.20-6.14 and 0.15-3.79%, respectively. The root mean square error of prediction (RMSEP) was 0.29% for oil and 0.28% for olives. The predicted FFA contents were used to classify oils and olives into different categories according to the European Union regulations. Ninety percent of the oil samples and 80% of the olives were correctly classified. These results demonstrate that the proposed procedures can be used for screening of good quality olives before processing, as well as for the on-line control of the produced oil.
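The RMSEP figure quoted in this abstract follows the usual definition: the square root of the mean squared difference between reference and predicted values over the validation samples. A minimal sketch, with a hypothetical function name and no data from the study:

```python
import math


def rmsep(reference, predicted):
    """Root mean square error of prediction for paired reference/predicted values."""
    if len(reference) != len(predicted) or not reference:
        raise ValueError("reference and predicted must be non-empty and of equal length")
    squared_errors = [(r - p) ** 2 for r, p in zip(reference, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))
```

Because the result is in the same units as the measured property (here % FFA as oleic acid), it can be compared directly with the calibration range.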

11.
Gas chromatography and pattern recognition methods were used to develop a potential method for differentiating European honeybees from Africanized honeybees. The test data consisted of 237 gas chromatograms of hydrocarbon extracts obtained from the wax glands, cuticle, and exocrine glands of European and Africanized honeybees. Each gas chromatogram contained 65 peaks corresponding to a set of standardized retention time windows. A genetic algorithm (GA) for pattern recognition was used to identify features in the gas chromatograms characteristic of the genotype. The pattern recognition GA searched for features in the chromatograms that optimized the separation of the European and Africanized honeybees in a plot of the two or three largest principal components of the data. Because the largest principal components capture the bulk of the variance in the data, the peaks identified by the pattern recognition GA primarily contained information about differences between gas chromatograms of European and Africanized honeybees. The principal component analysis routine embedded in the fitness function of the pattern recognition GA acted as an information filter, significantly reducing the size of the search space since it restricted the search to feature sets whose principal component plots showed clustering on the basis of the bees' genotype. In addition, the algorithm focused on those classes and/or samples that were difficult to classify as it trained using a form of boosting. Samples that consistently classify correctly are not as heavily weighted as samples that are difficult to classify. Over time, the algorithm learns its optimal parameters in a manner similar to a neural network. The pattern recognition GA integrates aspects of artificial intelligence and evolutionary computations to yield a "smart" one-pass procedure for feature selection and classification.

12.
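Application of structural descriptor orthogonalization and canonical correlation analysis to the mass spectral classification of saturated alcohols and ethers; saturated alcohols and ethers; pattern recognition; mass spectral classification variables; block variables; canonical correlation analysis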

13.
Analytical Letters, 2012, 45(12): 2023-2034
Flos Chrysanthemum is a generic name for a particular group of edible plants, which also have medicinal properties. There are, in fact, twenty to thirty different cultivars, which are commonly used in beverages and for medicinal purposes. In this work, four Flos Chrysanthemum cultivars, Hangju, Taiju, Gongju, and Boju, were collected and chromatographic fingerprints were used to distinguish and assess these cultivars for quality control purposes. Chromatographic fingerprints contain chemical information but also often have baseline drifts and peak shifts, which complicate data processing. Adaptive iteratively reweighted penalized least squares and correlation optimized warping were therefore applied to correct the fingerprints. The adjusted data were submitted to unsupervised and supervised pattern recognition methods. Principal component analysis was used to qualitatively differentiate the Flos Chrysanthemum cultivars. Partial least squares, continuum power regression, and K-nearest neighbors were used to predict the unknown samples. Finally, the elliptic joint confidence region method was used to evaluate the prediction ability of these models. The partial least squares and continuum power regression methods were shown to best represent the experimental results.

14.
In this work, a straightforward, reliable and effective automated method has been developed for the direct determination of the monoaromatic volatile BTEXS group (namely benzene, toluene, ethylbenzene, o-, m- and p-xylenes, and styrene) in olives and olive oil, based on the headspace technique. Separation, identification and quantitation were carried out by headspace-gas chromatography-mass spectrometry (HS-GC-MS) in selected ion monitoring (SIM) mode. Apart from milling the olives, neither sample pretreatment nor clean-up was necessary, because the olive and olive oil samples were placed directly into an HS vial, automatically processed by HS and then injected into the GC-MS for chromatographic analysis. The chemical and instrumental variables were optimized using olive and olive oil samples spiked at 50 μg kg⁻¹ of each targeted species. The method was validated to ensure the quality of the results. The precision was satisfactory, with relative standard deviations (RSD) in the range 1.6-5.2% and 10.3-14.2% for olive oil and olives, respectively. Limits of detection were in the range 0.1-7.4 and 0.4-4.4 μg kg⁻¹ for olive oil and olives, respectively. Finally, the proposed method was applied to the analysis of real olive and olive oil samples, which tested positive for the studied compounds, with overall BTEXS concentration levels in the range 23-332 μg kg⁻¹ and 4.2-87 μg kg⁻¹ for olive oil and olives, respectively.

15.
Spanish-style table olives are one of the most common processed foods in the Mediterranean countries. Lack of control during fermentation can lead to one of the main defects of the olive, called ‘Zapateria’, caused by the combination of volatile fatty acids reminiscent of rotten leather. In this study, table olives altered with the ‘Zapateria’ defect were stuffed with a hydrocolloid flavoured with the aroma ‘Mojo picón’ to improve consumer acceptance. Sensory analysis, determination of volatile compounds and an electronic nose (E-nose) were used to evaluate the quality of the olives. The control samples had a high intensity of the ‘Zapateria’ defect and were classified in the second commercial category, while higher ‘Mojo picón’ flavour concentrations resulted in these olives being classified as ‘extra category’ (a masking effect). The main volatile compounds in olives with the ‘Zapateria’ defect were cyclohexanecarboxylic acid and pentanoic acid. The E-nose allowed discrimination between stuffed olives without added flavouring and olives with ‘Mojo picón’ flavouring at different concentrations. Finally, PLS regression allowed a predictive linear model to be established between E-nose and sensory analysis values. The coefficients of determination for prediction (R²P) were 0.74 for perceived defect and 0.86 for perceived aroma. The E-nose was successfully applied for the first time to classify Spanish-style table olives by ‘Zapateria’ defect intensity and by the addition of the ‘Mojo picón’ aroma masking the defect.

16.
In this study, the effective discrimination of extra virgin olive oils is described using HPLC-MS combined with chemometric evaluation. The presented method is simple, since the diluted oil sample is directly injected into the system without any preliminary chemical derivatization or purification step. Separation of diacylglycerols, triacylglycerols and sterols occurs within 20 min and is achieved using an octadecyl-silica column. Detection is performed by positive APCI mass spectrometry, which provided the sensitivity to detect over 50 compounds in the sample. After extraction of data, stepwise discriminant function analysis is used to select the variables with the highest discriminative power. These variables are used to perform linear discriminant analysis and classify/predict the samples. A 100% classification rate and a 99% prediction rate were achieved for olive oils obtained from Nocellara, Biancolilla and Cerasuola cultivars. Reliability of prediction was tested by cross-validation.

17.
A laser-induced fluorescence (LIF) system was optimized using a solution of Micrococcus luteus in ethanol/water 50% (v/v) to obtain spectra in the gas phase of 46 bioaerosols. Experimental designs such as Plackett-Burman and factorial designs were applied. The fluorescence spectra were treated chemometrically by principal component analysis, linear discriminant analysis (LDA) and hierarchical cluster analysis to classify the microorganisms according to family, morphology and Gram stain. The best results were obtained using LDA. The method was applied to air samples, and the LIF results allowed reliable characterization of the bioaerosols. The robustness of the technique was demonstrated by the identification of many bacteria.

18.
Mineral particles are classified into different textural classes according to their size. Reflectance spectrometry and the resulting spectra can be valid instruments for classifying soils according to their texture. This is possible using different statistical methods, for example, discriminant analysis. Other multivariate methods, such as multinomial logistic regression, can also be used, but the presence of multicollinearity among the explanatory variables could affect the estimation of the parameters. The solution proposed to remedy this problem is an alternative way to apply the multinomial logit model. To evaluate its performance, we compare the results with those of both the classical multinomial logit model and discriminant analysis. Copyright © 2015 John Wiley & Sons, Ltd.
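The abstract does not spell out the proposed alternative, so the sketch below only illustrates the classical multinomial logit model it is compared against: each class receives a linear score, and a softmax turns the scores into class probabilities. The function name and the `weights` and `intercepts` arguments are hypothetical, and fitting the coefficients (where multicollinearity causes trouble) is not shown.

```python
import math


def multinomial_logit_probs(features, weights, intercepts):
    """Class probabilities from per-class linear scores via a stable softmax.

    weights holds one coefficient row per class; intercepts holds one value per class.
    """
    scores = [
        b + sum(w * x for w, x in zip(w_row, features))
        for w_row, b in zip(weights, intercepts)
    ]
    # Subtracting the largest score avoids overflow without changing the result.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]
```

A sample would be assigned to the textural class with the largest probability.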

19.
Fourier transform infrared spectroscopy (FTIR) in combination with chemometric analysis was used as a fast and direct approach to classify spray-dried honey powder compositions in terms of honey content, the type of diluent (water or skim milk), and the carrier (maltodextrin or skim milk powder) used for the preparation of feed solutions before spray drying. Eleven variants of honey powders differing in honey content, type of carrier, and diluent were investigated and compared to pure honey and carrier materials. Chemometric discrimination of samples was achieved by principal component analysis (PCA), hierarchical clustering analysis (HCA), linear discriminant analysis (LDA), and partial least squares-discriminant analysis (PLS-DA) modelling procedures performed on the preprocessed FTIR spectral data for the fingerprint region (1800–750 cm⁻¹) and the extended region (3600–750 cm⁻¹). The type of carrier was found to be a significant factor in the classification of different samples of powdered multifloral honey. PCA divided the samples based on the type of carrier, and among the maltodextrin-honey powders it was additionally possible to distinguish the type of diluent. The PCA-LDA and PLS-DA scores yielded a clear separation between four classes of samples and showed very good discrimination between the different honey powders, with a 100.0% correct overall classification rate of the samples.

20.
Fisher's linear discriminant analysis is one of the most commonly used and studied classification methods in chemometrics. The method finds a projection of multivariate data into a lower dimensional space so that the groups in the data are well separated. The resulting projected values are subsequently used to classify unlabeled observations into the groups. A semi-supervised version of Fisher's linear discriminant analysis is developed so that the unlabeled observations are also used in the model-fitting procedure. This approach is advantageous when few labeled and many unlabeled observations are available. The semi-supervised linear discriminant analysis method is demonstrated on a number of data sets where it is shown to yield better separation of the groups and improved classification over Fisher's linear discriminant analysis. Copyright © 2011 John Wiley & Sons, Ltd.
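The projection step described in this abstract can be illustrated with classical (fully supervised) Fisher discriminant analysis for two groups in two dimensions, where the direction is proportional to the inverse pooled within-group scatter matrix applied to the difference of the group means. This sketch does not implement the semi-supervised extension, whose details the abstract does not give; the function names and the restriction to 2-D data are assumptions made to keep the example dependency-free.

```python
def mean_vector(points):
    """Mean of a list of 2-D points."""
    n = len(points)
    return [sum(p[0] for p in points) / n, sum(p[1] for p in points) / n]


def within_scatter(points, mean):
    """2x2 scatter matrix of one group around its own mean."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for p in points:
        d = [p[0] - mean[0], p[1] - mean[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s


def fisher_direction(group_a, group_b):
    """Return w proportional to Sw^{-1} (mean_b - mean_a) for 2-D data."""
    m_a, m_b = mean_vector(group_a), mean_vector(group_b)
    s_a, s_b = within_scatter(group_a, m_a), within_scatter(group_b, m_b)
    sw = [[s_a[i][j] + s_b[i][j] for j in range(2)] for i in range(2)]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    if det == 0:
        raise ValueError("within-group scatter matrix is singular")
    diff = [m_b[0] - m_a[0], m_b[1] - m_a[1]]
    # Apply the closed-form inverse of the 2x2 matrix sw to diff.
    w0 = (sw[1][1] * diff[0] - sw[0][1] * diff[1]) / det
    w1 = (-sw[1][0] * diff[0] + sw[0][0] * diff[1]) / det
    return [w0, w1]


def project(point, w):
    """Project a 2-D point onto the discriminant direction w."""
    return point[0] * w[0] + point[1] * w[1]
```

An unlabeled observation can then be assigned to the group whose projected mean lies closest to its own projected value.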

