Similar literature
20 similar documents found
1.
The antifungal activity of 14 anthracene-based synthetic dyes and 6 reference compounds was measured on 36 fungal strains, and the data matrix was evaluated separately by principal component analysis (PCA) and by a spectral mapping technique (SPM). The dimensionality of the maps of principal component loadings and variables and of the selectivity maps was reduced to two by non-linear mapping. Except for two compounds, the dyes showed marked antifungal activity. The calculations showed that both the strength and the selectivity of the biological effect of anthracene-based dyes depended strongly on the chemical structure of the dye and on the type of fungus. PCA and SPM revealed different aspects of the antifungal activity; their simultaneous application in future quantitative structure–activity relationship studies is therefore highly recommended.

2.
The classification of cancer is a major research topic in bioinformatics. The high dimensionality and small size of gene expression data, however, make the classification quite challenging. Although principal component analysis (PCA) is of particular interest for high-dimensional data, it may overemphasize some aspects and ignore other important information contained in the richly complex data, because it displays only the difference in the first two- or three-dimensional PC subsp...

3.
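Zhang Jin, Jiang Hong, Xu Xuefang. 《分析试验室》, 2022, 41(2): 158-162
A rapid method for identifying Clostridium botulinum based on confocal micro-Raman spectroscopy is proposed. Confocal Raman microscopy (CRM) was used to collect Raman spectra of Clostridium botulinum, Clostridium difficile and Clostridium perfringens, and the mean Raman spectra of the three Clostridium species were compared. After preprocessing by baseline correction, standard normal variate transformation, Savitzky-Golay 5-point smoothing and min-max normalization, principal component analysis (PCA) was used to reduce the dimensionality and extract feature variables, and the samples...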
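The preprocessing chain named in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract gives the 5-point window but not the polynomial order, so the standard quadratic/cubic Savitzky-Golay weights are assumed; baseline correction is omitted because its method is not specified; all function names are hypothetical.

```python
import math

def snv(spectrum):
    """Standard normal variate: centre one spectrum and scale it by its own std."""
    n = len(spectrum)
    mean = sum(spectrum) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in spectrum) / (n - 1))
    return [(v - mean) / std for v in spectrum]

# 5-point Savitzky-Golay smoothing weights for a quadratic/cubic fit
# (assumed order; the abstract only states the window width).
SG5 = (-3 / 35, 12 / 35, 17 / 35, 12 / 35, -3 / 35)

def savgol5(spectrum):
    """Smooth interior points; the two points at each edge are left unchanged."""
    out = list(spectrum)
    for i in range(2, len(spectrum) - 2):
        out[i] = sum(w * spectrum[i + k - 2] for k, w in enumerate(SG5))
    return out

def min_max(spectrum):
    """Rescale one spectrum linearly so that its values span [0, 1]."""
    lo, hi = min(spectrum), max(spectrum)
    return [(v - lo) / (hi - lo) for v in spectrum]

def preprocess(spectrum):
    """SNV, then 5-point smoothing, then min-max normalization."""
    return min_max(savgol5(snv(spectrum)))
```

The preprocessed spectra would then be stacked row-wise into the matrix that PCA decomposes.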

4.
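Near-infrared multicomponent modeling based on principal component analysis and wavelet neural networks   Cited by: 5 (self-citations: 0, citations by others: 5)
After preprocessing of the raw spectra of wheat leaves, principal component analysis (PCA) was used to reduce the dimensionality of the data, and the first three principal components were fed into a wavelet neural network to build a near-infrared multicomponent prediction model (WNN) based on PCA and the wavelet neural network. The effect of the number of wavelet basis functions (the number of WNN hidden-layer nodes) on model performance was further studied, and the WNN model was compared with partial least squares (PLS) and conventional back-propagation neural network (BPNN) models. The results show that the WNN model can simultaneously predict the contents of two components of wheat leaves, total nitrogen and total soluble sugar, with root mean square errors of prediction (RMSEP) of 0.101% and 0.089% and prediction correlation coefficients (R) of 0.980 and 0.967, respectively. In addition, the WNN model clearly outperforms the BPNN and PLS models in convergence speed and prediction accuracy, which lays a foundation for applying wavelet neural networks to multicomponent quantitative analysis of near-infrared spectra.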
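For readers unfamiliar with the two figures of merit quoted here, the sketch below computes them from reference and predicted values. It assumes the usual definitions (RMSEP as the root of the mean squared prediction error, R as the Pearson correlation); the abstract does not spell out its formulas, and the function names are hypothetical.

```python
import math

def rmsep(y_true, y_pred):
    """Root mean square error of prediction over a test set."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)

def pearson_r(y_true, y_pred):
    """Pearson correlation coefficient between reference and predicted values."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)
```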

5.
The application of a new method to the multivariate analysis of incomplete data sets is described. The new method, called maximum likelihood principal component analysis (MLPCA), is analogous to conventional principal component analysis (PCA), but incorporates measurement error variance information in the decomposition of multivariate data. Missing measurements can be handled in a reliable and simple manner by assigning large measurement uncertainties to them. The problem of missing data is pervasive in chemistry, and MLPCA is applied to three sets of experimental data to illustrate its utility. For exploratory data analysis, a data set from the analysis of archeological artifacts is used to show that the principal components extracted by MLPCA retain much of the original information even when a significant number of measurements are missing. Maximum likelihood projections of censored data can often preserve original clusters among the samples and can, through the propagation of error, indicate which samples are likely to be projected erroneously. To demonstrate its utility in modeling applications, MLPCA is also applied in the development of a model for chromatographic retention based on a data set which is only 80% complete. MLPCA can predict missing values and assign error estimates to these points. Finally, the problem of calibration transfer between instruments can be regarded as a missing data problem in which entire spectra are missing on the ‘slave’ instrument. Using NIR spectra obtained from two instruments, it is shown that spectra on the slave instrument can be predicted from a small subset of calibration transfer samples even if a different wavelength range is employed. Concentration prediction errors obtained by this approach were comparable to cross-validation errors obtained for the slave instrument when all spectra were available.
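The idea of handling a missing value by giving it a very large uncertainty can be illustrated with a much simpler stand-in for MLPCA: a weighted rank-one least-squares fit with weights 1/σ². This is only a sketch that assumes independent measurement errors and a single component; it is not the authors' algorithm. The function name and the toy data are hypothetical.

```python
def weighted_rank1(X, sigma, n_iter=50):
    """Fit X[i][j] ~ t[i] * p[j], weighting each residual by 1 / sigma[i][j]**2."""
    rows, cols = len(X), len(X[0])
    W = [[1.0 / sigma[i][j] ** 2 for j in range(cols)] for i in range(rows)]
    p = [1.0] * cols  # arbitrary starting loadings
    t = [0.0] * rows
    for _ in range(n_iter):
        # Update each score with the loadings held fixed (weighted least squares).
        t = [sum(W[i][j] * X[i][j] * p[j] for j in range(cols)) /
             sum(W[i][j] * p[j] ** 2 for j in range(cols)) for i in range(rows)]
        # Update each loading with the scores held fixed.
        p = [sum(W[i][j] * X[i][j] * t[i] for i in range(rows)) /
             sum(W[i][j] * t[i] ** 2 for i in range(rows)) for j in range(cols)]
    return t, p

# Toy rank-one data (outer product of [1, 2, 3] and [2, 1, 3]).
# X[0][0] is treated as missing: 0.0 is only a placeholder value.
X = [[0.0, 1.0, 3.0],
     [4.0, 2.0, 6.0],
     [6.0, 3.0, 9.0]]
# A huge uncertainty on the missing entry makes its weight negligible.
sigma = [[1e6, 1.0, 1.0],
         [1.0, 1.0, 1.0],
         [1.0, 1.0, 1.0]]

t, p = weighted_rank1(X, sigma)
estimate = t[0] * p[0]  # model prediction for the missing entry
```

Because the placeholder carries almost no weight, the fitted model is determined by the measured entries, and `t[0] * p[0]` serves as the prediction for the missing one.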

6.
In some applications of diffuse reflectance spectroscopy there may be substantial variability between the spectra from replicate measurements of what is nominally the same sample. A method called error reduction by orthogonal subtraction (EROS) is proposed to ameliorate the effects of this. The first step is to use principal component analysis (PCA) to identify the structure in the variability of replicate measurements. This is followed by subtraction of the modelled effects from the original spectral data matrix X by projection onto the subspace orthogonal to factors derived from the PCA. An application to the clinical diagnosis of colon lesions is presented, in which pre-treatment of spectra using the proposed method is successful in reducing the complexity and increasing both the accuracy and interpretability of the subsequent classification model. Copyright © 2008 John Wiley & Sons, Ltd.
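The subtraction step described in this abstract amounts to projecting each spectrum onto the orthogonal complement of the replicate-variability factors, i.e. replacing X by X(I − PPᵀ). A minimal sketch follows, assuming the factors P have already been obtained from a PCA of the replicate differences and are orthonormal; the function name is hypothetical.

```python
def project_out(X, P):
    """Remove from each row of X its components along the orthonormal rows of P."""
    corrected = []
    for row in X:
        out = list(row)
        for p in P:
            # Score of the original spectrum on this interference factor.
            score = sum(r * v for r, v in zip(row, p))
            # Subtract the modelled effect of that factor.
            out = [o - score * v for o, v in zip(out, p)]
        corrected.append(out)
    return corrected
```

After the correction every row is orthogonal to every factor in P, so variation along those directions can no longer influence the subsequent classification model.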

7.
The use of the theory of splines to approximate the potential energy surface in molecular dynamics is examined. It is envisaged that such an approximation should be able to capture the potential's behavior accurately and be computationally cost effective, both for one-dimensional and for $n$-dimensional problems with $n$ arbitrary. In this regard, the problem of dimensionality is pinpointed, with shape-preserving splines emerging as a viable alternative for fitting surfaces in multidimensional spaces. An algorithm is also presented that allows the use of non-uniform meshes with high-accuracy fitting and fewer interpolation points.

8.
This paper focuses on the application of principal component analysis (PCA) to facilitate the optimization of the derivatization of oestrogenic steroids (estrone, 17β-estradiol, estriol, 17α-ethinylestradiol and diethylstilbestrol) in order to achieve (1) the complete derivatization of all the hydroxyl groups contained in the structure of the compounds and (2) the greatest effectiveness of this reaction. Six different derivatization reagents were used in this study, and 2-methyl-anthracene was applied as the internal standard to evaluate the effectiveness of the reactions. The experimental data were subjected to PCA. With PCA, the dimensionality of the original multivariable data set could be reduced and the selection of optimum conditions for derivatization facilitated. The mixture of 99% N,O-bis(trimethylsilyl)trifluoroacetamide + 1% trimethylchlorosilane and pyridine (1:1, v/v) at 60 °C for 30 min has been established as the most convenient and efficient means of derivatizing the aforementioned oestrogenic steroids and diethylstilbestrol; the N-methyl-N-(trimethylsilyl)trifluoroacetamide + pyridine (1:1, v/v) mixture seems to be a promising alternative. The application of PCA for optimizing the derivatization procedure, proposed for the first time in this study, is particularly useful in the development of multicomponent methods across several chemical classes of compounds. Copyright © 2011 John Wiley & Sons, Ltd.

9.
Analytical Letters, 2012, 45(7): 713-724
Abstract

Two different sets of data have been subjected to distortion by induced systematic errors of types that are common in analytical chemistry. By means of eigenvector projections and a disjoint principal components analysis it is demonstrated that even gross systematic errors do not significantly influence the classification of the samples.

10.
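To enable a scanner to capture colour information accurately under different illuminants and observers, and to avoid metamerism as far as possible, this paper characterizes the scanner spectrally. Polynomial regression and a BP neural network are each combined with principal component analysis: PCA is first applied to the spectral reflectances of the test samples to extract the principal components and their coefficients; conversion models between the principal component coefficients and the polynomial regression or BP neural network structure are then obtained experimentally, so that the original spectral reflectance is reconstructed from the scanner's low-dimensional RGB signals and the scanner is thereby spectrally characterized. The experimental results show that with a 19-term polynomial the root mean square error is 1.7% for the training samples and 1.9% for the test samples, whereas a single-hidden-layer BP neural network with 15 hidden nodes is a reasonable structure, giving a root mean square error of 1.3% for the training samples and 1.5% for the test samples. For the characterization of colour scanners, polynomial regression yields lower spectral characterization accuracy, while the BP neural network model achieves higher accuracy.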

11.
In this paper, we propose a new method for clustering chemical databases based on the representation of measurements of structural similarity in multidimensional spaces. The proposed method permits the tuning of the clustering process through the selection of the dimension of the projection space, the normal vectors and the sensitivity of the projection process. The structural similarity of each element with respect to the database elements is projected onto the defined spaces, generating clusters that represent the characteristics and diversity of the database and whose size and characteristics can be easily adjusted.

12.
The aim of this work was to study the thermal decomposition of different plant species obtained from energy plantations. Thermogravimetry/mass spectrometry (TG/MS) experiments were performed with two herbaceous crops (Miscanthus sinensis, pelletized energy grass) and two wood samples (willow, water locust) in inert and oxidative atmospheres. Owing to the large amount of data obtained in the experiments, a chemometric tool, principal component analysis (PCA), was used to help interpret the results. It was found that the thermal decomposition of the studied wood species is similar, whereas that of the studied herbaceous samples exhibits significant differences. PCA was found to be useful for finding correlations between the various experimental data.

13.
Analytical Letters, 2012, 45(2): 290-307
Abstract

Distinguishing chemicals and improving analytical methods have a direct impact on modern chemical analysis. In this work, the dissociative ionization of xylene isomers was investigated using a femtosecond laser mass spectrometry (FLMS) method with a custom-built linear time-of-flight (TOF) instrument. Laser beams at 800 nm and 400 nm were used, and intensity-dependent analysis of the obtained mass spectra was performed using principal component analysis (PCA) to distinguish the xylene isomers, whose mass spectra appear identical and cannot be distinguished using normal mass spectrometry methods. The results show that there is a statistically highly significant difference between the xylene isomers for two principal components (1 − α > 99.99%) and that minimal information loss (<5%) took place during the PCA procedure. In addition, the k-medoid clustering method showed that the isomers may be distinguished in real time for a wide range of ionization laser pulse powers with approximately 99% accuracy. The results suggest that real-time isomer analysis by the FLMS method is suitable for mass spectral identification applications. The FLMS method has been shown to be an important alternative to other mass spectrometric methods that use different ionization mechanisms.
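The abstract names k-medoid clustering without saying which variant was used. The sketch below shows one common form, simple alternation between assignment and medoid update, applied to points that could stand for PCA scores. The function names, the `init` parameter and the data layout are hypothetical and are not taken from the paper.

```python
import math

def assign(points, medoids):
    """Label each point with the position of its nearest medoid."""
    return [min(range(len(medoids)),
                key=lambda c: math.dist(pt, points[medoids[c]]))
            for pt in points]

def k_medoids(points, init, n_iter=100):
    """Alternate between assigning points and re-picking each cluster's medoid.

    `init` holds the indices of the starting medoids; the points are assumed
    to be distinct so that no cluster becomes empty.
    """
    medoids = list(init)
    for _ in range(n_iter):
        labels = assign(points, medoids)
        new_medoids = []
        for c in range(len(medoids)):
            members = [i for i, lab in enumerate(labels) if lab == c]
            # The medoid is the member with the smallest total distance to the others.
            new_medoids.append(min(members, key=lambda i: sum(
                math.dist(points[i], points[j]) for j in members)))
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids, assign(points, medoids)
```

Unlike k-means, the cluster centres are always actual observations, which keeps each centre interpretable as a measured spectrum.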

14.
A journey into low-dimensional spaces with autoassociative neural networks   Cited by: 4 (self-citations: 0, citations by others: 4)
Daszykowski M, Walczak B, Massart DL. Talanta, 2003, 59(6): 1095-1105
The compression and visualization of data have always been a subject of great excitement. Since multidimensional data sets are difficult to interpret and visualize, much attention is paid to how to compress them efficiently. Usually, dimensionality compression is considered the first step of exploratory data analysis. Here, we focus on autoassociative neural networks (ANNs), which provide data compression and visualization in a very elegant manner. ANNs can deal with linear and nonlinear correlation among variables, which makes them a very powerful tool in exploratory data analysis. In the literature, ANNs are often referred to as nonlinear principal component analysis (PCA), and owing to their specific structure they are also known as bottleneck neural networks. In this paper, ANNs are discussed in detail. Different training modes are described and illustrated on a real example. The usefulness of ANNs for nonlinear data compression and visualization is demonstrated with chemical data sets. A comparison of ANNs with the well-known PCA is also presented.

15.
16.
Determining the rank of a chemical matrix is the first step in many multivariate, chemometric studies. Rank is defined as the minimum number of linearly independent factors after deletion of factors that contribute to random, nonlinear, uncorrelated errors. Adding a matrix of rank 1 to a data matrix not only increases the rank by one unit but also perturbs the primary factor axes, having little effect on the secondary axes associated with the random errors in the measurements. The primary rank of a data matrix can be determined by comparing the residual variances obtained from principal component analysis (PCA) of the original data matrix to those obtained from an augmented matrix. The ratio of the residual variances between adjacent factor levels represents a Fisher ratio that can be used to distinguish the primary factors (chemical as well as instrumental factors) from the secondary factors (experimental errors). The results gleaned from model studies as well as those from experimental studies are used to illustrate the efficacy of the proposed methodology. The method is independent of the nature of the error distribution. Limitations and precautions are discussed. An algorithm, written in MATLAB format, is included. Copyright © 2011 John Wiley & Sons, Ltd.

17.
ANOVA–simultaneous component analysis (ASCA) is a recently developed tool to analyze multivariate data. In this paper, we enhance the explorative capability of ASCA by introducing a projection of the observations on the principal component subspace to visualize the variation among the measurements. We compare the significance of experimental effects for ASCA and ANOVA–principal component analysis (PCA), a similar tool to explore multivariate data, by using permutation tests. Furthermore, we quantify the quality of the loadings estimate obtained with ASCA and compare this with the loadings estimate obtained with ANOVA–PCA. Copyright © 2011 John Wiley & Sons, Ltd.

18.
Statistical techniques, when applied to data obtained by chemical investigations on ancient artworks, are usually expected to recognize groups of objects in order to classify the archeological finds, to attribute the provenance of items by comparison with earlier investigated ones, or to determine whether an archeological attribution is possible or not. The statistical technique most frequently used in archeometry is principal component analysis (PCA), because of its simplicity in theory and implementation. However, the application of PCA to archeometric data has shown severe limitations because of its linear nature. Indeed, PCA is inadequate for classifying data whose behavior describes a curve or a curved subspace of the original data space. As a consequence, some information is lost when the multi-dimensional data space is compressed into a lower-dimensional subspace spanned by the principal components. The aim of this work is therefore to test a novel statistical technique for archeometry. We propose a nonlinear PCA method to extract the maximum chemical information by plotting data on the smallest number of principal components and to answer archeological questions. The higher accuracy and effectiveness of the nonlinear PCA approach with respect to standard PCA for the analysis of archeometric data are shown through the study of Apulian red-figured pottery (fifth–fourth century BC) coming from some of the most relevant archeological sites of ancient Apulia (Monte Sannace (Gioia del Colle), Egnatia (Fasano), Canosa, Altamura, Conversano, and Arpi (Foggia)). Copyright © 2016 John Wiley & Sons, Ltd.

19.
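A diagnosis-based robust principal component analysis method   Cited by: 1 (self-citations: 0, citations by others: 1)
Classical principal component analysis is easily affected by outliers. Based on the characteristics of the method, this paper proposes a new diagnostic method in which outliers in multivariate data are removed before principal component analysis is carried out, giving an effective robust principal component analysis method. Two sets of real data were processed with this method, and the results were satisfactory.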

20.
Principal component analysis (PCA) was used to extract the number of factors which can describe the 737 gas-liquid partition coefficients of five linear, four branched, and two cyclic alkanes in 67 common solvents. Based on the reconstruction of the partition coefficient data matrix, we concluded that the experimental dataset could readily be reduced to two relevant factors. Using only these two factors, there were no errors larger than 3%, 7 cases had errors larger than 2%, and in 34 cases errors were between 1 and 2%. n-Hexane and ethylcyclohexane were chosen as the test factors, and all other partition coefficients were expressed in terms of these two test factors. Prediction of the logarithmic partition coefficient of these alkanes in seven chemically different solvents, which were originally excluded from the data matrix, was excellent: the root mean square error was 0.064, only in 11 cases were the errors larger than 1%, and only 3 had errors larger than 4%. Linear solvation energy relationships (LSERs) using both theoretical and empirical solvent parameters were used to explain the molecular interactions responsible for partition. Several combinations of parameters were tried, but the standard deviations were not less than 0.31. This could be attributed to the model itself, or to imprecisions in the data matrix or in some of the LSER parameters. Solvent cohesive parameters and surface tension in combination with polarity-polarizability or dispersion parameters perform the best. Finally, the two principal component factors were rotated onto the most relevant physicochemical parameters that control the gas-liquid partitioning phenomena.
