
1.

Svante Wold Michael Sjöström Rolf Carlson Torbjörn Lundstedt Sven Hellberg Bert Skagerberg Conny Wikström Jerker Öhman 《Analytica chimica acta》1986

In multivariate data analysis such as principal components analysis (PCA) and projections to latent structures (PLS), it is essential that the training set systems (objects) are selected to provide data with substantial information for model parametrization, and to properly represent any future situations where the multivariate model is used for predictions. In the framework of multivariate projections (PCA, SIMCA and PLS), elementary concepts of statistical design (fractional factorials and composite designs) can be used with the latent variables (PC or PLS scores) as design variables. The plan of action thus becomes: (1) problem formulation (specify aim and model, make a conceptual division of the investigated system into subsystems); (2) collection of multivariate data for each type of subsystem; (3) estimation of the practical dimensionality of the data for each type of subsystem by PC or PLS analysis; (4) use of the PC or PLS scores (t) as design variables in the combination of subsystems to systems in the training set; (5) measurement of responses (Y); (6) analysis of data by PCA or PLS; (7) interpretation of results with possible feedback to steps 1, 2 or 3. The procedures are illustrated by two problems: a structure/activity relationship for a family of peptides, and optimization of an organic synthesis with respect to system variables (solvent, substrate, co-reactant) and process variables (temperature, reactant concentrations).
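Step (4) of the plan above, using scores as design variables, can be illustrated with a minimal sketch. It assumes a two-level full factorial design (the abstract names fractional factorials and composite designs, which are not implemented here), and the function names and score values are hypothetical, not taken from the paper.

```python
from itertools import product
from math import dist

def scale_to_unit(scores):
    """Rescale each score column linearly so its minimum maps to -1 and its maximum to +1.
    Assumes no column is constant (otherwise the range would be zero)."""
    cols = list(zip(*scores.values()))
    lo = [min(c) for c in cols]
    hi = [max(c) for c in cols]
    return {
        name: tuple(2 * (v - l) / (h - l) - 1 for v, l, h in zip(row, lo, hi))
        for name, row in scores.items()
    }

def select_by_factorial_design(scores):
    """For each corner of a two-level full factorial design in score space,
    pick the not-yet-chosen object that lies closest to that corner."""
    scaled = scale_to_unit(scores)
    k = len(next(iter(scaled.values())))
    chosen = []
    for corner in product((-1.0, 1.0), repeat=k):
        candidates = [n for n in scaled if n not in chosen]
        chosen.append(min(candidates, key=lambda n: dist(scaled[n], corner)))
    return chosen

# Hypothetical two-component scores (t1, t2) for six candidate objects.
scores = {
    "A": (-2.1, -1.0),
    "B": (-1.9, 1.2),
    "C": (0.1, 0.0),
    "D": (2.0, -0.9),
    "E": (2.2, 1.1),
    "F": (0.3, 0.4),
}
selected = select_by_factorial_design(scores)
print(selected)
```

With these illustrative scores the four corner objects A, B, D and E are selected, while the near-centre objects C and F are left out; a composite design would typically add such centre points.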

2.

Application of multivariate data analysis methods to Comparative Molecular Field Analysis (CoMFA) data: Proton affinities and pKa prediction for nucleic acids components

Gargallo R Sotriffer CA Liedl KR Rode BM 《Journal of computer-aided molecular design》1999,13(6):611-623

Multivariate data analysis methods (Principal Component Analysis (PCA) and Partial Least Squares (PLS)) are applied to the analysis of the CoMFA (Comparative Molecular Field Analysis) data for several nucleic acid components. The data set includes nitrogenated bases, nucleosides, linear nucleotides, 3′,5′-cyclic nucleotides and oligonucleotides. PCA is applied to study the structure of the CoMFA data and to detect possible outliers in the data set. PLS is applied to correlate the CoMFA data with either calculated AM1 proton affinities or experimental pKa values. The possibility of predicting pKa values for polynucleotides directly from the 3D structures of the monomers is also shown. The influence of the superposition criteria and of conformational changes along the glycosidic bond on the pKa prediction is studied as well.

3.

Partial least squares modeling of combined infrared, ¹H NMR and ¹³C NMR spectra to predict long residue properties of crude oils

Peter de Peinder Tom Visser Derek D. Petrauskas Fabien Salvatori Fouad Soulimani Bert M. Weckhuysen 《Vibrational Spectroscopy》2009,51(2):205-212

Research has been carried out to determine the potential of partial least squares (PLS) modeling of mid-infrared (IR) spectra of crude oils combined with the corresponding ¹H and ¹³C nuclear magnetic resonance (NMR) data, to predict the long residue (LR) properties of these substances. The study elaborates further on a recently developed and patented method to predict this type of information from IR spectra alone. In the present study, PLS modeling was carried out for 7 different LR properties, i.e., yield long-on-crude (YLC), density (D_LR), viscosity (V_LR), sulfur content (S), pour point (PP), asphaltenes (Asph) and carbon residue (CR). Research was based on the spectra of 48 crude oil samples, of which 28 were used to build the PLS models and the remaining 20 for validation. For each property, PLS modeling was carried out on single type IR, ¹³C NMR and ¹H NMR spectra and on 3 sets of merged spectra, i.e., IR + ¹H NMR, IR + ¹³C NMR and IR + ¹H NMR + ¹³C NMR. The merged spectra were created by considering the NMR data as a scaled extension of the IR spectral region. In addition, PLS modeling of coupled spectra was performed after a Principal Component Analysis (PCA) of the IR, ¹³C NMR and ¹H NMR calibration sets. For these models, the 10 most relevant PCA scores of each set were concatenated and scaled prior to PLS modeling. The validation results of the individual IR models, expressed as root-mean-square-error-of-prediction (RMSEP) values, turned out to be slightly better than those obtained for the models using single input ¹³C NMR or ¹H NMR data. For the models based on IR spectra combined with NMR data, a significant improvement of the RMSEP values was observed neither for the models based on merged spectra nor for those based on the PCA scores. This implies that the commonly accepted complementary character of NMR and IR is, at least for the crude oil and bitumen samples under study, not reflected in the results of PLS modeling. Considering these results, the absence of sample preparation and the straightforward way of data acquisition, IR spectroscopy is preferred over NMR for the on-site prediction of LR properties of crude oils.
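The RMSEP figure of merit used in the abstract above can be sketched as follows. This is a generic illustration of the standard definition (square root of the mean squared difference between predicted and reference values on a validation set); the function name and the sample values are hypothetical and are not data from the study.

```python
from math import sqrt

def rmsep(observed, predicted):
    """Root-mean-square error of prediction over a validation set."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("need two non-empty sequences of equal length")
    squared = [(p - o) ** 2 for o, p in zip(observed, predicted)]
    return sqrt(sum(squared) / len(squared))

# Hypothetical reference and predicted values for four validation samples.
observed = [10.0, 12.0, 9.0, 11.0]
predicted = [11.0, 11.0, 10.0, 10.0]
value = rmsep(observed, predicted)
print(value)
```

Because RMSEP is expressed in the units of the predicted property, values for different properties (for example density and sulfur content) are not directly comparable with each other.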

4.

Two-dimensional quantitative structure–activity relationship study of 1,4-naphthoquinone derivatives tested against HL-60 human promyelocytic leukaemia cells

M. C. A. Costa M. M. C. Ferreira 《SAR and QSAR in environmental research》2017,28(4):325-339


5.

Towards a complete identification of orthogonal variation in multiple regression from a PLS1 modeling point of view: including OPLS by a change of orthogonal basis

Ulf G. Indahl 《Journal of Chemometrics》2014,28(6):508-517

It is well known that the predictions of orthogonal projections to latent structures (OPLS) and of partial least squares regression (PLS1) are identical in the single‐response case. The present paper presents an approach to identification of the complete y‐orthogonal structure by starting from the viewpoint of standard PLS1 regression. Three alternative non‐deflating OPLS algorithms and a modified principal component analysis (PCA)‐driven method (including MATLAB code) are presented. The first algorithm implements a postprocessing routine of the standard PLS1 solution where QR factorization applied to a shifted version of the non‐orthogonal scores is the key to expressing the OPLS solution. The second algorithm finds the OPLS model directly by an iterative procedure. By a rigorous mathematical argument, we explain that orthogonal filtering is a ‘built‐in’ property of the traditional PLS1 regression coefficients. Consequently, the capabilities of OPLS with respect to improving the predictions (also for new samples) compared with PLS1 are non‐existent. The PCA‐driven method is based on the fact that truncating off one dimension from the row subspace of X results in a matrix X_orth with y‐orthogonal columns and a rank of one less than the rank of X. The desired truncation corresponds exactly to the first X deflation step of Martens' non‐orthogonal PLS algorithm. The significant y‐orthogonal structure of X found by PCA of X_orth is split into two fundamental parts: one part that is significantly contributing to correct the first PLS score toward y and one part that is not. The third and final OPLS algorithm presented is a modification of Martens' non‐orthogonal algorithm into an efficient dual PLS1–OPLS algorithm. Copyright © 2014 John Wiley & Sons, Ltd.

6.

A Comparative QSRR Study on Enantioseparation of Ethanol Ester Enantiomers in HPLC Using Multivariate Image Analysis, Quantum Mechanical and Structural Descriptors

《Journal of the Chinese Chemical Society》2017,64(2):176-187


7.

Quantitative structure-retention relationship study of α-, β₁-, and β₂-agonists using multiple linear regression and partial least-squares procedures

A.G. Fragkaki C.G. Georgakopoulos 《Analytica chimica acta》2004,512(1):165-171


8.

Potential applications of functional data analysis in chemometrics

Wouter Saeys Bart De Ketelaere Paul Darius 《Journal of Chemometrics》2008,22(5):335-344

In spectroscopy the measured spectra are typically plotted as a function of the wavelength (or wavenumber), but analysed with multivariate data analysis techniques (multiple linear regression (MLR), principal components regression (PCR), partial least squares (PLS)) which consider the spectrum as a set of m different variables. From a physical point of view it could be more informative to describe the spectrum as a function rather than as a set of points, thereby taking into account the physical background of the spectrum, being a sum of absorption peaks for the different chemical components, where the absorbances at two wavelengths close to each other are highly correlated. In a first part of this contribution, a motivating example for this functional approach is given. In a second part, the potential of functional data analysis is discussed in the field of chemometrics and compared to the ubiquitous PLS regression technique using two practical data sets. It is shown that for spectral data, B-splines prove to be an appealing basis to accurately describe the data. By applying both functional data analysis and PLS on the data sets, the predictive ability of functional data analysis is found to be comparable to that of PLS. Moreover, many chemometric datasets have some specific structure (e.g. replicate measurements on the same object, or objects that are grouped), but the structure is often removed before analysis (e.g. by averaging the replicates). In the second part of this contribution, we suggest a method to adapt traditional analysis of variance (ANOVA) methods to datasets with spectroscopic data. In particular, the possibilities to explore and interpret sources of variation, such as variations in sample and ambient temperature, are examined. Copyright © 2008 John Wiley & Sons, Ltd.

9.

Partial least squares modeling of combined infrared, ¹H NMR and ¹³C NMR spectra to predict long residue properties of crude oils

《Vibrational Spectroscopy》2010,52(2):205-212


10.

A partial least squares regression study with antioxidant flavonoid compounds

Karen C. Weber Káthia M. Honório Aline T. Bruni Adriano D. Andricopulo Albérico B. F. da Silva 《Structural chemistry》2006,17(3):307-313

The quantitative structure-activity relationship of a set of 19 flavonoid compounds presenting antioxidant activity was studied by means of PLS (Partial Least Squares) regression. The optimization of the structures and calculation of electronic properties were done using the semiempirical method AM1. A reliable model (r² = 0.806 and q² = 0.730) was obtained, and from this model it was possible to consider some aspects of the structure of the flavonoid compounds studied that are related to their free radical scavenging ability. The quality of the PLS model obtained in this work indicates that it can be used to design new flavonoid compounds able to scavenge free radicals.
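For readers unfamiliar with the two statistics quoted above, the conventional definitions are sketched below. This assumes the usual convention that r² measures fit on the calibration data and q² comes from leave-one-out cross-validation; the abstract itself does not state which cross-validation scheme was used. Here y_i is the measured activity of compound i, ŷ_i the fitted value, ŷ_(i) the prediction for compound i from a model built without it, ȳ the mean activity, and n = 19.

```latex
r^2 = 1 - \frac{\sum_{i=1}^{n} \left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{n} \left(y_i - \bar{y}\right)^2},
\qquad
q^2 = 1 - \frac{\sum_{i=1}^{n} \left(y_i - \hat{y}_{(i)}\right)^2}{\sum_{i=1}^{n} \left(y_i - \bar{y}\right)^2}
```

Under these definitions q² cannot exceed r² for the same model in ordinary practice, and a small gap between the two (here 0.806 versus 0.730) is commonly read as a sign that the model is not badly overfitted.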

11.

Chemometric estimation of the RP TLC retention behaviour of some estrane derivatives by using multivariate regression methods

Strahinja Z. Kovačević Lidija R. Jevrić Sanja O. Podunavac Kuzmanović Eva S. Lončar 《Central European Journal of Chemistry》2013,11(12):2031-2039


12.

Application of Phytochemical and Elemental Profiling, Chemometric Multivariate Analyses, and Biological Activities for Characterization and Discrimination of Fruits of Four Garcinia Species

《Analytical letters》2012,45(1):122-139


Species of Garcinia (Guttiferae) are used for flavoring curries, as a supplement, and to treat various diseases. This study describes the comparison and discrimination of Garcinia cambogia, Garcinia indica, Garcinia mangostana and Garcinia atroviridis fruits by analyzing their major phytochemicals, elemental content, and antioxidant, antidiabetic, and anticholinesterase enzyme activities. For phytochemical and elemental profiling, ultraviolet (UV), near infrared/infrared (NIR/IR), inductively coupled plasma-optical emission spectroscopy (ICP-OES) and ICP-mass spectrometric (ICP-MS) techniques were used. The chemometric multivariate tests of linear discriminant and principal component analyses (LDA, PCA) were used to discriminate the subject fruit samples. Spectroscopic data showed resonances of phenolic and flavonoid constituents present in the fruits. G. mangostana exhibited the highest phenolics (721.6 to 2815.3 µM GAE/g), whereas G. cambogia was rich in flavonoids (51.9 to 2709.2 µM QE/g). Anthocyanin (cyanidin-3-O-glucoside), evaluated by high performance liquid chromatography, was 9.01 mg/kg in G. mangostana fruit. In the analyzed fruits, Ca, K and Na were high, trace essential elements were at appreciable contents, whereas the toxic elements As, Cd, Tl, and Pb were within the safe limits. G. mangostana contained potent free radical and cholinesterase enzyme inhibitors, whereas G. cambogia inhibited α-amylase enzyme more significantly. PCA and LDA discriminated the fruit samples with distinct classification and variability indices. The analyzed fruits were shown to be good sources of free radical, cholinesterase, and α-amylase enzyme inhibitors and of mineral and essential elements, and to be safe for human consumption.

13.

Sample classification for improved performance of PLS models applied to the quality control of deep-frying oils of different botanic origins analyzed using ATR-FTIR spectroscopy (cited by: 1; self-citations: 0; citations by others: 1)

Kuligowski J Carrión D Quintás G Garrigues S de la Guardia M 《Analytical and bioanalytical chemistry》2011,399(3):1305-1314

The selection of an appropriate calibration set is a critical step in multivariate method development. In this work, the effect on partial least squares (PLS) regression model performance of using different calibration sets, based on a previous classification of unknown samples, is discussed. As an example, attenuated total reflection (ATR) mid-infrared spectra of deep-fried vegetable oil samples from three botanical origins (olive, sunflower, and corn oil), with increasing polymerized triacylglyceride (PTG) content induced by a deep-frying process, were employed. The use of a one-class-classifier partial least squares-discriminant analysis (PLS-DA) and a rooted binary directed acyclic graph tree provided accurate oil classification. Oil samples fried without foodstuff could be classified correctly, independent of their PTG content. However, class separation of oil samples fried with foodstuff was less evident. Double-cross model validation combined with permutation testing was used to validate the obtained PLS-DA classification models, confirming the results. To discuss the usefulness of selecting an appropriate PLS calibration set, the PTG content was determined by calculating a PLS model based on the previously selected classes. In comparison to a PLS model calculated using a pooled calibration set containing samples from all classes, the root mean square error of prediction could be improved significantly using PLS models based on the calibration sets selected using PLS-DA, ranging between 1.06 and 2.91% (w/w).

14.

QSAR with quantum topological molecular similarity indices: toxicity of aromatic aldehydes to Tetrahymena pyriformis

S. Kar A.P. Harding P.L.A. Popelier 《SAR and QSAR in environmental research》2013,24(1-2):149-168


15.

Chemometric analysis of groundwater quality data of alluvial aquifer of Gangetic plain, North India (cited by: 10; self-citations: 0; citations by others: 10)

Kunwar P. Singh Amrita Malik Vinod K. Singh Dinesh Mohan Sarita Sinha 《Analytica chimica acta》2005,550(1-2):82-91

A water quality data set from the alluvial region in the Gangetic plain in northern India, which is known for high fluoride levels in soil and groundwater, has been analysed by chemometric techniques, such as principal component analysis (PCA), discriminant analysis (DA) and partial least squares (PLS), in order to investigate the compositional differences between surface and groundwater samples, spatial variations in groundwater composition and the influence of natural and anthropogenic factors. Trilinear plots of major ions showed that the groundwater in this region is mainly of Na/K-bicarbonate type. PCA performed on the complete data matrix yielded six significant PCs explaining 65% of the data variance. Although PCA rendered considerable data reduction, it could not clearly group and distinguish the sample types (dug well, hand pump and surface water). However, a visible differentiation between the water samples pertaining to two watersheds (Khar and Loni) was obtained. DA identified six discriminating variables between surface and groundwater and also between different types of samples (dug well, hand pump and surface water). Distinct grouping of the surface and groundwater samples was achieved using the PLS technique. It further showed that the groundwater samples are dominated by variables having origin both in natural and anthropogenic sources in the region, whereas variables of industrial origin dominate the surface water samples. It also suggested that the groundwater sources are contaminated with various industrial contaminants in the region.

16.

Determination of biodiesel content when blended with mineral diesel fuel using infrared spectroscopy and multivariate calibration

Maria Fernanda Pimentel Grece M.G.S. Ribeiro Rosenira S. da Cruz Luiz Stragevitch José Geraldo A. Pacheco Filho Leonardo S.G. Teixeira 《Microchemical Journal》2006,82(2):201-206

In this work, multivariate calibration models based on middle- and near-infrared spectroscopy were developed in order to determine the content of biodiesel in diesel fuel blends, considering the presence of raw vegetable oil. Soybean, castor and used frying oils and their corresponding esters were used to prepare the blends with conventional diesel. Results indicated that partial least squares (PLS) models based on mid-infrared (MID) or near-infrared (NIR) spectra proved suitable as practical analytical methods for predicting biodiesel content in conventional diesel blends in the volume fraction range from 0% to 5%. PLS models were validated with an independent prediction set and the RMSEPs were estimated as 0.25 and 0.18 (%, v/v). Linear correlations were observed in plots of predicted vs. observed values, with correlation coefficients (R) of 0.986 and 0.994 for the MID and NIR models, respectively. Additionally, principal component analysis (PCA) in the MID region 1700 to 1800 cm⁻¹ was suitable for identifying raw vegetable oil contaminations and illegal blends of petrodiesel containing the raw vegetable oil instead of ester.

17.

An optimization‐based undeflated PLS (OUPLS) method to handle missing data in the training set

Eranda Harinath Puwakkatiya‐Kankanamage Salvador García‐Muñoz Lorenz T. Biegler 《Journal of Chemometrics》2014,28(7):575-584

Advances in sensory systems have led to many industrial applications with large amounts of highly correlated data, particularly in chemical and pharmaceutical processes. With these correlated data sets, it becomes important to consider advanced modeling approaches built to deal with correlated inputs in order to understand the underlying sources of variability and how this variability will affect the final quality of the product. In addition to the correlated nature of the data sets, it is also common to find missing elements and noise in these data matrices. Latent variable regression methods such as partial least squares or projection to latent structures (PLS) have gained much attention in industry for their ability to handle ill‐conditioned matrices with missing elements. This feature of the PLS method is accomplished through the nonlinear iterative PLS (NIPALS) algorithm, with a simple modification to consider the missing data. Moreover, in expectation maximization PLS (EM‐PLS), imputed values are provided for missing data elements as initial estimates, conventional PLS is then applied to update these elements, and the process iterates to convergence. This study is the extension of previous work for principal component analysis (PCA), where we introduced nonlinear programming (NLP) as a means to estimate the parameters of the PCA model. Here, we focus on the parameters of a PLS model. As an alternative to modified NIPALS and EM‐PLS, this paper presents an efficient NLP‐based technique to find model parameters for PLS, where the desired properties of the parameters can be explicitly posed as constraints in the optimization problem of the proposed algorithm. We also present a number of simulation studies, where we compare the effectiveness of the proposed algorithm with competing algorithms. Copyright © 2014 John Wiley & Sons, Ltd.
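The "simple modification" of NIPALS for missing data mentioned above is commonly implemented by running each of the algorithm's internal regressions over the observed entries only. The sketch below illustrates that idea for the first component of a single-response PLS model (PLS1); it is not the OUPLS method of the paper, and it covers neither EM‐PLS nor deflation for further components. The function name, the use of `None` for missing entries, and the data are assumptions made for illustration.

```python
from math import sqrt

def pls1_first_component(X, y):
    """First PLS1 component via NIPALS-style regressions that skip missing
    X entries (None). X is a list of rows, y a list of responses; both are
    assumed to be mean-centered, and no row or column is entirely missing."""
    n, m = len(X), len(X[0])
    # Weights: regress each X column on y, using only rows where x is observed.
    w = []
    for j in range(m):
        rows = [i for i in range(n) if X[i][j] is not None]
        w.append(sum(X[i][j] * y[i] for i in rows) / sum(y[i] ** 2 for i in rows))
    norm = sqrt(sum(v * v for v in w))
    w = [v / norm for v in w]
    # Scores: regress each X row on w, using only columns that are observed.
    t = []
    for i in range(n):
        cols = [j for j in range(m) if X[i][j] is not None]
        t.append(sum(X[i][j] * w[j] for j in cols) / sum(w[j] ** 2 for j in cols))
    # Inner relation: regression coefficient of y on the scores.
    q = sum(ti * yi for ti, yi in zip(t, y)) / sum(ti * ti for ti in t)
    return w, t, q

# Hypothetical noise-free rank-one data (each row is y_i * (1, 2)) with one missing entry.
y = [-2.0, -1.0, 1.0, 2.0]
X = [[-2.0, -4.0], [-1.0, None], [1.0, 2.0], [2.0, 4.0]]
w, t, q = pls1_first_component(X, y)
fitted = [q * ti for ti in t]
print(fitted)
```

In this noise-free example the fitted values reproduce y despite the missing entry; with noisy data and many missing elements the skipped-entry regressions become less reliable, which is part of the motivation for alternatives such as EM‐PLS and the optimization-based approach described in the abstract.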

18.

Chemometric analysis for optimizing derivatization in gas chromatography‐based procedures

Jolanta Kumirska Natalia Migowska Magda Caban Alina Plenis Piotr Stepnowski 《Journal of Chemometrics》2011,25(12):636-643

This paper focuses on the application of principal component analysis (PCA) to facilitate the optimization of the derivatization of oestrogenic steroids (estrone, 17β‐estradiol, estriol, 17α‐ethinylestradiol and diethylstilbestrol) in order to achieve (1) the complete derivatization of all the hydroxyl groups contained in the structure of the compounds and (2) the greatest effectiveness of this reaction. Six different derivatization reagents were used in this study, whereas 2‐methyl‐anthracene was applied as the internal standard to evaluate the effectiveness of the reactions. The experimental data were subjected to PCA. With PCA, the dimensionality of the original multivariable data set could be reduced and the selection of optimum conditions for derivatization facilitated. The mixture of 99% N,O‐bis(trimethylsilyl)trifluoroacetamide + 1% trimethylchlorosilane and pyridine (1:1, v/v) at 60 °C for 30 min has been established as the most convenient and efficient means of derivatizing the aforementioned oestrogenic steroids and diethylstilbestrol; the N‐methyl‐N‐(trimethylsilyl)trifluoroacetamide + pyridine (1:1, v/v) mixture seems to be a promising alternative. The application of PCA for optimizing the derivatization procedure, proposed for the first time in this study, is particularly useful in the development of multicomponent methods across several chemical classes of compounds. Copyright © 2011 John Wiley & Sons, Ltd.

19.

Vibrational overtone combination spectroscopy (VOCSY)—a new way of using IR and NIR data

Alm E Bro R Engelsen SB Karlberg B Torgrip RJ 《Analytical and bioanalytical chemistry》2007,388(1):179-188

This work explores a novel method for rearranging 1st order (one-way) infra-red (IR) and/or near infra-red (NIR) ordinary spectra into a representation suitable for multi-way modelling and analysis. The method is based on the fact that the fundamental IR absorption and the first, second, and consecutive overtones of NIR absorptions represent identical chemical information. It is therefore possible to rearrange these overtone regions of the vectors comprising an IR and NIR spectrum into a matrix where the fundamental, 1st, 2nd, and consecutive overtones of the spectrum are arranged as either rows or columns in a matrix, resulting in a true three-way tensor of data for several samples. This tensorization facilitates explorative analysis and modelling with multi-way methods, for example parallel factor analysis (PARAFAC), N-way partial least squares (N-PLS), and Tucker models. The vibrational overtone combination spectroscopy (VOCSY) arrangement is shown to benefit from the “order advantage”, producing more robust, stable, and interpretable models than, for example, the traditional PLS modelling method. The proposed method also opens the field of NIR for true peak decomposition, a feature unique to the method because the latent factors acquired using PARAFAC can represent pure spectral components whereas latent factors in principal component analysis (PCA) and PLS usually do not.
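The rearrangement described above can be sketched roughly as follows. This sketch assumes a purely harmonic picture in which the band of order k appears at k times the fundamental wavenumber; real overtones are shifted by anharmonicity, and the paper's actual mapping between regions is not given in the abstract. The function names and the synthetic spectrum are hypothetical.

```python
from bisect import bisect_left

def interpolate(axis, values, x):
    """Linear interpolation of values sampled on a strictly increasing axis."""
    if not axis[0] <= x <= axis[-1]:
        raise ValueError("x lies outside the sampled axis")
    hi = bisect_left(axis, x)
    if axis[hi] == x:
        return values[hi]
    lo = hi - 1
    frac = (x - axis[lo]) / (axis[hi] - axis[lo])
    return values[lo] + frac * (values[hi] - values[lo])

def overtone_matrix(axis, values, fundamentals, orders=(1, 2, 3)):
    """Build one row per order k by sampling the spectrum at k times each
    fundamental wavenumber (k = 1 is the fundamental region itself)."""
    return [[interpolate(axis, values, k * nu) for nu in fundamentals] for k in orders]

# Hypothetical spectrum sampled every 500 cm^-1 from 1000 to 9000 cm^-1,
# with absorbance set to wavenumber / 1000 so that the result is easy to check.
axis = [1000.0 + 500.0 * i for i in range(17)]
values = [a / 1000.0 for a in axis]
matrix = overtone_matrix(axis, values, fundamentals=[2800.0, 2900.0, 3000.0])
print(matrix)
```

Each sample yields one such orders × wavenumbers matrix; stacking the matrices of several samples gives the three-way array (samples × orders × wavenumbers) on which multi-way methods such as PARAFAC operate.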

20.

Chemometric analysis of metabolism disorders in blood plasma of S180 and H22 tumor‐bearing mice by high performance liquid chromatography‐diode array detection

Xiaoming Sun Yun Liu Duolong Di Guotai Wu Hongyun Guo 《Journal of Chemometrics》2011,25(8):430-440

The aim of this paper is to characterize metabolism disorders in Kunming mice induced by S180 and H22 tumor cells. A metabolic fingerprint based on high performance liquid chromatography‐diode array detection (HPLC‐DAD) was developed to map the disturbed metabolic responses. In vivo testing of the antitumor activity of paclitaxel (Taxol) was carried out by inhibiting the growth of S180 and H22 tumor cells. Based on 27 common peaks, principal component analysis (PCA) and partial least squares‐discriminant analysis (PLS‐DA) were used to distinguish the abnormal groups from the controls and to find significant endogenous compounds (SECs) that contribute significantly to classification. The tumor growth inhibition ratios (TIRs) of Taxol groups were used to validate the predictive accuracies of the PLS‐DA models. The predictive accuracies of PLS‐DA models for S180 and H22 tumor model groups were 97.6 and 100%, respectively. Nine (S180) and seven (H22) SECs were discovered, including uric acid and cytidine. In addition, the correlations between relative tumor weights (RTWs) and chromatographic data for the SECs were significant (p < 0.05). Investigations on the stability and precision of the established metabolic fingerprints demonstrate that the experiment is well controlled and reliable. This work shows that the platform of HPLC‐DAD coupled with chemometric methods provides a promising method for the study of metabolism disorders induced by tumor cells. Copyright © 2011 John Wiley & Sons, Ltd.