Similar literature
Found 20 similar documents; search took 46 ms.
1.
Large datasets containing many spectra, commonly associated with in situ or operando experiments, call for new data treatment strategies, as conventional scan-by-scan data analysis methods have become a time-consuming bottleneck. Several convenient automated data processing procedures, such as least-squares fitting of reference spectra, exist but are based on assumptions. Here we present the application of multivariate curve resolution (MCR) as a blind-source separation method to efficiently process a large data set of an in situ X-ray absorption spectroscopy experiment where the sample undergoes a periodic concentration perturbation. MCR was applied to data from a reversible reduction–oxidation reaction of a rhenium-promoted cobalt Fischer–Tropsch synthesis catalyst. The MCR algorithm was capable of extracting, in a highly automated manner, the component spectra with a different kinetic evolution together with their respective concentration profiles without the use of reference spectra. The modulative nature of our experiments allows for averaging of a number of identical periods and hence an increase in the signal-to-noise ratio (S/N), which is efficiently exploited by MCR. The practical and added value of the approach in extracting information from large and complex datasets, typical for in situ and operando studies, is highlighted.
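The period averaging mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names, the square-wave response, and the noise level are assumptions chosen only to show that averaging identical periods lowers the noise of the folded signal.

```python
import random


def average_periods(signal, period_length):
    """Fold a periodic signal into complete periods and average them point by point."""
    n_periods = len(signal) // period_length
    if n_periods == 0:
        raise ValueError("signal is shorter than one period")
    periods = [
        signal[i * period_length:(i + 1) * period_length]
        for i in range(n_periods)
    ]
    # zip(*periods) groups the samples that share the same phase in every period.
    return [sum(samples) / n_periods for samples in zip(*periods)]


def rms_error(estimate, truth):
    """Root-mean-square deviation between an estimate and the noise-free signal."""
    return (sum((e - t) ** 2 for e, t in zip(estimate, truth)) / len(truth)) ** 0.5


# Hypothetical demo data: a square-wave response with additive Gaussian noise.
rng = random.Random(0)
period_length, n_periods = 20, 50
clean = [1.0 if i < period_length // 2 else 0.0 for i in range(period_length)]
noisy = [
    clean[i % period_length] + rng.gauss(0.0, 0.5)
    for i in range(period_length * n_periods)
]

averaged = average_periods(noisy, period_length)
error_single = rms_error(noisy[:period_length], clean)  # one raw period
error_averaged = rms_error(averaged, clean)             # mean of 50 periods
```

For uncorrelated noise the error of the averaged period is expected to shrink roughly with the square root of the number of periods, which is why a modulated experiment suits a noise-sensitive factorization such as MCR.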

2.
3.
Rotation ambiguity (RA) in multivariate curve resolution (MCR) is an undesirable case in which the physicochemical constraints are not sufficiently strong to provide a unique resolution of the data matrix of the mixtures into spectra and concentration profiles of individual chemical components. RA is often met in MCR of overlapped chromatographic peaks, kinetic and equilibrium data, and fluorescence two‐dimensional spectra. In the case of RA, a single candidate solution has little practical value, so the whole set of feasible solutions should be characterized. This is a quite intricate task in the general case. In the present paper, a method was proposed to estimate RA with charged particle swarm optimization (cPSO), a population‐based algorithm. The criteria for updating the particles were modified so that the swarm converged to the steady state, which spanned the set of feasible solutions. The performance of cPSO‐MCR was demonstrated on test functions, simulated datasets, and real‐world data. Good agreement of the cPSO‐MCR results with the analytical solutions (Borgen plots) was observed. cPSO‐MCR was also shown to be capable of estimating the strength of the constraints and of revealing RA in noisy data. As compared with analytical methods, cPSO‐MCR is simpler to implement, expands to more than three chemical compounds, is immune to noise, and can be easily adapted to virtually all types of constraints and objective functions (constraint based or residue based). cPSO‐MCR also provides natural visual information about the level of RA in spectra and concentration profiles, similar to the methods of two extreme solutions (e.g., MCR‐BANDS). Copyright © 2014 John Wiley & Sons, Ltd.

4.
Multivariate methods, such as principal component analysis (PCA) and multivariate curve resolution (MCR), are often employed to aid the analysis of large complex data sets such as time‐of‐flight secondary ion mass spectrometry (ToF‐SIMS) images. There is, however, much confusion over the most appropriate choice of method for any given application and the effects of data preprocessing, which is exacerbated by the confusing terminologies and the use of jargon in this field. In the present study, a simple model system consisting of a ToF‐SIMS image of an immiscible polymer blend is used to evaluate PCA and MCR in the accurate identification, localisation and quantification of the phase‐separated polymer domains, using four data preprocessing methods (no scaling, normalisation, variance scaling and Poisson scaling). This highlights significant issues and challenges in the quantitative multivariate analysis of mixed organic systems, including the discrimination of chemically significant features from experimental noise, the resolution of weak chemical contributions and potential bias introduced by data preprocessing. Multivariate analysis using Poisson scaling, identified as the most suitable data preprocessing method for both PCA and MCR, demonstrates a marked improvement upon traditional (manual) analysis and provides valuable additional information that is difficult to detect using traditional analysis. Using these results, we present recommendations for the optimum use of multivariate analysis by analysts and provide guidance on selecting the most appropriate methods. Confusing terminology is also clarified. © Crown copyright 2008. Reproduced with the permission of Her Majesty's Stationery Office. Published by John Wiley & Sons, Ltd.

5.
A method is introduced that allows one to select, for a given property and compound, among several prediction methods the presumably best-performing scheme based on prediction errors evaluated for structurally similar compounds. The latter are selected through analysis of atom-centered fragments (ACFs) in accord with a k nearest neighbor procedure in the two-dimensional structural space. The approach is illustrated with seven estimation methods for the water solubility of organic compounds and a reference set of 1876 compounds with validated experimental values. The discussion includes a comparison with the similarity-based error correction as an alternative approach to improve the performance of prediction methods and an extension that enables an ad hoc specification of the application domain.
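The selection idea in this abstract can be sketched as follows. The abstract does not give the similarity measure, so the Tanimoto coefficient on fragment sets is an assumption here, and the fragment labels, method names "A" and "B", and error values are all hypothetical.

```python
def tanimoto(fragments_a, fragments_b):
    """Tanimoto similarity between two sets of fragment labels (assumed measure)."""
    union = fragments_a | fragments_b
    if not union:
        return 0.0
    return len(fragments_a & fragments_b) / len(union)


def select_method(query_fragments, reference, k=3):
    """Return the method with the lowest mean absolute error on the k most similar compounds.

    reference is a list of (fragment_set, {method_name: prediction_error}) pairs.
    """
    if not reference:
        raise ValueError("reference set is empty")
    neighbours = sorted(
        reference,
        key=lambda item: tanimoto(query_fragments, item[0]),
        reverse=True,
    )[:k]
    methods = neighbours[0][1].keys()
    mean_abs_error = {
        m: sum(abs(errors[m]) for _, errors in neighbours) / len(neighbours)
        for m in methods
    }
    best = min(mean_abs_error, key=mean_abs_error.get)
    return best, mean_abs_error


# Hypothetical reference set: fragment labels and errors are illustrative only.
reference_set = [
    ({"C-arom", "OH", "Cl"}, {"A": 0.2, "B": 0.9}),
    ({"C-arom", "OH"}, {"A": 0.3, "B": 0.8}),
    ({"C-arom", "Cl"}, {"A": 0.1, "B": 0.7}),
    ({"C-aliph", "COOH"}, {"A": 1.5, "B": 0.1}),
    ({"C-aliph", "NH2"}, {"A": 1.2, "B": 0.2}),
]

best_aromatic, _ = select_method({"C-arom", "OH", "Cl"}, reference_set)
best_aliphatic, _ = select_method({"C-aliph", "COOH", "NH2"}, reference_set)
```

In this toy data the aromatic query is routed to method "A" and the aliphatic query to method "B", because each method performs better on the neighbours of the respective query.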

6.
MULVADO is a newly developed software package for DOSY NMR data processing, based on multivariate curve resolution (MCR), one of the principal multivariate methods for processing DOSY data. This paper will evaluate this software package by using real-life data of materials used in the printing industry: two data sets from the same ink sample but of different quality. Also a sample of an organic photoconductor and a toner sample are analysed. Compared with the routine DOSY output from monoexponential fitting, one of the single channel algorithms in the commercial Bruker software, MULVADO provides several advantages. The key advantage of MCR is that it overcomes the fluctuation problem (non-consistent diffusion coefficient of the same component). The combination of non-linear regression (NLR) and MCR can yield more accurate resolution of a complex mixture. In addition, the data pre-processing techniques in MULVADO minimise the negative effects of experimental artefacts on the results of the data. In this paper, the challenges for analysing polymer samples and other more complex samples will also be discussed.

7.
Comprehensive two-dimensional gas chromatography (GC x GC) offers new opportunities to develop relationships between molecular structure and retentions in the two dimensional (2D) separation space defined by the GC x GC retention in each dimension. Whereas single dimension GC provides only one retention property for a solute, and hence the specific relationship between retention and chemical property is not readily apparent or derivable, the 2D presentation of compounds in GC x GC provides a subtle and exquisite correlation of chemical property and retention unlike any other GC experiment. The 'orthogonality' of the two separation dimensions is intimately related to the manner in which different separation mechanisms, available through use of two dissimilar phases, are accessible to the different chemical compounds or classes in a sample mixture, and indeed the specific chemical classes present in the sample. The GC x GC experiment now permits various processes such as chemical decompositions, molecular interconversions, various non-linear chromatography effects, and processes such as slow reversible interactions that may arise with stationary phases or in the injector or column couplings, to be identified and further investigated. Here, we briefly review implementation of the GC x GC method, consider the molecular selectivity of GC x GC, and highlight a selection of molecular processes that can be probed by using GC x GC.

8.
High-throughput screening (HTS) campaigns in pharmaceutical companies have accumulated a large amount of data for several million compounds over a couple of hundred assays. Despite the general awareness that rich information is hidden inside the vast amount of data, little has been reported for a systematic data mining method that can reliably extract relevant knowledge of interest for chemists and biologists. We developed a data mining approach based on an algorithm called ontology-based pattern identification (OPI) and applied it to our in-house HTS database. We identified nearly 1500 scaffold families with statistically significant structure-HTS activity profile relationships. Among them, dozens of scaffolds were characterized as leading to artifactual results stemming from the screening technology employed, such as assay format and/or readout. Four types of compound scaffolds can be characterized based on this data mining effort: tumor cytotoxic, general toxic, potential reporter gene assay artifact, and target family specific. The OPI-based data mining approach can reliably identify compounds that are not only structurally similar but also share statistically significant biological activity profiles. Statistical tests such as Kruskal-Wallis test and analysis of variance (ANOVA) can then be applied to the discovered scaffolds for effective assignment of relevant biological information. The scaffolds identified by our HTS data mining efforts are an invaluable resource for designing SAR-robust diversity libraries, generating in silico biological annotations of compounds on a scaffold basis, and providing novel target family specific scaffolds for focused compound library design.

9.
10.
In common with all gas chromatography (GC) methods, comprehensive two-dimensional gas chromatography (GC x GC) has the potential to provide both qualitative and quantitative analysis. There are fundamental differences in the way one-dimensional (1D-GC) and GC x GC results are interpreted for these parameters. Since 1D-GC produces a single measured peak in the chromatogram, there is a single retention time, and associated with this a single peak response (either area or height). Peak area and height are related by peak width. GC x GC produces a series of modulated peaks at the detector. Thus, the peak metrics of retention, area and height for one component are now not simple single values for one peak, but rather are derived from the multiple peak distribution generated by the modulation process. The peak retention is interpreted in terms of two-dimensional coordinates in a retention plane. In this study, a brief background review to quantification in GC x GC is provided. Previous reviews cover aspects of quantitative GC x GC studies up to the year 2005, including different approaches to quantification, and reports of quantitative analysis with different detectors, for different compound classes, and in different matrices. Other studies have developed chemometric approaches based on multivariate analysis to provide quantitative reporting of individual compounds. The coverage of the earlier reviews has been updated to include material that has been presented since 2005 and includes considerations of valve-based modulation. Recently the modulation ratio (M(R)) concept was proposed and intended to clarify the meaning of modulation number (n(M)) in GC x GC, which was shown to be a rather poorly defined parameter. Based on the prior studies that introduced this concept, the role of quantitative analysis is investigated here through calculation of the peak areas and peak area ratios of selected series of modulated peaks in GC x GC. The application of isotopically labelled reference compounds for polycyclic aromatic hydrocarbon (PAH) analysis is used here to develop the quantitative metric approach. It is shown that by selecting the two or three major modulated peaks for solutes and internal standards, comparing the response ratio with the sum of all modulated peaks and also with the reference non-modulated result, quantification is statistically equivalent. Thus, adequate quantitative analysis and calibration can be accomplished by using selected major modulated peaks for each compound. This may simplify quantitative interpretation of GC x GC data.
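The comparison described in this abstract, a response ratio from only the major modulated peaks versus the ratio from all modulated peaks, can be sketched as below. The function name and the peak areas are hypothetical; the areas are constructed so that analyte and internal standard share the same modulation pattern, which is the idealized case.

```python
def response_ratio(analyte_areas, standard_areas, n_major=None):
    """Ratio of summed modulated-peak areas, analyte over internal standard.

    With n_major=None all modulated peaks are summed; otherwise only the
    n_major largest peaks of each compound are used.
    """
    def summed(areas):
        ordered = sorted(areas, reverse=True)
        selected = ordered if n_major is None else ordered[:n_major]
        return sum(selected)

    return summed(analyte_areas) / summed(standard_areas)


# Hypothetical modulated-peak areas for an analyte and its labelled standard.
analyte = [10.0, 40.0, 100.0, 60.0, 15.0]
standard = [5.0, 20.0, 50.0, 30.0, 7.5]

ratio_all_peaks = response_ratio(analyte, standard)
ratio_major_peaks = response_ratio(analyte, standard, n_major=3)
```

In this idealized data both ratios are identical. With real chromatograms the two values would differ slightly, and the abstract reports that the difference was not statistically significant.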

11.
A new series of pyrido[1,2-a]benzimidazole derivatives bearing the aryloxypyrazole nucleus has been synthesized by a base-catalyzed cyclocondensation reaction through a multi-component reaction (MCR) approach. All the synthesized compounds were investigated against a representative panel of pathogenic strains using the broth microdilution minimum inhibitory concentration (MIC) method for their in vitro antimicrobial activity. Reviewing the data, the majority of the compounds were found to be active against the employed pathogens. The SAR study shows that antimicrobial activity depends strongly on the nature of the substituents at the ether-linked aryl ring attached to the pyrazole unit, together with the substituent present on C5 of the benzimidazole unit.

12.
A two-step methodology has been developed for the prediction of protein retention time in linear-gradient HIC systems. Isocratic retention parameters were determined from ln(k')-salt concentration plots for a number of commercially available proteins with a range of properties. Quantitative structure property relationship (QSPR) models based on a support vector machine (SVM) approach were generated for predicting isocratic retention parameters for proteins not included in the model generation. The predicted parameters were then used to calculate protein gradient retention times and the results indicate that this approach is well suited for predicting experimental gradient retention data. The approach presented in this paper may have implications for HIC methods development at both the bench and process scales.

13.
In this work, two different maximum likelihood approaches for multivariate curve resolution based on maximum likelihood principal component analysis (MLPCA) and on weighted alternating least squares (WALS) are compared with the standard multivariate curve resolution alternating least squares (MCR‐ALS) method. To illustrate this comparison, three different experimental data sets are used: the first is an environmental aerosol source apportionment data set, the second is a time‐course DNA microarray, and the third is an ultrafast absorption spectroscopy data set. Error structures of the first two data sets were heteroscedastic and uncorrelated, and the difference between them was in the existence of missing values in the second case. In the third data set, from ultrafast spectroscopy, error correlation between the values at different wavelengths is present. The obtained results confirmed that the resolved component profiles obtained by MLPCA‐MCR‐ALS are practically identical to those obtained by MCR‐WALS and that they can differ from those resolved by ordinary MCR‐ALS, especially in the case of high noise. It is shown that methods that incorporate uncertainty estimations (such as MLPCA‐MCR‐ALS and MCR‐WALS) can provide more reliable results and better estimated parameters than unweighted approaches (such as MCR‐ALS) when high amounts of noise are present. The possible advantage of using MLPCA‐MCR‐ALS over MCR‐WALS is then that the former does not require changing the traditional MCR‐ALS algorithm because MLPCA is only used as a preliminary data pretreatment before MCR analysis. Copyright © 2013 John Wiley & Sons, Ltd.

14.
A review of recent results of the use of chromatographic retention data in peptide identification and in the development of procedures for peptide retention prediction is presented. In recent years, reversed phase LC (RP-LC) has become an important tool in the separation of peptides in MS analysis. A challenging problem in a further expansion of RP-LC applications is the use of already available retention information for identification purposes simultaneously with MS–MS identification. This overview focuses on the retention characteristics suggested in LC. We will discuss the application of the retention index concept in LC, which is widely used in GC to characterize retention of organic compounds. The use of retention indices as retention characteristics of analytes in LC was first suggested at the end of the 1970s; however, the application of retention indices is still somewhat rare today. There are several reasons for this. One is the relatively high sensitivity of retention indices to changes in the parameters of chromatographic systems. Another is the chemical restrictions in the search for a universal set of reference compounds suitable for retention scaling. Several methods were suggested for the prediction of the retention times of peptides. A frequently used approach is based on the additivity scheme and calculation of the elution time through the summation of retention coefficients of amino acids constituting the peptide. Such an approach allows fairly accurate predictions of the retention time of peptides made up of no more than 15–20 amino acid residues. Additional correction factors were suggested to improve predictions, including corrections for the peptide length, peptide hydrophobicity, sequence of amino acids, etc. Suggested procedures are discussed in detail. Application of predicted retention times in the identification of peptides is considered. Current status of LC retention data collections is presented.
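The additivity scheme mentioned in this abstract, elution time as a sum of per-residue retention coefficients, can be written in a few lines. The coefficient values below are hypothetical placeholders in arbitrary units, not a published scale, and the function name and optional intercept are assumptions for illustration.

```python
# Hypothetical retention coefficients per amino acid (arbitrary units).
# These numbers are placeholders and do not come from any published scale.
RETENTION_COEFFICIENTS = {
    "G": 0.0,
    "A": 1.0,
    "E": 0.5,
    "K": -2.0,
    "L": 9.0,
    "F": 10.0,
    "W": 11.0,
}


def predict_retention(sequence, coefficients=RETENTION_COEFFICIENTS, intercept=0.0):
    """Additive model: intercept plus the sum of the residue coefficients."""
    unknown = sorted(set(sequence) - set(coefficients))
    if unknown:
        raise KeyError(f"no coefficient for residues: {unknown}")
    return intercept + sum(coefficients[residue] for residue in sequence)


hydrophobic_peptide = predict_retention("GALF")
hydrophilic_peptide = predict_retention("KGE")
```

Because the model ignores residue order, "GALF" and "FLAG" receive the same prediction, which is one reason the review mentions sequence-dependent and length-dependent correction factors.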

15.
A comprehensive understanding of factors that influence microbial competition and cooperation, their diversity and processes will be greatly beneficial in many research areas. Current tools for microflora determinations are far from suitable for high‐throughput monitoring of development in complex microbial communities. Here, we describe the application of a calibration free method, multivariate curve resolution with alternating least squares (MCR‐ALS), for identification and quantification of different microbes in mixture samples. The idea is to utilize MCR‐ALS to enable close monitoring of ecology in a variety of microbial communities. The data from two designed experiments consisting of DNA sequence spectra measured on mixtures were analysed with MCR‐ALS using no prior information on the data except for appropriate constraints, such as non‐negativity and closure. The results were compared both to the known true concentrations as well as to the results obtained from the well‐established multivariate calibration method partial least squares (PLS) regression. MCR‐ALS performed as well as PLS regression, successfully extracting all pure bacterial spectra and quantitative information on these, with 97.81% and 97.91% explained variance for the first and the second data set, respectively. Copyright © 2008 John Wiley & Sons, Ltd.
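The two constraints named in this abstract, non-negativity and closure, can be illustrated on a single row of a concentration matrix. This is only a sketch of the constraint step under simple assumptions (clipping for non-negativity, rescaling to a fixed total for closure); it is not a full MCR‐ALS implementation, and the function names and numbers are hypothetical.

```python
def apply_nonnegativity(values):
    """Replace negative entries with zero (a simple clipping form of the constraint)."""
    return [max(0.0, v) for v in values]


def apply_closure(values, total=1.0):
    """Rescale the entries so that they sum to the given total (mass balance)."""
    current = sum(values)
    if current == 0:
        raise ValueError("cannot apply closure to an all-zero row")
    return [v * total / current for v in values]


# Hypothetical unconstrained least-squares estimate for one mixture sample.
raw_row = [0.2, -0.1, 0.6]
constrained_row = apply_closure(apply_nonnegativity(raw_row))
```

In an alternating least squares loop, a step of this kind would typically follow each least-squares update of the concentration profiles, so that every iterate stays physically meaningful.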

16.
In Part I of this work, we developed a method for the detection of drugs of abuse in biological samples based on fast gradient elution liquid-chromatography coupled with diode array spectroscopic detection (LC-DAD). In this part of the work, we apply the chemometric method of target factor analysis (TFA) to the chromatograms. This algorithm identifies the target compounds present in chromatograms based on a spectral library, resolves nearly co-eluting components, and differentiates between drugs with similar spectra. The ability to resolve highly overlapped peaks using the spectral data afforded by the DAD is what distinguishes the present method from conventional library searching methods. Our library has a mean list length (MLL) of 1.255 and a discriminating power of 0.997 when both retention index and spectral factors are considered. The algorithm compares a library of 47 different compounds of toxicological relevance to unknown samples and identifies which compounds are present based on spectral and retention index matching. The application of a corrected retention index for identification rather than raw retention times compensates for long-term and column-to-column retention time shifts and allows for the use of a single library of spectral and retention data. Training data sets were used to establish the search and identification parameters of the method. A validation data set of 70 chromatograms was used to calculate the sensitivity (correct identification of positives) and specificity (correct identification of negatives) of the method, which were found to be 92% and 94%, respectively.
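The two validation figures in this abstract follow the usual definitions, which can be stated as a short sketch. The counts below are hypothetical and are not the study's confusion matrix, which the abstract does not report.

```python
def sensitivity_specificity(true_pos, false_neg, true_neg, false_pos):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    if true_pos + false_neg == 0 or true_neg + false_pos == 0:
        raise ValueError("need at least one positive and one negative case")
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    return sensitivity, specificity


# Hypothetical counts, chosen only to exercise the function.
sens, spec = sensitivity_specificity(true_pos=9, false_neg=1, true_neg=18, false_pos=2)
```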

17.
Multivariate curve resolution (MCR) is a widespread methodology for the analysis of process data in many different application fields. This article proposes a critical review of recently published works. Particular attention is paid to situations requiring advanced and tailored applications of multivariate curve resolution, dealing with improvements in preprocessing methods, multi-set data arrangements, tailored constraints, and issues related to non-ideal noise structure and deviations from linearity. These analytical issues test the limits of applicability of MCR methods and can therefore be considered the most challenging ones.

18.
Summary: The use of theoretically calculated molecular properties as predictors for retention in reversed-phase HPLC has been explored. HPLC retention times have been measured for a series of 47 substituted aromatic molecules in three solvent mixtures and steric and electronic properties of these compounds have been derived using semi-empirical molecular orbital and empirical theoretical methods. A subset of the experimental data (a training set) was used to derive property-retention time relationships and the remaining data were then used to test the predictive capability of the methods. Good retention time prediction was possible using derived regression equations for individual solvents and after including solvent parameters it was possible to predict retention for all solvents using a single equation. This method showed that the most useful properties were calculated log P and the calculated dipole moment of the solutes, and the calculated solvent polarisability. In addition, 90% of the data were used to train an artificial neural network and the remaining 10% of the data used to test the network; excellent prediction was obtained, the neural network approach being as successful as the regression analysis.

19.
Multivariate curve resolution–alternating least squares (MCR–ALS) analysis is proposed to solve chromatographic challenges during two-dimensional gas chromatography–time-of-flight mass spectrometry (GC × GC–TOFMS) analysis of complex samples, such as crude oil extract. In view of the fact that the MCR–ALS method is based on the fulfillment of the bilinear model assumption, three-way and four-way GC × GC–TOFMS data are preferably arranged in a column-wise superaugmented data matrix in which mass-to-charge ratios (m/z) are in its columns and the elution times in the second and first chromatographic columns are in its rows. Since m/z values are common for all measured spectra in all second-column modulations, unavoidable chromatographic challenges such as retention time shifts within and between GC × GC–TOFMS experiments are properly handled. In addition, baseline/background contributions can be modeled by adding extra components to the MCR–ALS model. Another outstanding aspect of MCR–ALS analysis is its extreme flexibility to consider all samples (standards, unknowns, and replicates) in a single superaugmented data matrix, allowing joint analysis. In this way, resolution, identification, and quantification results can be simultaneously obtained in a very fast and reliable way. The potential of MCR–ALS analysis is demonstrated in GC × GC–TOFMS analysis of a North Sea crude oil extract sample with relative errors in estimated concentrations of target compounds below 6.0 % and relative standard deviations lower than 7.0 %. The results obtained, along with reasonable values for the lack of fit of the MCR–ALS model and high values of the reversed match factor in mass spectra similarity searches, confirm the reliability of the proposed strategy for GC × GC–TOFMS data analysis.

20.
An approach to rapidly process and interpret high-throughput liquid chromatography mass spectrometry data is presented. This approach applies an in-house developed computer application to process LC-MS report files containing spectral and chromatographic data from four different detectors (i.e. electrospray positive ionization, electrospray negative ionization mass spectrometry, UV absorption, and evaporative light scattering detection). Properties characteristic of detection and chromatographic retention are extracted and populated into a database. Approaches to applying this analytical information database for quality control analysis of ca. 400,000 samples are presented. Compound quality assessment methods employing average purity and detection data fields are compared to methods employing multiple quality control criteria (e.g. detection, purity, retention, and signal to noise). Structural similarity searches were applied with the analytical information database to identify compounds that may be undetectable by electrospray mass spectrometry. In addition, an approach to applying the database to aid in the selection of analytical detection and chromatography conditions for rapid analytical method development is also discussed.
