Similar literature
 20 similar articles found
1.
A general framework for the automatic alignment of one-dimensional chromatographic signals is presented in this article. The alignment of signals was achieved by explicitly modeling the warping function. Its shape was estimated using a linear combination of several B-spline functions. The coefficients of the spline functions were found in the course of an optimization procedure to maximize the Pearson's correlation coefficient between a target chromatogram and aligned chromatogram(s). The computational requirements of the method are discussed with respect to the correlation optimized warping method, frequently used for the alignment of chromatographic signals. As illustrated with two sets of one-dimensional chromatographic fingerprints, the automatic alignment approach performs well even when non-linear peak shifts need to be corrected. It can be applied in an on-the-fly manner since the alignment of signals is rapid.  相似文献   

2.
Chromatographic fingerprint-based sample identification usually relies on visual inspection or on principal component analysis of raw or aligned chromatograms. Instead, a novel nonparametric statistical measure of fingerprint set homogeneity is proposed. A randomization test is applied for significance analysis of fingerprint set homogeneity, with the average maximum cross-correlation used as the merit function. Chromatogram sets generated by random selection from the standard and unknown sample chromatogram collections are compared, in terms of merit function values, with the set of chromatograms that represents the standard and/or unknown sample. The significance of fingerprint homogeneity is then given by the fraction of random chromatogram sets that have higher merit values than the standard and/or unknown sample sets. A set of peptide maps corresponding to different haemoglobin variants was selected to evaluate the proposed test. The approach is compared with chromatogram alignment based on correlation optimized warping coupled with principal component or cluster analysis. The proposed method is a simple, straightforward sample identification procedure whose reliability is evaluated here. The impact of this approach on peptide mapping validation and system suitability analysis is discussed.
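The randomization test described in this abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the function names, the lag window `max_lag`, the number of random sets `n_rand`, and the use of a mean-centred, normalized cross-correlation are all assumptions made for the example.

```python
import random

def max_crosscorr(a, b, max_lag):
    """Maximum normalized cross-correlation of two equal-length signals
    over integer lags in [-max_lag, max_lag] (assumed merit ingredient)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    da = [x - mean_a for x in a]
    db = [x - mean_b for x in b]
    norm = (sum(x * x for x in da) * sum(x * x for x in db)) ** 0.5
    if norm == 0:
        return 0.0
    best = -1.0
    for lag in range(-max_lag, max_lag + 1):
        # Only indices where both a[i] and b[i + lag] exist contribute.
        s = sum(da[i] * db[i + lag] for i in range(max(0, -lag), min(n, n - lag)))
        best = max(best, s / norm)
    return best

def merit(chromatograms, max_lag):
    """Average maximum cross-correlation over all pairs in a set."""
    k = len(chromatograms)
    pairs = [(i, j) for i in range(k) for j in range(i + 1, k)]
    total = sum(max_crosscorr(chromatograms[i], chromatograms[j], max_lag)
                for i, j in pairs)
    return total / len(pairs)

def homogeneity_p_value(test_set, pool, max_lag=5, n_rand=200, seed=0):
    """Fraction of random sets drawn from the pooled collection whose merit
    exceeds that of the tested set (hypothetical helper)."""
    rng = random.Random(seed)
    observed = merit(test_set, max_lag)
    higher = 0
    for _ in range(n_rand):
        random_set = rng.sample(pool, len(test_set))
        if merit(random_set, max_lag) > observed:
            higher += 1
    return higher / n_rand
```

Here `pool` would hold the combined standard and unknown chromatograms, and a small returned fraction would suggest that the tested set is more homogeneous than randomly assembled sets.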

3.
In this study, five frequently used warping algorithms [correlation optimized warping (COW), recursive alignment by fast Fourier transform (RAFFT), dynamic time warping, variable penalty dynamic warping, and parametric time warping (PTW)] are compared for their ability to align chromatograms with retention time shifts. Five datasets consisting of chromatograms of herbal medicines analyzed by high-performance liquid chromatography (HPLC) (Kudzuvine Root, White Paeony Root, Rehmannia Root, Ligusticum wallichii, Scutellaria baicalensis) were chosen to test the five alignment algorithms. The comparison shows that all five methods produce misalignments to varying degrees, but the correlations of the aligned data sets all improve, especially for the data sets aligned by the segment-wise methods, COW and RAFFT. In the overall comparison, RAFFT achieves the highest score, followed by COW, whereas PTW is not recommended for aligning HPLC chromatograms.

4.
In this paper the performance of three alignment algorithms, correlation optimized warping, parametric time warping and semi-parametric time warping, is compared on real chromatograms. Among these, parametric time warping is the simplest and fastest; generally less than 1 s is required to align two chromatograms. It does not require the optimization of input parameters and allows the alignment of peak shifts in only one direction, or of non-complex peak shifts in both directions. With correlation optimized warping and semi-parametric time warping, complex peak shifts in both directions can be corrected, but at the expense of optimizing two input parameters. Semi-parametric time warping requires the selection of the proper number of B-splines in the warping function and, if necessary, the optimization of the penalty parameter. Often the default values can be used to obtain aligned signals. The optimization of the input parameters for correlation optimized warping (section length, slack) is difficult and time-consuming. Moreover, depending on the input parameters, the computation time of the correlation optimized warping algorithm can be twice as long as that of semi-parametric time warping, for which computation times of up to 23 s are required. However, the two algorithms perform equally well in terms of the improvement in the precision of the peak retention times and in the correlation coefficients between the chromatograms after alignment. For the data aligned in this study, the average retention time precision and the lowest correlation before warping were 14 and 0.17; after warping they improved to 3 and 0.83 with correlation optimized warping and to 6 and 0.87 with semi-parametric time warping.

5.
A rapid retention time alignment algorithm was developed as a preprocessing utility to be used prior to chemometric analysis of large datasets of diesel fuel profiles obtained using gas chromatography (GC). Retention time variation from chromatogram-to-chromatogram has been a significant impediment against the use of chemometric techniques in the analysis of chromatographic data due to the inability of current chemometric techniques to correctly model information that shifts from variable to variable within a dataset. The alignment algorithm developed is shown to increase the efficacy of pattern recognition methods applied to diesel fuel chromatograms by retaining chemical selectivity while reducing chromatogram-to-chromatogram retention time variations and to do so on a time scale that makes analysis of large sets of chromatographic data practical. Two sets of diesel fuel gas chromatograms were studied using the novel alignment algorithm followed by principal component analysis (PCA). In the first study, retention times for corresponding chromatographic peaks in 60 chromatograms varied by as much as 300 ms between chromatograms before alignment. In the second study of 42 chromatograms, the retention time shifting exhibited was on the order of 10 s between corresponding chromatographic peaks, and required a coarse retention time correction prior to alignment with the algorithm. In both cases, an increase in retention time precision afforded by the algorithm was clearly visible in plots of overlaid chromatograms before and then after applying the retention time alignment algorithm. Using the alignment algorithm, the standard deviation for corresponding peak retention times following alignment was 17 ms throughout a given chromatogram, corresponding to a relative standard deviation of 0.003% at an average retention time of 8 min. 
This level of retention time precision is a 5-fold improvement over the retention time precision initially provided by a state-of-the-art GC instrument equipped with electronic pressure control and was critical to the performance of the chemometric analysis. This increase in retention time precision does not come at the expense of chemical selectivity, since the PCA results suggest that essentially all of the chemical selectivity is preserved. Cluster resolution between dissimilar groups of diesel fuel chromatograms in a two-dimensional scores space generated with PCA is shown to substantially increase after alignment. The alignment method is robust against missing or extra peaks relative to a target chromatogram used in the alignment, and operates at high speed, requiring roughly 1 s of computation time per GC chromatogram.  相似文献   

6.
The alignment of analytical signals is an important preprocessing step when further analysis (e.g. PCA) requires all signals to have the same length. Two techniques for the alignment of profiles, dynamic time warping (DTW) and correlation optimized warping (COW), were tested and compared, with a focus on chromatographic and spectroscopic profiles. Simulated data and two sets of real data were studied.
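To make the DTW idea concrete, the sketch below shows the textbook dynamic-programming form of dynamic time warping and one way to map a sample onto the reference length. It is an illustrative example only: the absolute-difference local cost, the unconstrained step pattern, and the averaging of sample points that map to the same reference index are assumptions, not details taken from the study.

```python
def dtw_path(ref, sample):
    """Classic DTW: returns (total cost, warping path as (ref_idx, sample_idx))."""
    n, m = len(ref), len(sample)
    inf = float("inf")
    # cost[i][j] = cheapest cumulative cost aligning ref[:i] with sample[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(ref[i - 1] - sample[j - 1])  # assumed local distance
            cost[i][j] = d + min(cost[i - 1][j - 1], cost[i - 1][j], cost[i][j - 1])
    # Backtrack from the end to the start along the cheapest predecessors.
    i, j = n, m
    path = [(i - 1, j - 1)]
    while (i, j) != (1, 1):
        _, i, j = min((cost[i - 1][j - 1], i - 1, j - 1),
                      (cost[i - 1][j], i - 1, j),
                      (cost[i][j - 1], i, j - 1))
        path.append((i - 1, j - 1))
    path.reverse()
    return cost[n][m], path

def warp_to_reference(ref, sample):
    """Resample `sample` onto the index axis of `ref`, so both have equal length."""
    _, path = dtw_path(ref, sample)
    buckets = [[] for _ in ref]
    for ref_idx, sample_idx in path:
        buckets[ref_idx].append(sample[sample_idx])
    # Every reference index appears in the path at least once.
    return [sum(b) / len(b) for b in buckets]
```

The warped output always has the length of the reference, which is the property that makes the aligned profiles usable as rows of a single data matrix.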

7.
Liquid chromatography-mass spectrometry (LC/MS) has become the method of choice for characterizing complex mixtures. These analyses often involve quantitative comparison of components in multiple samples. To achieve automated sample comparison, the components of interest must be detected and identified, and their retention times aligned and peak areas calculated. This article describes a simple pairwise iterative retention time alignment algorithm, based on the divide-and-conquer approach, for alignment of ion features detected in LC/MS experiments. In this iterative algorithm, ion features in the sample run are first aligned with features in the reference run by applying a single constant shift of retention time. The sample chromatogram is then divided into two shorter chromatograms, which are aligned to the reference chromatogram the same way. Each shorter chromatogram is further divided into even shorter chromatograms. This process continues until each chromatogram is sufficiently narrow so that ion features within it have a similar retention time shift. In six pairwise LC/MS alignment examples containing a total of 6507 confirmed true corresponding feature pairs with retention time shifts up to five peak widths, the algorithm successfully aligned these features with an error rate of 0.2%. The alignment algorithm is demonstrated to be fast, robust, fully automatic, and superior to other algorithms. After alignment and gap-filling of detected ion features, their abundances can be tabulated for direct comparison between samples.  相似文献   
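The divide-and-conquer scheme in this abstract (one constant shift, then split in two and repeat) can be illustrated with the short sketch below. It is a hedged example under several assumptions: features are represented as `(retention_time, mz)` tuples, the constant shift is estimated as the median offset to m/z-matched reference features, and the stopping parameters `min_width` and `min_features` are hypothetical names; the abstract does not specify any of these details.

```python
from statistics import median

def estimate_shift(sample, reference, mz_tol, max_shift):
    """Single constant RT shift: median offset to the nearest m/z-matched
    reference feature (the median is an assumption of this sketch)."""
    diffs = []
    for rt, mz in sample:
        candidates = [ref_rt - rt for ref_rt, ref_mz in reference
                      if abs(ref_mz - mz) <= mz_tol and abs(ref_rt - rt) <= max_shift]
        if candidates:
            diffs.append(min(candidates, key=abs))
    return median(diffs) if diffs else 0.0

def align(sample, reference, mz_tol=0.01, max_shift=5.0,
          min_width=1.0, min_features=4):
    """Shift the whole segment, split it in two, and recurse on each half."""
    if not sample:
        return []
    shift = estimate_shift(sample, reference, mz_tol, max_shift)
    shifted = [(rt + shift, mz) for rt, mz in sample]
    lo = min(rt for rt, _ in shifted)
    hi = max(rt for rt, _ in shifted)
    # Stop when the segment is narrow or holds too few features to split.
    if hi - lo <= min_width or len(shifted) < min_features:
        return shifted
    mid = (lo + hi) / 2
    left = [f for f in shifted if f[0] <= mid]
    right = [f for f in shifted if f[0] > mid]
    if not left or not right:
        return shifted
    return (align(left, reference, mz_tol, max_shift, min_width, min_features)
            + align(right, reference, mz_tol, max_shift, min_width, min_features))
```

Because each half is corrected independently, a drift that differs between the early and late parts of the run is picked up after the first split, which a single global shift could not do.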

8.
In multivariate spectral calibration by principal component regression (PCR), the principal components (PCs) are calculated from the response data measured at all employed instrument channels; however some channels are redundant and their responses do not possess useful information. Thus, the extracted PCs possess mixed information from both useful and redundant channels. In this work, we propose a segmentation approach based on unsupervised pattern recognition to identify the most informative spectral region and then to construct a stable multivariate calibration model by PCR. In this method, the instrument channels are clustered into different segments via Kohonen self‐organization map. The spectral data of each segment are then subjected to PCA and the derived PCs are used as input variables for an inverse least square (ILS) regression model employing stepwise selection of the informative PCs. The proposed method was evaluated by the analysis of four simulated and six experimental data sets. It was found that our proposed method can model the above data sets with prediction errors lower than conventional partial least squares (PLS) and PCR methods. In addition, the prediction ability of our method was better than the previously reported models for these data sets. Copyright © 2011 John Wiley & Sons, Ltd.  相似文献   

9.
Coffee samples were analyzed by GC/MS in order to determine the most important peaks for the discrimination of the varieties Arabica and Robusta. The resulting peak tables from chromatographic analysis were aligned and pretreated before being submitted to multivariate analysis. A rapid and easy-to-perform peak alignment procedure, which does not require advanced programming skills to use, was compared with the tedious manual alignment procedure. The influence of three types of data pretreatment, normalization, logarithmic and square root transformations and their combinations, on the variables selected as most important by the regression coefficients of partial least squares-discriminant analysis (PLS-DA), are shown. Test samples different from those used in the calibration and comparison with the substances already known as being responsible for Arabica and Robusta coffees discrimination were used to determine the best pretreatments for both datasets. The data pretreatment consisting of square root transformation followed by normalization (RN) was chosen as being the most appropriate. The results obtained showed that the much quicker automated aligned method could be used as a substitute for the manually aligned method, allowing all the peaks in the chromatogram to be used for multivariate analysis.  相似文献   

10.
Nowadays, numerous metabolite concentrations can readily be determined in a given biological sample by high-throughput analytical methods. However, such raw analytical data comprise noninformative components due to many disturbances normally occurring in the analyses of biological material. To eliminate those unwanted original analytical data components, advanced chemometric data preprocessing methods might be of help. Here, such methods are applied to electrophoretic nucleoside profiles in urine samples of cancer patients and healthy volunteers. In this study, three warping methods: dynamic time warping (DTW), correlation optimized warping (COW), and parametric time warping (PTW) were examined on two sets of electrophoretic data by means of quality of peaks alignment, time of preprocessing, and way of customization. The application of warping methods helped to limit shifting of peaks and enabled differentiation between whole electropherograms of healthy and cancer patients objectively by a principal component analysis (PCA). The evaluation of preprocessed data and raw data by PC analysis confirms differences between the applied warping tools and proves their suitability in metabonomic data interpretation.  相似文献   

11.
Metabolic datasets, analyzed with statistical procedures, can provide an overview of herbs of different origin. The results often deviate to some degree because of peak shifts in the chromatographic signals. To solve this problem, an improved algorithm that combines sub‐window factor analysis with mass spectrum information is proposed. The algorithm locates peaks with a peak detection approach derived from either a multi‐scale Gaussian function or the Haar wavelet, which have different scopes of application; the candidate drift points at each peak are estimated by fast Fourier transform cross correlation; and the best drift point at each candidate peak is confirmed by sub‐window factor analysis and mass spectrum information in nontargeted metabolic profiling. Finally, the peak regions are aligned against a reference chromatogram, and the non‐peak regions are filled by linear interpolation. The chromatographic signals of 30 Bupleurum samples were aligned to illustrate the algorithm, after which the samples could be well distinguished by statistical procedures. The results demonstrate that the presented method performs better than other mass‐spectra‐based algorithms when aligning co‐eluted peaks.

12.
In this article, a new large‐scale aligned fiber mats formation method called salt‐induced pulse electrospinning was developed. By electrospinning salted solution in a humid environment, traditional continuous electrospinning changed into pulse electrospinning and aligned fibers were thus formed. The possible mechanisms for the occurrence of salt‐induced pulse electrospinning and the formation of fiber alignment were studied. The continuous electrospinning changing into the pulse electrospinning was due to the change of viscosity and conductivity of salted polymer solution in a wet electrospinning condition. Fishing net‐shaped whipping region of the electrospinning jet during pulse electrospinning process was considered as the key factor for the formation of fiber alignment. The mechanical properties of the aligned fiber mat increased significantly compared with that of the random fiber mat. This aligned fiber preparation method only requires a very low rotating drum speed as the receiver and can produce large‐scale aligned fiber mats for many applications. © 2012 Wiley Periodicals, Inc. J Polym Sci Part B: Polym Phys, 2012  相似文献   

13.
Comprehensive, two-dimensional gas chromatography (GC x GC) is used in conjunction with trilinear partial least squares (Tri-PLS) to quantify the percent weight of naphthalenes (two-ring aromatic compounds) in jet fuel samples. The increased peak capacity and selectivity of GC x GC makes the technique attractive for the rapid, and possibly less tedious analysis of jet fuel. The analysis of complex mixtures by GC x GC is further enhanced through the use of chemometric techniques, including those designed for use on 2-D data such as Tri-PLS. Unfortunately, retention time variation, unless corrected, can be an impediment to chemometric analysis. Previous work has demonstrated that the effects of retention time variation can be mitigated in sub-regions of GC x GC chromatograms through the application of an objective retention time alignment algorithm based on rank minimization. Building upon this previous work, it is demonstrated here that the effects of retention time variation can be mitigated throughout an entire GC x GC chromatogram with an objective retention time alignment algorithm based on windowed rank minimization alignment. A significant decrease in calibration error is observed when the algorithm is applied to chromatograms prior to construction of Tri-PLS models. Fourteen jet fuel samples with known weight percentages of naphthalenes (ASTM D1840) were obtained. Each sample was subjected to five replicate five-minute GC x GC separations over a period of two days. A subset of nine samples spanning the range of weight percentages of naphthalenes was chosen as a calibration set and Tri-PLS calibration models were subsequently developed in order to predict the naphthalene content of the samples from the GC x GC chromatograms of the remaining five samples. 
Calibration models constructed from GC x GC chromatograms that were retention time corrected are shown to exhibit a root mean square error of prediction of roughly half that of calibration models constructed from uncorrected chromatograms. The error of prediction is lowered further to a value that nearly matches the uncertainty in the standard percent weight values (ca. 1% of the median percent volume value) when the aligned chromatograms are truncated to include only regions of the chromatogram populated by naphthalenes and compounds of similar polarity and boiling point.  相似文献   

14.
The preprocessing of chromatograms, such as the alignment of retention time shifts, is often a crucial step in the data analysis chain. Here, an efficient approach to aligning shifted chromatographic signals, longest distance shifting, is presented. The performance of this novel strategy was demonstrated using both simulated chromatograms that covered different kinds of retention time shifts and real experimental chromatograms of Pudilan Xiaoyan Tablets obtained by high‐performance liquid chromatography with photodiode array detection. The average correlation coefficients for the experimental chromatograms were in the range 0.9517–0.9840 and the peak factor was 0.9989. For comparison, all the chromatograms were also aligned using the correlation optimized warping and Interval Correlation Optimized Shifting algorithms. The results indicate that the longest distance shifting algorithm is simpler, faster and more effective, and is potentially suitable for the alignment of other types of signals.

15.
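Full-profile high-performance liquid chromatography fingerprints combined with chemometric methods were used to discriminate perilla leaf samples (84 in total) from different cultivation regions. After the full-profile chromatographic data were corrected with adaptive iteratively reweighted penalized least squares (airPLS) and correlation optimized warping (COW), both baseline drift and retention time drift were markedly reduced. The preprocessed chromatographic data were analyzed by principal component analysis (PCA); the results show that samples from different sources cluster into separate groups according to their characteristics. Moreover, chromatographic data whose variables had been compressed by segmenting into intervals gave PCA results consistent with those obtained using the full-profile chromatographic data as input variables. In addition, the recognition ability and prediction ability of partial least squares discriminant analysis (PLS-DA) for classifying the perilla leaf samples were 92.8% and 89.6%, respectively.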

16.
Proton nuclear magnetic resonance (1H NMR) spectroscopic analysis of mixtures has been used extensively for a variety of applications ranging from the analysis of plant extracts, wine, and food to the evaluation of toxicity in animals. For example, NMR analysis of urine samples has been used extensively for biomarker discovery and, more simply, for the construction of classification models of toxicity, disease, and biochemical phenotype. However, NMR spectra of complex mixtures typically show unwanted local peak shifts caused by matrix and instrument variability, which must be compensated for prior to statistical analysis and interpretation of the data. One approach is to align the spectral peaks across the data set. An efficient and fast warping algorithm is required as the signals typically contain ca. 32,000-64,000 data points and there can be several thousand spectra in a data set. As demonstrated in our study, the iterative fuzzy warping algorithm fulfills these requirements and can be used on-line for an alignment of the NMR spectra. Correlation coefficients between the aligned and target spectra are used as the evaluation function for the algorithm, and its performance is compared with those of other published warping methods.  相似文献   

17.
Toxicity of chemicals induced by different factors is an important consideration, especially during the drug research and development process. Thus, there is urgent need to develop computationally effective models that can predict the toxicity or adverse effects of chemicals for a specific class of chemicals. In this study, random forest (RF) was used to classify five toxicity data sets from Distributed Structure‐Searchable Toxicity database network, using substructure fingerprints calculated directly from simple molecular structure. Three model validation approaches, out‐of‐bag validation incorporated in RF, fivefold cross‐validation, and an independent validation set, were used for assessing the prediction capability of our models. The chemical space analysis of data sets was explored by multidimensional scaling plots, and outlying molecules were also detected by the proximity measure in RF. At the same time, the important substructure fingerprints, recognized by the RF technique, gave some insights into the structure features related to toxicity of chemicals. The results obtained showed that these in silico classification models with substructure patterns and RF are applicable for potential toxicity prediction of chemical compounds. Copyright © 2012 John Wiley & Sons, Ltd.  相似文献   

18.
Simulated chromatographic separations were used to study the performance of piecewise retention time alignment and to demonstrate automated unsupervised (without a training set) parameter optimization. The average correlation coefficient between the target chromatogram and all remaining chromatograms in the data set was used to optimize the alignment parameters. This approach frees the user from providing class information and makes the alignment algorithm applicable to classifying completely unknown data sets. The average peak in the raw simulated data set was shifted up to two peak-widths-at-base (average relative shift=2.0) and after alignment the average relative shift was improved to 0.3. Piecewise alignment was applied to severely shifted GC separations of gasolines and reformate distillation fraction samples. The average relative shifts in the raw gasolines and reformates data were 4.7 and 1.5, respectively, but after alignment improved to 0.5 and 0.4, respectively. The effect of piecewise alignment on peak heights and peak areas is also reported. The average relative difference in peak height was -0.20%. The average absolute relative difference in area was 0.15%.  相似文献   

19.
The Interval Correlation Optimised Shifting algorithm (icoshift) has recently been introduced for the alignment of nuclear magnetic resonance spectra. The method is based on an insertion/deletion model to shift intervals of spectra/chromatograms and relies on an efficient Fast Fourier Transform based computation core that allows the alignment of large data sets in a few seconds on a standard personal computer. The potential of this programme for the alignment of chromatographic data is outlined with focus on the model used for the correction function. The efficacy of the algorithm is demonstrated on a chromatographic data set with 45 chromatograms of 64,000 data points. Computation time is significantly reduced compared to the Correlation Optimised Warping (COW) algorithm, which is widely used for the alignment of chromatographic signals. Moreover, icoshift proved to perform better than COW in terms of quality of the alignment (viz. of simplicity and peak factor), but without the need for computationally expensive optimisations of the warping meta-parameters required by COW. Principal component analysis (PCA) is used to show how a significant reduction on data complexity was achieved, improving the ability to highlight chemical differences amongst the samples.  相似文献   
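The interval-wise shifting idea can be sketched as below. This is a simplified illustration and not the icoshift code: it uses a direct (non-FFT) cross-correlation for readability, equal-width intervals, integer shifts, and edge-value padding to fill the gap left by a shift; the function names and parameters are hypothetical.

```python
def best_shift(ref_seg, seg, max_shift):
    """Integer shift s maximizing sum(ref_seg[i] * seg[i - s]); ties prefer small |s|."""
    n = len(seg)
    limit = min(max_shift, n - 1)
    best, best_score = 0, float("-inf")
    for s in sorted(range(-limit, limit + 1), key=abs):
        score = sum(ref_seg[i] * seg[i - s] for i in range(max(0, s), min(n, n + s)))
        if score > best_score:
            best, best_score = s, score
    return best

def shift_segment(seg, s):
    """Move a segment right (s > 0) or left (s < 0), padding with the edge value
    (a crude stand-in for the insertion/deletion model)."""
    n = len(seg)
    if s > 0:
        return [seg[0]] * s + seg[:n - s]
    if s < 0:
        return seg[-s:] + [seg[-1]] * (-s)
    return list(seg)

def interval_align(ref, sample, n_intervals, max_shift):
    """Split both signals into equal intervals and shift each sample interval
    independently toward the matching reference interval."""
    n = len(ref)
    bounds = [round(k * n / n_intervals) for k in range(n_intervals + 1)]
    aligned, shifts = [], []
    for start, end in zip(bounds, bounds[1:]):
        s = best_shift(ref[start:end], sample[start:end], max_shift)
        shifts.append(s)
        aligned.extend(shift_segment(sample[start:end], s))
    return aligned, shifts
```

For long signals the inner correlation loop is the expensive part, which is presumably why an FFT-based computation, as mentioned in the abstract, pays off on data sets with tens of thousands of points.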

20.
Yao W, Yin X, Hu Y. Journal of Chromatography A, 2007, 1160(1-2): 254-262
The alignment of chromatographic signals is an important preprocessing step before further multivariate analysis. This paper presents a method, automated peak alignment by beam search (Auto-PABS), to solve the problem of peak shift in chemical chromatographic fingerprints by piecewise shifting and linearly interpolating. It is characterized by searching an adaptive range for the values of shifting and linearly interpolating of each segment. This search range is estimated by the calculation of fast Fourier transform cross correlation between the sample segment and its corresponding reference segment. Thus, arbitrary peak alignment is avoided when the real peak shifts are unknown in a large data set. Since the maximum of search range is close to the real shift, more accurate beam search is adopted to accomplish the optimization process. Simulated data and herbal medicine fingerprints of HPLC and GC are selected for evaluation. The output matrix of aligned chromatographic profiles is used directly for principal components analysis, yielding satisfactory results on real samples.  相似文献   
