Similar literature
1.
The Interval Correlation Optimised Shifting algorithm (icoshift) has recently been introduced for the alignment of nuclear magnetic resonance spectra. The method is based on an insertion/deletion model to shift intervals of spectra/chromatograms and relies on an efficient Fast Fourier Transform based computation core that allows the alignment of large data sets in a few seconds on a standard personal computer. The potential of this programme for the alignment of chromatographic data is outlined, with a focus on the model used for the correction function. The efficacy of the algorithm is demonstrated on a chromatographic data set with 45 chromatograms of 64,000 data points. Computation time is significantly reduced compared to the Correlation Optimised Warping (COW) algorithm, which is widely used for the alignment of chromatographic signals. Moreover, icoshift performed better than COW in terms of alignment quality (viz. simplicity and peak factor), without the need for the computationally expensive optimisation of the warping meta-parameters required by COW. Principal component analysis (PCA) is used to show that a significant reduction in data complexity was achieved, improving the ability to highlight chemical differences amongst the samples.
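The FFT-based shifting idea described in this abstract can be illustrated with a minimal sketch. It is not the icoshift implementation: the function names `best_shift` and `shift_interval`, the `max_shift` limit and the edge-value padding are all assumptions made for illustration. The sketch finds the integer lag that maximises the cross-correlation between one interval of a sample and the same interval of a reference, then moves the interval rigidly by that lag.

```python
import numpy as np

def best_shift(reference, signal, max_shift):
    """Integer lag (in data points) by which `signal` should be moved to the
    right (negative: to the left) to best match `reference`.
    Hypothetical helper; max_shift must be smaller than len(reference)."""
    n = len(reference)
    size = 2 * n  # zero-padding avoids wrap-around in the circular correlation
    f_ref = np.fft.rfft(reference, size)
    f_sig = np.fft.rfft(signal, size)
    # xcorr[k] = sum_m reference[m + k] * signal[m]
    xcorr = np.fft.irfft(f_ref * np.conj(f_sig), size)
    lags = np.arange(-max_shift, max_shift + 1)
    scores = xcorr[lags % size]  # negative lags live at the end of the array
    return int(lags[np.argmax(scores)])

def shift_interval(signal, lag):
    """Rigidly move `signal` by `lag` points, filling the gap with the edge
    value (one possible choice, assumed here)."""
    out = np.empty_like(signal)
    if lag > 0:
        out[lag:] = signal[:-lag]
        out[:lag] = signal[0]
    elif lag < 0:
        out[:lag] = signal[-lag:]
        out[lag:] = signal[-1]
    else:
        out[:] = signal
    return out

# Synthetic example: one Gaussian peak, displaced by 7 points in the sample.
x = np.arange(200)
reference = np.exp(-0.5 * ((x - 100) / 4.0) ** 2)
sample = np.exp(-0.5 * ((x - 93) / 4.0) ** 2)
lag = best_shift(reference, sample, max_shift=20)
aligned = shift_interval(sample, lag)
```

Because the correlation for all lags is obtained from a single pair of transforms, the cost per interval grows roughly as n log n rather than with the number of candidate shifts, which is consistent with the speed advantage the abstract reports.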

2.
In metabolic profiling, multivariate data analysis techniques are used to interpret one-dimensional (1D) 1H NMR data. Multivariate data analysis techniques require that peaks are characterised by the same variables in every spectrum. This location constraint is essential for correct comparison of the intensities of several NMR spectra. However, variations in physicochemical factors can cause the locations of the peaks to shift. The location prerequisite may thus not be met, and alignment methods have been developed to solve this problem. However, current state-of-the-art algorithms for data alignment cannot resolve the inherent problems encountered when analysing NMR data of biological origin, because they are unable to align peaks when the spatial order of the peaks changes, a commonly occurring phenomenon. In this paper a new algorithm is proposed, based on the Hough transform operating on an image representation of the NMR dataset, that is capable of correctly aligning peaks when existing methods fail. The proposed algorithm was compared with current state-of-the-art algorithms operating on a selected plasma dataset to demonstrate its potential. A urine dataset was also processed using the algorithm as a further demonstration. The method is capable of successfully aligning the plasma data, but further development is needed to address more challenging applications, for example urine data.
Figure: Traces of NMR peaks visualizing the Generalized Fuzzy Hough Transform (GFHT) method for elucidating peak correspondence between samples. The spectra are sorted according to one shift-sensitive peak and reveal that other peaks exhibit a similar shift pattern. Such patterns can then be searched for using the GFHT. The red and black spectra in the figure are the most shifted spectra (top and bottom); by following the GFHT traces, peak correspondence is easily established even though peaks change spatial location.
Electronic supplementary material: The online version of this article (doi:) contains supplementary material, which is available to authorized users.

3.
In this study, five frequently used warping algorithms [correlation optimized warping (COW), recursive alignment by fast Fourier transform (RAFFT), dynamic time warping, variable penalty dynamic warping, and parametric time warping (PTW)] are compared for their ability to align chromatograms with retention time shifts. Five datasets consisting of chromatograms of herbal medicines analyzed by high-performance liquid chromatography (HPLC) (Kudzuvine Root, White Paeony Root, Rehmannia Root, Ligusticum wallichii, Scutellaria baicalensis) are chosen to test these five alignment algorithms. The comparison shows that all five methods produce misalignments to different degrees, but the correlations of the aligned data sets are all improved, especially for the data sets aligned by the segment-wise methods, COW and RAFFT. In the comprehensive comparison, RAFFT achieves the highest score, followed by COW, whereas PTW is not recommended for aligning HPLC data.

4.
In chromatography-based metabonomic research, retention time (RT) alignment of chromatographic peaks poses a challenge for the accurate profiling of biomarkers. Although a number of RT alignment software packages have been reported, their performance has not been comprehensively evaluated. This study aimed to evaluate the RT alignment accuracy of publicly available and commercial RT alignment software. Two gas chromatography/mass spectrometry (GC/MS) datasets, acquired from a mixture of standard metabolites and from human bladder cancer urine samples, were used to assess three publicly available software packages (MetAlign, MZmine and TagFinder) and two commercial applications, namely the Calibration feature and Statistical Compare of the ChromaTOF software. The overall RT alignment accuracies in aligning the standard compound mixture were 93, 92, 74, 73 and 42% for Calibration feature, MZmine, MetAlign, Statistical Compare and TagFinder, respectively. Additionally, unique trends were observed for the individual software packages with regard to the different experimental conditions related to the extent and direction of RT shifts. Conflicting performance was observed for the human urine samples, suggesting that RT misalignments still occurred despite the use of RT alignment software. While RT alignment remains an inevitable step in data preprocessing, metabonomic researchers are advised to check the RT alignment of important biomarkers manually as part of their validation process.

5.
Nowadays, numerous metabolite concentrations can readily be determined in a given biological sample by high-throughput analytical methods. However, such raw analytical data contain noninformative components due to the many disturbances that normally occur in the analysis of biological material. Advanced chemometric data preprocessing methods can help to eliminate these unwanted components. Here, such methods are applied to electrophoretic nucleoside profiles in urine samples of cancer patients and healthy volunteers. In this study, three warping methods, dynamic time warping (DTW), correlation optimized warping (COW), and parametric time warping (PTW), were examined on two sets of electrophoretic data in terms of peak alignment quality, preprocessing time, and way of customization. The application of warping methods helped to limit shifting of peaks and enabled objective differentiation between whole electropherograms of healthy volunteers and cancer patients by principal component analysis (PCA). The evaluation of preprocessed and raw data by PCA confirms differences between the applied warping tools and demonstrates their suitability for metabonomic data interpretation.

6.
Metabolomics is used to reduce the complexity of plants and to understand the underlying pathways of the plant phenotype. The metabolic profile of plants can be obtained by mass spectrometry or liquid-state NMR. Both techniques require extraction of metabolites from the sample to obtain the metabolic profile. This extraction step can be eliminated by making use of high-resolution magic angle spinning (HR-MAS) NMR. In this review, an HR-MAS NMR-based workflow is described in more detail, including the pulse sequences used in metabolomics. The pre-processing steps of one-dimensional HR-MAS NMR spectra are presented, including spectral alignment, baseline correction, bucketing, normalisation and scaling procedures. We also highlight some of the models which can be used to perform multivariate analysis on the HR-MAS NMR spectra. Finally, applications of HR-MAS NMR in plant metabolomics are described and show that HR-MAS NMR is a powerful tool for plant metabolomics studies.

7.
High resolution time-of-flight secondary ion mass spectrometry (HR TOF-SIMS) is a powerful surface analytical method. For complex samples, this technique may yield intricate spectra that are difficult to interpret visually. Chemometric methods are useful for data analysis. However, these methods require that spectra are represented in a matrix format. Variances in mass measurements caused by calibration or instrumental effects may present difficulties in properly aligning mass spectral peaks into the correct columns of the data matrix. Cluster analysis of resolution elements is proposed as an alternative approach to construct the data matrix. An automated method for optimizing the data alignment is presented and evaluated for standard steel samples.
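The abstract does not specify which clustering procedure is used, so the following is only a sketch of the general idea under stated assumptions: measured peak masses from all spectra are pooled, a simple gap-based (single-linkage style) grouping with a hypothetical `tolerance` parameter defines the clusters, and each cluster becomes one column of the data matrix. The function name `build_matrix` and the input layout are illustrative, not taken from the paper.

```python
import numpy as np

def build_matrix(peak_lists, tolerance):
    """Build a samples-by-clusters intensity matrix.

    peak_lists: one list of (mass, intensity) pairs per spectrum.
    tolerance:  largest gap between neighbouring masses that still counts
                as the same cluster (assumed, user-chosen value).
    """
    masses = np.sort(np.array([m for peaks in peak_lists for m, _ in peaks]))
    # A new cluster starts wherever neighbouring masses are further apart
    # than the tolerance.
    breaks = np.where(np.diff(masses) > tolerance)[0] + 1
    clusters = np.split(masses, breaks)
    centres = np.array([c.mean() for c in clusters])

    matrix = np.zeros((len(peak_lists), len(centres)))
    for row, peaks in enumerate(peak_lists):
        for mass, intensity in peaks:
            col = int(np.argmin(np.abs(centres - mass)))  # nearest cluster
            matrix[row, col] += intensity
    return centres, matrix

# Two spectra whose peaks differ slightly in measured mass (made-up values).
spectra = [
    [(55.934, 10.0), (57.935, 2.0)],
    [(55.936, 8.0), (57.933, 3.0)],
]
centres, matrix = build_matrix(spectra, tolerance=0.01)
```

The point of the sketch is that columns are defined by where the measured masses actually cluster rather than by a fixed binning grid, so small calibration differences between spectra do not split one peak across two columns.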

8.
The alignment of analytical signals is an important preprocessing step when further analysis (e.g. PCA) requires all signals to have the same length. Two techniques for the alignment of profiles, namely dynamic time warping (DTW) and correlation optimized warping (COW), were tested and compared. Attention was focused on chromatographic and spectroscopic profiles. Simulated data and two sets of real data were studied.
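As an illustration of how DTW brings signals of different lengths onto a common axis, here is a textbook dynamic-programming sketch. It uses the absolute difference as local cost and no slope or band constraints, which are simplifying assumptions; the function names `dtw_path` and `warp_to_reference` are hypothetical and not taken from the paper.

```python
import numpy as np

def dtw_path(reference, sample):
    """Classic DTW: return the total cost and the optimal warping path
    as a list of (reference_index, sample_index) pairs."""
    n, m = len(reference), len(sample)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(reference[i - 1] - sample[j - 1])
            cost[i, j] = d + min(cost[i - 1, j - 1],  # match
                                 cost[i - 1, j],      # sample point repeated
                                 cost[i, j - 1])      # reference point repeated
    # Backtrack from the end to recover the path.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        step = int(np.argmin([cost[i - 1, j - 1], cost[i - 1, j], cost[i, j - 1]]))
        if step == 0:
            i, j = i - 1, j - 1
        elif step == 1:
            i -= 1
        else:
            j -= 1
    path.reverse()
    return cost[n, m], path

def warp_to_reference(reference, sample):
    """Resample `sample` onto the reference axis by averaging all sample
    points that the path maps to each reference point."""
    _, path = dtw_path(reference, sample)
    sums = np.zeros(len(reference))
    counts = np.zeros(len(reference))
    for i, j in path:
        sums[i] += sample[j]
        counts[i] += 1
    return sums / counts

reference = np.array([0.0, 0.0, 1.0, 3.0, 1.0, 0.0, 0.0, 0.0])
sample = np.array([0.0, 1.0, 3.0, 1.0, 0.0, 0.0])  # shorter, peak earlier
total_cost, _ = dtw_path(reference, sample)
warped = warp_to_reference(reference, sample)
```

After warping, `warped` has the length of the reference, so it can be stacked with other warped signals into a matrix for PCA. Unconstrained DTW can distort peak shapes on real data, which is one reason constrained variants and segment-wise methods such as COW are often compared against it.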

9.
Some Mallotus species are used in traditional medicine in Vietnam and China. Some also show interesting activities, such as antioxidant and cytotoxic ones. Combining fingerprint technology with data-handling techniques makes it possible to indicate the peaks potentially responsible for given activities. This study aims to indicate, from chromatographic fingerprints, the peaks potentially responsible for the antioxidant activity of several Mallotus species. Relevant information was extracted using linear multivariate calibration techniques, both before and after alignment of the fingerprints with correlation optimized warping (COW). Of the studied techniques, Stepwise Multiple Linear Regression is least recommended, as it made an inadequate variable selection. Principal Component Regression can in theory take into account strongly varying variables that are uncorrelated to the antioxidant activity. In practice, however, this problem was limited in the actual case study. In principle, these problems do not occur with Partial Least Squares (PLS) models. Of the tested PLS methods, Orthogonal Projections to Latent Structures was preferred because of its simplicity, reproducibility, reduced model complexity and improved interpretability of the regression coefficients, yielding a clearer view of the individual contributions of the compounds. Furthermore, reducing analysis times from 60 min to 35 and 22.5 min resulted in the same main compounds being indicated as responsible for the antioxidant activity. Models built after alignment by COW did not provide additional information.

10.
In the present contribution, a new combination of multivariate curve resolution-correlation optimized warping (MCR-COW) with trilinear parallel factor analysis (PARAFAC) is developed to exploit the second-order advantage in complex chromatographic measurements. In MCR-COW, the complexity of the chromatographic data is reduced by arranging the data in a column-wise augmented matrix, analyzing it with the MCR bilinear model and aligning the resolved elution profiles using COW in a component-wise manner. The aligned chromatographic data are then decomposed using the trilinear PARAFAC model in order to extract pure chromatographic and spectroscopic information. The performance of this strategy is evaluated using simulated and real high-performance liquid chromatography-diode array detection (HPLC-DAD) datasets. The results showed that MCR-COW can efficiently correct elution time shifts of target compounds that are completely overlapped by coeluted interferences in complex chromatographic data. In addition, the PARAFAC analysis of aligned chromatographic data has the advantage of unique decomposition of overlapped chromatographic peaks to identify and quantify the target compounds in the presence of interferences. Finally, to confirm the reliability of the proposed strategy, the performance of MCR-COW-PARAFAC is compared with the frequently used methods of PARAFAC, COW-PARAFAC, multivariate curve resolution-alternating least squares (MCR-ALS), and MCR-COW-MCR. In most cases, MCR-COW-PARAFAC showed an improvement in terms of lack of fit (LOF), relative error (RE) and spectral correlation coefficients in comparison to the PARAFAC, COW-PARAFAC, MCR-ALS and MCR-COW-MCR results.

11.
Recent work by Forshed et al. [Anal. Chim. Acta 487 (2003) 189] resulted in an important tool for aligning two NMR spectra. The recognition that the problem posed by Forshed et al. is separable results in fast heuristics that give results that are at least as good as the in an order of magnitude less time. A beam search algorithm is described along with experiments using two different NMR spectrometers and sets of subjects.

12.
High‐performance liquid chromatography coupled with photodiode array detection has been extensively applied in many fields, and the peaks among the analyzed samples can be shifted due to variations in instrumental and experimental conditions. In multivariate analysis, retention time alignment is an important pretreatment step. Hence, the shifted peaks in high‐performance liquid chromatography coupled with photodiode array detection three‐dimensional spectra should be aligned before further analysis. For this purpose, the interval correlated shifting method combined with the proposed data arrangement methods is recommended and applied to high‐performance liquid chromatography coupled with photodiode array detection data as a demonstration. We validate the alignment performance of the proposed method by comparing the consistency of the retention times before and after alignment. The results demonstrated that the proposed method is capable of successfully aligning the data used. Additionally, the interval correlated shifting method combined with the data arrangement modes is implemented in an easy‐to‐use graphical user interface environment and so can be operated easily by users not familiar with programming languages.

13.
Parallel factor analysis was used to quantify the relative concentrations of peaks within four-way comprehensive two dimensional liquid chromatography–diode array detector data sets. Since parallel factor analysis requires that the retention times of peaks between each injection are reproducible, a semi-automated alignment method was developed that utilizes the spectra of the compounds to independently align the peaks without the need for a reference injection. Peak alignment is achieved by shifting the optimized chromatographic component profiles from a three-way parallel factor analysis model applied to each injection. To ensure accurate shifting, components are matched up based on their spectral signature and the position of the peak in both chromatographic dimensions. The degree of shift, for each peak, is determined by calculating the distance between the median data point of the respective dimension (in either the second or first chromatographic dimension) and the maximum data point of the peak furthest from the median. All peaks that were matched to this peak are then aligned to this common retention data point. Target analyte recoveries for four simulated data sets were within 2% of 100% recovery in all cases. Two different experimental data sets were also evaluated. Precision of quantification of two spectrally similar and partially coeluting peaks present in urine was as good as or better than 4%. Good results were also obtained for a challenging analysis of phenytoin in waste water effluent, where the results of the semi-automated alignment method agreed with the reference LC–LC MS/MS method within the precision of the methods.

14.
Proton nuclear magnetic resonance (1H NMR) spectroscopic analysis of mixtures has been used extensively for a variety of applications ranging from the analysis of plant extracts, wine, and food to the evaluation of toxicity in animals. For example, NMR analysis of urine samples has been used extensively for biomarker discovery and, more simply, for the construction of classification models of toxicity, disease, and biochemical phenotype. However, NMR spectra of complex mixtures typically show unwanted local peak shifts caused by matrix and instrument variability, which must be compensated for prior to statistical analysis and interpretation of the data. One approach is to align the spectral peaks across the data set. An efficient and fast warping algorithm is required as the signals typically contain ca. 32,000-64,000 data points and there can be several thousand spectra in a data set. As demonstrated in our study, the iterative fuzzy warping algorithm fulfills these requirements and can be used on-line for an alignment of the NMR spectra. Correlation coefficients between the aligned and target spectra are used as the evaluation function for the algorithm, and its performance is compared with those of other published warping methods.

15.
Nuclear magnetic resonance (NMR) analysis of complex samples, such as biofluid samples, is accompanied by variations in peak position and peak shape that are not directly related to the sample. This is due to variations in the background matrix of the sample and to instrumental instabilities. These variations complicate and limit the interpretation and analysis of NMR data by multivariate methods. Alignment of the NMR signals may circumvent these limitations and is an important preprocessing step prior to multivariate analysis. Previous alignment methods reduce the spectral resolution, are very computer-intensive for this kind of data (65k data points in one spectrum), or rely on peak detection. The method presented in this work requires neither data reduction nor preprocessing such as peak detection. The alignment is achieved by taking each segment of the spectrum individually, shifting it sideways, and linearly interpolating it to stretch or shrink it until the best correlation with the corresponding reference spectrum segment is obtained. The segments are picked out automatically by a routine that avoids cutting through a peak, and the optimization is carried out by means of a genetic algorithm (GA). The peak alignment routine is applied to NMR metabonomic data.
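The segment-wise operation described here (shift, stretch or shrink by linear interpolation, score by correlation with the reference segment) can be sketched as follows. To keep the example short and deterministic, an exhaustive grid search stands in for the genetic algorithm used in the paper; the function names, the stretch-about-the-centre convention and the parameter grids are assumptions for illustration only.

```python
import numpy as np

def transform_segment(segment, shift, stretch):
    """Shift a segment by `shift` points and stretch it by `stretch`
    about its centre, using linear interpolation."""
    n = len(segment)
    x = np.arange(n)
    centre = (n - 1) / 2.0
    # Position in the original segment that feeds each output point.
    source = (x - centre) / stretch + centre - shift
    # np.interp holds the edge values for positions outside the segment.
    return np.interp(source, x, segment)

def align_segment(reference, segment, shifts, stretches):
    """Grid search (a stand-in for the GA) for the shift and stretch that
    maximise the correlation with the reference segment."""
    best = (-np.inf, 0, 1.0)
    for shift in shifts:
        for stretch in stretches:
            candidate = transform_segment(segment, shift, stretch)
            if candidate.std() == 0:
                continue  # correlation is undefined for a flat candidate
            r = np.corrcoef(reference, candidate)[0, 1]
            if r > best[0]:
                best = (r, shift, stretch)
    return best  # (correlation, shift, stretch)

# Synthetic segment: a single peak displaced 6 points to the left.
x = np.arange(120)
reference = np.exp(-0.5 * ((x - 60) / 5.0) ** 2)
segment = np.exp(-0.5 * ((x - 54) / 5.0) ** 2)
r, shift, stretch = align_segment(reference, segment,
                                  shifts=range(-10, 11),
                                  stretches=[0.95, 1.0, 1.05])
```

A grid search scales poorly once many segments and fine parameter steps are involved, which is presumably why a stochastic optimiser such as a GA was chosen for full 65k-point spectra.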

16.
Simpson JV, Oshokoya O, Wagner N, Liu J, JiJi RD. The Analyst, 2011, 136(6): 1239-1247
The application of UV excitation sources coupled with resonance Raman has the potential to offer information unavailable with the current inventory of commonly used structural techniques, including X-ray, NMR and IR analysis. However, for ultraviolet resonance Raman (UVRR) spectroscopy to become a mainstream method for the determination of protein secondary structure content and for monitoring protein dynamics, the application of multivariate data analysis methodologies must be made routine. Typically, the application of higher-order data analysis methods requires robust pre-processing methods in order to standardize the data arrays. The application of such methods can be problematic in UVRR datasets due to spectral shifts arising from day-to-day fluctuations in the instrument response. Additionally, the non-linear increases in spectral resolution in wavenumbers (increasing spectral data points for the same spectral region) that result from increasing excitation wavelengths can make the alignment of multi-excitation datasets problematic. Lastly, the lack of a uniform and standardized methodology for the subtraction of the water band has also been a systematic issue for multivariate data analysis, as the water band overlaps the amide I mode. Here we present a two-pronged preprocessing approach that makes complex multi-excitation datasets more homogeneous and usable with multivariate analysis methods: correlation optimized warping (COW) is used to alleviate spectrum-to-spectrum and day-to-day alignment errors, coupled with a method whereby the relative intensity of the water band is determined through a least-squares determination of the signal intensity between 1750 and 1900 cm⁻¹.

17.
One of the largest challenges in high performance liquid chromatography (HPLC) method development is the necessity for tracking the movement of peaks as separation conditions are changed. Peak increments are then used to build a mathematical model capable of minimizing the number of experiments in an optimization circuit. Method optimization for an unknown mixture is, moreover, complicated by the absence of any a priori information on component properties and retention times when direct signal assignment is not possible. On the contrary, achievement of the maximum separation becomes an important factor for successful identification or quantitation. In this case, the optimization may be based on assigning peaks of the same component chosen from different experiments to each other. In other words, mutual peak matching between the HPLC runs is required.

A new method for mutual peak matching in a series of HPLC with diode array detector (HPLC–DAD) analyses of the same unknown mixture acquired at varying separation conditions has been developed. This approach, called mutual automated peak matching (MAP), does not require any prior knowledge of the mixture composition. Applying abstract factor analysis (AFA) and iterative key set factor analysis (IKSFA) on the augmented data matrix, the algorithm detects the number of mixture components and calculates the retention times of every individual compound in each of the input chromatograms. Every candidate component is then validated by target testing for presence in each HPLC run to provide quantitative criteria for the detection of “missing” peaks and non-analyte components as well as confirming successful matches. The matching algorithm by itself does not perform full curve resolution. However, its output may serve as a good initial estimate for further modeling. A common set of UV-Vis spectra of pure components can be obtained, as well as their corresponding concentration profiles in separate runs, by means of alternating least-square multivariate curve resolution (ALS MCR), resulting in reconstruction of overlapped peaks.

The algorithms were programmed in MATLAB® and tested on a number of sets of simulated data. Possible ways to improve the stability of results, reduce calculation time, and minimize operator interaction are discussed. The technique can be used to optimize HPLC analysis of a complex mixture without preliminary identification of its components.


18.
Most in vivo 31P MR studies are performed on 3T MR systems that provide sufficient signal intensity for prominent phosphorus metabolites. The identification of these metabolites in the in vivo spectra is performed by comparing their chemical shifts with the chemical shifts measured in vitro on high-field NMR spectrometers. To approach in vivo conditions at 3T, a set of phantoms with defined metabolite solutions was measured in a 3T whole-body MR system at pH 7.0 and 7.5, at 37 °C. A free induction decay (FID) sequence with and without 1H decoupling was used. Chemical shifts were obtained for phosphoenolpyruvate (PEP), phosphatidylcholine (PtdC), phosphocholine (PC), phosphoethanolamine (PE), glycerophosphocholine (GPC), glycerophosphoethanolamine (GPE), uridine diphosphoglucose (UDPG), glucose-6-phosphate (G6P), glucose-1-phosphate (G1P), 2,3-diphosphoglycerate (2,3-DPG), nicotinamide adenine dinucleotide (NADH and NAD+), phosphocreatine (PCr), adenosine triphosphate (ATP), adenosine diphosphate (ADP), and inorganic phosphate (Pi). The measured chemical shifts were used to construct a basis set of 31P MR spectra for the evaluation of 31P in vivo spectra of muscle and the liver using LCModel software (linear combination model). Prior knowledge was successfully employed in the analysis of previously acquired in vivo data.

19.
A rapid retention time alignment algorithm was developed as a preprocessing utility to be used prior to chemometric analysis of large datasets of diesel fuel profiles obtained using gas chromatography (GC). Retention time variation from chromatogram-to-chromatogram has been a significant impediment against the use of chemometric techniques in the analysis of chromatographic data due to the inability of current chemometric techniques to correctly model information that shifts from variable to variable within a dataset. The alignment algorithm developed is shown to increase the efficacy of pattern recognition methods applied to diesel fuel chromatograms by retaining chemical selectivity while reducing chromatogram-to-chromatogram retention time variations and to do so on a time scale that makes analysis of large sets of chromatographic data practical. Two sets of diesel fuel gas chromatograms were studied using the novel alignment algorithm followed by principal component analysis (PCA). In the first study, retention times for corresponding chromatographic peaks in 60 chromatograms varied by as much as 300 ms between chromatograms before alignment. In the second study of 42 chromatograms, the retention time shifting exhibited was on the order of 10 s between corresponding chromatographic peaks, and required a coarse retention time correction prior to alignment with the algorithm. In both cases, an increase in retention time precision afforded by the algorithm was clearly visible in plots of overlaid chromatograms before and then after applying the retention time alignment algorithm. Using the alignment algorithm, the standard deviation for corresponding peak retention times following alignment was 17 ms throughout a given chromatogram, corresponding to a relative standard deviation of 0.003% at an average retention time of 8 min. 
This level of retention time precision is a 5-fold improvement over the retention time precision initially provided by a state-of-the-art GC instrument equipped with electronic pressure control and was critical to the performance of the chemometric analysis. This increase in retention time precision does not come at the expense of chemical selectivity, since the PCA results suggest that essentially all of the chemical selectivity is preserved. Cluster resolution between dissimilar groups of diesel fuel chromatograms in a two-dimensional scores space generated with PCA is shown to substantially increase after alignment. The alignment method is robust against missing or extra peaks relative to a target chromatogram used in the alignment, and operates at high speed, requiring roughly 1 s of computation time per GC chromatogram.

20.
Coffee samples were analyzed by GC/MS in order to determine the most important peaks for the discrimination of the varieties Arabica and Robusta. The resulting peak tables from chromatographic analysis were aligned and pretreated before being submitted to multivariate analysis. A rapid and easy-to-perform peak alignment procedure, which does not require advanced programming skills to use, was compared with the tedious manual alignment procedure. The influence of three types of data pretreatment (normalization, logarithmic transformation and square root transformation) and their combinations on the variables selected as most important by the regression coefficients of partial least squares-discriminant analysis (PLS-DA) is shown. To determine the best pretreatments for both datasets, test samples different from those used in the calibration were used, together with a comparison with the substances already known to be responsible for the discrimination of Arabica and Robusta coffees. The data pretreatment consisting of square root transformation followed by normalization (RN) was chosen as the most appropriate. The results showed that the much quicker automated alignment method could be used as a substitute for the manual alignment method, allowing all the peaks in the chromatogram to be used for multivariate analysis.
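The RN pretreatment named in this abstract (square root transformation followed by normalization) can be sketched in a few lines. The abstract does not say which normalization was used, so normalization of each sample to unit total intensity is an assumption here, as is the function name `sqrt_then_normalize`.

```python
import numpy as np

def sqrt_then_normalize(peak_table):
    """Apply a square root transform to a samples-by-peaks table, then
    scale each row (sample) so that its values sum to 1.
    Row-sum normalization is an assumed choice, not taken from the paper."""
    table = np.sqrt(np.asarray(peak_table, dtype=float))
    totals = table.sum(axis=1, keepdims=True)
    return table / totals

# Two samples, two peaks (made-up areas chosen to give round numbers).
areas = [[4.0, 16.0],
         [1.0, 9.0]]
pretreated = sqrt_then_normalize(areas)
```

The square root compresses large peaks relative to small ones before scaling, so minor peaks carry more weight in the subsequent PLS-DA than they would after normalization alone.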
