Similar articles
Found 20 similar articles (search time: 31 ms)
1.
We present a new preprocessing method, PeakSelect, to improve the accuracy and efficiency of tandem mass spectrometry peptide (protein) identification. The fundamental difference between noise and fragment ions in spectra is that ions have isotopes but noise does not. We propose a new concept, the Isotope Pattern Vector (IPV), which characterizes the isotope cluster of fragment ions, so that noise and real peaks can be distinguished by their quantitative IPV values. PeakSelect first uses a new method based on a Gaussian mixture model and the expectation-maximization (EM) algorithm to find the base intensity level (baseline) in a spectrum. PeakSelect then selects features based on the IPV and the baseline, and constructs a decision tree to automatically classify the peaks into categories such as noise, single ion peaks, and overlapping peaks. Experiments show that PeakSelect can help to reduce the Mascot searching time and increase the reliability of peptide identifications. In particular, PeakSelect performs well on complex spectra with a large number of peaks from large peptides, and supports more sequence identifications than other well-known systems.

2.
We report an algorithm designed for the calibration of low resolution peptide mass spectra. Our algorithm is implemented in a program called FineTune, which corrects systematic mass measurement error in 1 min, with no input required besides the mass spectra themselves. The mass measurement accuracy for a set of spectra collected on an LTQ-Velos improved 20-fold from −0.1776 ± 0.0010 m/z to 0.0078 ± 0.0006 m/z after calibration (avg ± 95% confidence interval). The precision in mass measurement was improved due to the correction of non-linear variation in mass measurement accuracy across the m/z range.

3.
A 1H NMR method for the quantification of dermatan sulfate impurities in industrial heparin samples is proposed. The method is based on the analysis of 1H NMR spectral data by multivariate calibration. The 1H NMR spectra of heparin and dermatan sulfate standards showed characteristic profiles. Differences in the methyl peaks of the acetamido groups of heparin and dermatan sulfate were greatly advantageous for the analysis, and other hydrogens of the sugar ring were also relevant. The determination of dermatan sulfate by multivariate calibration therefore depended on all these differences. Partial least squares regression (PLS) was chosen as the calibration method. In addition, a data standardization procedure was developed so that 1H NMR spectra recorded on different instruments operating under different measurement conditions would be comparable. The quantification of dermatan sulfate in the samples was satisfactory, with an overall prediction error of 6%.

4.
Accurately measured peptide masses can be used for large-scale protein identification from bacterial whole-cell digests as an alternative to tandem mass spectrometry (MS/MS), provided mass measurement errors of a few parts-per-million (ppm) are obtained. Fourier transform ion cyclotron resonance (FTICR) mass spectrometry (MS) routinely achieves such mass accuracy either with internal calibration or by regulating the charge in the analyzer cell. We have developed a novel and automated method for internal calibration of liquid chromatography (LC)/FTICR data from whole-cell digests using peptides in the sample identified by concurrent MS/MS together with ambient polydimethylcyclosiloxanes as internal calibrants in the mass spectra. The method reduced mass measurement error from 4.3 ± 3.7 ppm to 0.3 ± 2.3 ppm in an E. coli LC/FTICR dataset of 1000 MS and MS/MS spectra and is applicable to all analyses of complex protein digests by FTICR MS.
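
The ppm figures quoted in this abstract follow the usual definition of relative mass error. As a minimal sketch in Python: the function names and the m/z values below are illustrative assumptions, not data from the paper, and the summary assumes that a "mean ± spread" figure is a mean and a sample standard deviation, which the abstract does not state.

```python
import statistics


def ppm_error(measured_mz, theoretical_mz):
    """Signed relative mass error in parts per million (ppm)."""
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6


def summarize_ppm(pairs):
    """Return (mean, sample standard deviation) of the ppm errors.

    pairs: list of (measured_mz, theoretical_mz) tuples, at least two.
    """
    errors = [ppm_error(measured, theoretical) for measured, theoretical in pairs]
    return statistics.mean(errors), statistics.stdev(errors)


# Hypothetical peak list: measured vs. calculated m/z (illustrative only).
example_pairs = [
    (500.0020, 500.0000),
    (1000.0050, 1000.0000),
    (1500.0045, 1500.0000),
]
mean_ppm, sd_ppm = summarize_ppm(example_pairs)
print(f"{mean_ppm:.2f} +/- {sd_ppm:.2f} ppm")
```

Because the error is divided by the theoretical m/z, the same absolute deviation in m/z corresponds to a smaller ppm error at higher m/z.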

5.
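A method for eliminating spectral differences among the channels of an on-line multi-channel near-infrared analyzer   Cited by: 1 (self-citations: 0; citations by others: 1)
In on-line multi-channel near-infrared analyzers, slight differences in the machining precision and assembly of the fiber-optic coupling components cause the spectra to differ from channel to channel. Based on an analysis of these spectral differences, a mean spectral difference correction (MSSC) method is proposed that is computationally simple and easy to implement in practice. It was compared with commonly used calibration transfer algorithms, namely the slope/bias (S/B) algorithm and piecewise direct standardization (PDS), and with a multi-channel hybrid calibration model built by partial least squares-artificial neural network (PLS-ANN). The results show that the method effectively eliminates the differences among the spectra measured on the individual channels and makes the analytical model applicable across all channels.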
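
The abstract does not give the correction formula. One plausible reading of a mean-spectrum difference correction is sketched below in Python, under the assumption that the same standardization samples are measured on a reference channel and on the channel to be corrected, and that the per-wavelength difference of the two mean spectra is subtracted from later measurements. The function names are hypothetical, and this is not necessarily the authors' exact procedure.

```python
def mean_spectrum(spectra):
    """Per-wavelength mean of a list of equal-length spectra."""
    count = len(spectra)
    return [sum(column) / count for column in zip(*spectra)]


def build_mean_difference_correction(reference_spectra, channel_spectra):
    """Return a function that maps a channel spectrum onto the reference channel.

    Assumption: both arguments hold spectra of the same standardization
    samples, one set measured on the reference channel and one on the
    channel being corrected.
    """
    reference_mean = mean_spectrum(reference_spectra)
    channel_mean = mean_spectrum(channel_spectra)
    offset = [c - r for c, r in zip(channel_mean, reference_mean)]

    def correct(spectrum):
        # Subtract the mean channel-minus-reference difference at each wavelength.
        return [value - delta for value, delta in zip(spectrum, offset)]

    return correct


# Illustrative three-point "spectra"; the second channel reads 0.5 too high.
reference = [[1.0, 2.0, 3.0], [3.0, 4.0, 5.0]]
channel = [[1.5, 2.5, 3.5], [3.5, 4.5, 5.5]]
correct = build_mean_difference_correction(reference, channel)
corrected = correct([2.5, 3.5, 4.5])
print(corrected)
```

A correction of this kind needs no regression step, which is consistent with the abstract's description of the method as computationally simple.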

6.
We have obtained relationships for frequency shifts resulting from the interference of spectral components for the magnitude mode Fourier transform. The approximation of a weak perturbation of well resolved peaks has been used. Both the low- and high-pressure limits for Fourier-transform ion cyclotron resonance (FTICR) operation have been considered. We have found that the shifts can be either negative or positive, depending on the initial phase and/or the choice of the time-domain interval. The magnitude of shifts generally does not exceed the peak width. In the approximation of small perturbations the shifts produced by multiple peaks are additive. We have compared theoretical results with experimental shifts for isotopic clusters of multiply charged insulin. Up to 1 ppm frequency variations were experimentally observed for the insulin 5+ charge state, consistent with theoretical estimates. The peak interference is of particular significance in the case of bio-molecular mass spectra having a large number of peaks and covering a considerable dynamic range (i.e., relative abundance). We conclude that the common mass measurement procedure based on the location of the magnitude mode maxima of well resolved peaks can result in systematic mass measurement errors. The relationships obtained provide corrections for the frequency shifts and thus improve the mass measurement accuracy.

7.
Rodrigues LO, Cardoso JP, Menezes JC. Talanta, 2008, 75(5): 1203-1207
The use of near infrared spectroscopy (NIRS) in downstream solvent based processing steps of an active pharmaceutical ingredient (API) is reported. A single quantitative method was developed for API content assessment in the organic phase of a liquid–liquid extraction process and in multiple process streams of subsequent concentration and depuration steps. A new methodology based on spectra combinations and variable selection by genetic algorithm was used, with an effective improvement in the prediction ability of the calibration model. A root mean square error of prediction (RMSEP) of 0.05 in the range of 0.20–3.00% (w/w) was achieved. With this method, it is possible to balance the calibration data set with spectra of desired concentrations whenever acquisition of new spectra is no longer possible or improvements in the model's accuracy for a specific selected range are necessary. The inclusion of artificial spectra prior to the use of genetic algorithms improved RMSEP by 10%. This method gave a relative RMSEP improvement of 46% compared with a standard PLS of full spectral length.
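
RMSEP is conventionally the root mean square error of prediction over an independent test set, and a "relative improvement" is the percentage reduction against a baseline model. A minimal Python sketch of both quantities follows; the function names and the numbers are made up for illustration and are not values from the paper.

```python
import math


def rmsep(predicted, reference):
    """Root mean square error of prediction over a test set."""
    if len(predicted) != len(reference) or len(predicted) == 0:
        raise ValueError("need two non-empty sequences of equal length")
    squared = [(p - r) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared) / len(squared))


def relative_improvement(baseline_rmsep, new_rmsep):
    """Percentage reduction of RMSEP relative to a baseline model."""
    return (baseline_rmsep - new_rmsep) / baseline_rmsep * 100.0


# Hypothetical API contents in % (w/w): reference assay vs. model predictions.
reference_values = [0.50, 1.00, 2.00, 3.00]
predicted_values = [0.55, 0.95, 2.05, 2.95]
print(round(rmsep(predicted_values, reference_values), 3))
```

RMSEP carries the units of the predicted property, so it should be read against the calibrated concentration range.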

8.
Shotgun proteomics experiments require the collection of thousands of tandem mass spectra; these sets of data will continue to grow as new instruments become available that can scan at even higher rates. Such data contain substantial amounts of redundancy with spectra from a particular peptide being acquired many times during a single LC-MS/MS experiment. In this article, we present MS2Grouper, an algorithm that detects spectral duplication, assesses groups of related spectra, and replaces these groups with synthetic representative spectra. Errors in detecting spectral similarity are corrected using a paraclique criterion: spectra are only assessed as groups if they are part of a clique of at least three completely interrelated spectra or are subsequently added to such cliques by being similar to all but one of the clique members. A greedy algorithm constructs a representative spectrum for each group by iteratively removing the tallest peaks from the spectral collection and matching to peaks in the other spectra. This strategy is shown to be effective in reducing spectral counts by up to 20% in LC-MS/MS datasets from protein standard mixtures and proteomes, reducing database search times without a concomitant reduction in identified peptides.

9.
Elemental composition determination of volatile organic compounds through high mass accuracy and isotope pattern matching could not be routinely achieved with a unit-mass resolution mass spectrometer until the recent development of the comprehensive instrument line-shape calibration technology. Through this unique technology, both m/z values and mass spectral peak shapes are calibrated simultaneously. Of fundamental importance is that calibrated mass spectra have symmetric and mathematically known peak shapes, which makes it possible to deconvolute overlapped monoisotopes and their 13C-isotope peaks and achieve accurate mass measurements. The key experimental requirements for the measurements are to acquire true raw data in a profile or continuum mode with the acquisition threshold set to zero. A total of 13 ions from Chinese rose oil were analyzed with internal calibration. Most of the ions produced high mass accuracy of better than 5 mDa and high spectral accuracy of better than 99%. These results allow five tested ions to be identified with unique elemental compositions and the other eight ions to be determined as a top match from multiple candidates based on spectral accuracy. One of them, a coeluted component (Nerol) with m/z 154, could not be identified by conventional GC/MS (gas chromatography/mass spectrometry) and library search. Such effective determination for elemental compositions of the volatile organic compounds with a unit-mass resolution quadrupole system is obviously attributed to the significant improvement of mass accuracy. More importantly, high spectral accuracy available through the instrument line-shape calibration enables highly accurate isotope pattern recognition for unknown identification.

10.
High mass measurement accuracy of peptides in enzymatic digests is critical for confident protein identification and characterization in proteomics research. Fourier transform ion cyclotron resonance mass spectrometry (FT-ICR-MS) can provide low or sub-ppm mass accuracy and ultrahigh resolving power. While for ESI-FT-ICR-MS, the mass accuracy is generally 1 ppm or better, with matrix-assisted laser desorption/ionization (MALDI)-FT-ICR-MS, the mass errors can vary from sub-ppm with internal calibration to over 100 ppm with conventional external calibration. A novel calibration method for 15N-metabolically labeled peptides from a batch digest of a proteome is described which corrects for space charge induced frequency shifts in FT-ICR spectra without using an internal calibrant. This strategy utilizes the information from the mass difference between the 14N/15N peptide peak pairs to correct for space charge induced mass shifts after data collection. A procedure for performing the mass correction has been written into a computer program and has been successfully applied to high-performance liquid chromatography-MALDI-FT-ICR-MS measurement of 15N-metabolically labeled proteomes. We have achieved an average measured mass error of 1.0 ppm and a standard deviation of 3.5 ppm for 900 peptides from 68 MALDI-FT-ICR mass spectra of the proteolytic digest of a proteome from Methanococcus maripaludis.

11.
Mass spectrometry imaging by Fourier transform ion cyclotron resonance (FT-ICR) yields hundreds of unique peaks, many of which cannot be resolved by lower performance mass spectrometers. The high mass accuracy and high mass resolving power allow confident identification of small molecules and lipids directly from biological tissue sections. Here, calibration strategies for FT-ICR MS imaging were investigated. Sub-parts-per-million mass accuracy is demonstrated over an entire tissue section. Ion abundance fluctuations are corrected by addition of total and relative ion abundances for a root-mean-square error of 0.158 ppm on 16,764 peaks. A new approach for visualization of FT-ICR MS imaging data at high resolution is presented. The "Mosaic Datacube" provides a flexible means to visualize the entire mass range at a mass spectral bin width of 0.001 Da. The high resolution Mosaic Datacube resolves spectral features not visible at lower bin widths, while retaining the high mass accuracy from the calibration methods discussed.

12.
Mass accuracy is a key parameter in proteomic experiments, improving the specificity and success rates of peptide identification. Advances in instrumentation now make it possible to routinely obtain high resolution data in proteomic experiments. To compensate for drifts in instrument calibration, a compound of known mass is often employed. This ‘lock mass’ provides an internal mass standard in every spectrum. Here we take advantage of the complexity of typical peptide mixtures in proteomics to eliminate the requirement for a physical lock mass. We find that mass scale drift is primarily a function of the m/z and the elution time dimensions. Using a subset of high confidence peptide identifications from a first pass database search, which effectively substitute for the lock mass, we set up a global mathematical minimization problem. We perform a simultaneous fit in two dimensions using a function whose parameterization is automatically adjusted to the complexity of the analyzed peptide mixture. Mass deviation of the high confidence peptides from their calculated values is then minimized globally as a function of both m/z value and elution time. The resulting recalibration function performs as well as or better than adding a lock mass from laboratory air to LTQ-Orbitrap spectra. This ‘software lock mass’ drastically improves mass accuracy compared with mass measurement without lock mass (up to 10-fold), with none of the experimental cost of a physical lock mass, and it is integrated into the freely available MaxQuant analysis pipeline.

13.
High throughput identification of proteins by peptide mass fingerprinting requires an efficient means of picking peaks from mass spectra. Here, we report the development of a peak harvester to automatically pick monoisotopic peaks from spectra generated on matrix-assisted laser desorption/ionisation time of flight (MALDI-TOF) mass spectrometers. The peak harvester uses advanced mathematical morphology and watershed algorithms to first process spectra to stick representations. Subsequently, Poisson modelling is applied to determine which peak in an isotopically resolved group represents the monoisotopic mass of a peptide. We illustrate the features of the peak harvester with mass spectra of standard peptides, digests of gel-separated bovine serum albumin, and with Escherichia coli proteins prepared by two-dimensional polyacrylamide gel electrophoresis. In all cases, the peak harvester proved effective in picking monoisotopic peaks similar to those picked by an experienced human operator, and also proved effective in identifying monoisotopic masses in cases where the isotopic distributions of peptides were overlapping. The peak harvester can be operated in an interactive mode, or can be completely automated and linked through to peptide mass fingerprinting protein identification tools to achieve high throughput automated protein identification.

14.
In quantitative on-line/in-line monitoring of chemical and bio-chemical processes using spectroscopic instruments, multivariate calibration models are indispensable for the extraction of chemical information from complex spectroscopic measurements. The development of reliable multivariate calibration models is generally time-consuming and costly. Therefore, once a reliable multivariate calibration model is established, it is expected to be used for an extended period. However, any change in the instrumental response or variations in the measurement conditions can render a multivariate calibration model invalid. In this contribution, a new method, spectral space transformation (SST), has been developed to maintain the predictive abilities of multivariate calibration models when the spectrometer or measurement conditions are altered. SST tries to eliminate the spectral differences induced by the changes in instruments or measurement conditions through the transformation between two spectral spaces spanned by the corresponding spectra of a subset of standardization samples measured on two instruments or under two sets of experimental conditions. The performance of the method has been tested on two data sets comprising NIR and MIR spectra. The experimental results show that SST can achieve satisfactory analyte predictions from spectroscopic measurements subject to spectrometer/probe alteration, when only a few standardization samples are used. Compared with the existing popular methods designed for the same purpose, i.e. global PLS, univariate slope and bias correction (SBC) and piecewise direct standardization (PDS), SST has the advantages of implementation simplicity, wider applicability and better performance in terms of predictive accuracy.
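
Of the comparison methods named in this abstract, univariate slope and bias correction (SBC) is the simplest to illustrate. The Python sketch below shows that baseline only, not SST itself, and assumes the usual formulation: predictions made from spectra of the standardization samples on the secondary instrument are regressed by ordinary least squares against the values obtained on the primary instrument. The function and variable names are hypothetical.

```python
def fit_slope_bias(secondary_predictions, primary_values):
    """Least-squares slope and bias mapping secondary predictions to primary values."""
    n = len(secondary_predictions)
    if n != len(primary_values) or n < 2:
        raise ValueError("need two sequences of equal length with at least 2 points")
    mean_x = sum(secondary_predictions) / n
    mean_y = sum(primary_values) / n
    sxx = sum((x - mean_x) ** 2 for x in secondary_predictions)
    if sxx == 0:
        raise ValueError("secondary predictions must not all be identical")
    sxy = sum(
        (x - mean_x) * (y - mean_y)
        for x, y in zip(secondary_predictions, primary_values)
    )
    slope = sxy / sxx
    bias = mean_y - slope * mean_x
    return slope, bias


def apply_slope_bias(predictions, slope, bias):
    """Correct new predictions from the secondary instrument."""
    return [slope * p + bias for p in predictions]


# Illustrative standardization set (made-up numbers).
slope, bias = fit_slope_bias([1.0, 2.0, 3.0], [3.0, 5.0, 7.0])
print(apply_slope_bias([4.0], slope, bias))
```

SBC corrects the model output rather than the spectra, so it can only compensate for differences that act on the predictions as a linear shift and scale.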

15.
The objective of this work was to investigate the possibility of applying the tunable diode laser spectroscopy (TDLS) technique to isotope measurements of gaseous uranium hexafluoride (UF6). Spectra of a uranium hexafluoride gas mixture were investigated using two different Fourier transform spectrometers, Vector 22 and Bruker 66v. The observed spectral features were identified and model spectra of the different gas mixture components were developed. The optimal spectral range for measurements was determined to be near the maximum of the UF6 combination band ν1+ν3. A laboratory prototype of the multi-channel instrument under consideration, based on tunable diode lasers, was built, and algorithms were developed to measure gaseous UF6 isotopic ratios. The diode laser used operated at wavelengths near λ = 7.68 μm and was placed in a liquid nitrogen cooled cryostat. Three instrument channels were used for laser frequency calibration and spectra recording. The instrument was tested in measurements of real UF6 gas mixtures. The measurement accuracy was analyzed and error sources were identified. The root-mean-square random error in the 235U isotopic content is characterized by a spread of about 0.27% for quick measurements (at times less than 1 min) and 1% for periods of more than an hour. It was estimated that the measurement accuracy could be improved by at least an order of magnitude by minimizing the error sources.

16.
A partial least squares (PLS) and wavelet transform hybrid model is proposed to analyze the carbon content of coal by using laser-induced breakdown spectroscopy (LIBS). The hybrid model is composed of two steps of wavelet analysis procedures, which include environmental denoising and background noise reduction, to pretreat the LIBS spectrum. The processed wavelet coefficients, which contain the discrete line information of the spectra, were taken as inputs for the PLS model for calibration and prediction of the carbon content. A higher signal-to-noise ratio of the carbon line was obtained after environmental denoising, and the best decomposition level was determined after background noise reduction. The hybrid model resulted in a significant improvement over the conventional PLS method under different ambient environments, which include air, argon, and helium. The average relative error of carbon decreased from 2.74 to 1.67% under an ambient helium environment, which indicated a significantly improved accuracy in the measurement of carbon in coal. The best results obtained under an ambient helium environment could be partly attributed to the smallest interference by noise after wavelet denoising. A similar improvement was observed in ambient air and argon environments, thereby proving the applicability of the hybrid model under different experimental conditions.

17.
Analytical Letters, 2012, 45(2): 340-348
Synchronous 2D correlation spectroscopy is proposed for the first time as a means of selecting informative spectral intervals in PLS calibration. The proposed method can extract the spectral intervals related to the analyte. Its application to the NIR/PLS determination of quercetin in an extract of Ginkgo biloba leaves showed that the method could find an optimized region that improves the performance of the corresponding PLS model, giving a lower root mean square error of prediction (RMSEP) than the results obtained using the whole spectrum and interval PLS.

18.
An application of the multivariate calibration technique of partial least-squares (PLS) regression to near-infrared spectra of a fiber-optic sensor based on the evanescent wave principle is presented. The sensing element consists of a quartz glass fiber with a silicone cladding which enriches nonpolar water contaminants. Due to the interaction of the extracted molecules with the part of the light which is transmitted in the evanescent wave zone of the cladding, absorbance spectra of the contaminants can be collected. In view of a sensor application for in-situ environmental analysis, aqueous solutions of chlorinated hydrocarbon solvents (CHS), which are often found as major water contaminants, have been measured. PLS regression was applied to three sets of CHS samples representing typical features of NIR evanescent wave spectral data: strong overlapping of the absorption bands of different CHS components, peak distortions due to temperature variations between reference and sample measurement, and noisy data at analyte concentrations near the limit of detection. For trichloroethene and 1,1-dichloroethene, where the calibration model was built for samples within a small concentration range of 1–9 mg l⁻¹, satisfactory prediction results could be obtained, with a relatively small root-mean-square error of 0.3 mg l⁻¹ compared to analytical reference measurements. In contrast, for a three component system of dichloromethane, trichloromethane and trichloroethene with strongly overlapping absorption bands, where samples over a very broad concentration range from 3 to 4940 mg l⁻¹ were included in the PLS model, the prediction accuracy decreased enormously and for some samples strong deviations between real and predicted data occurred. Nevertheless, applying multivariate calibration to this difficult system with similar spectral features and huge differences in the concentration of the species allowed an acceptable spectral distinction and at least a semi-quantitative determination of the CHS species.

19.
Recent developments in proteomics have revealed a bottleneck in bioinformatics: high-quality interpretation of acquired MS data. The ability to generate thousands of MS spectra per day, and the demand for this, makes manual methods inadequate for analysis and underlines the need to transfer the advanced capabilities of an expert human user into sophisticated MS interpretation algorithms. The identification rate in current high-throughput proteomics studies is not only a matter of instrumentation. We present software for high-throughput PMF identification, which enables robust and confident protein identification at higher rates. This has been achieved by automated calibration, peak rejection, and use of a meta search approach which employs various PMF search engines. The automatic calibration consists of a dynamic, spectral information-dependent algorithm, which combines various known calibration methods and iteratively establishes an optimised calibration. The peak rejection algorithm filters signals that are unrelated to the analysed protein by use of automatically generated and dataset-dependent exclusion lists. In the "meta search" several known PMF search engines are triggered and their results are merged by use of a meta score. The significance of the meta score was assessed by simulation of PMF identification with 10,000 artificial spectra resembling a data situation close to the measured dataset. By means of this simulation the meta score is linked to expectation values as a statistical measure. The presented software is part of the proteome database ProteinScape which links the information derived from MS data to other relevant proteomics data. We demonstrate the performance of the presented system with MS data from 1891 PMF spectra. As a result of automatic calibration and peak rejection the identification rate increased from 6% to 44%.
Abbreviations: 2-DE, two-dimensional gel electrophoresis; MALDI, matrix-assisted laser desorption ionisation; PMF, peptide mass fingerprinting; MS, mass spectrometry; TOF, time of flight

20.
A comprehensive investigation was performed to understand the influence of sequence scrambling in peptide ions on peptide identification results. To achieve this, four tandem mass spectrometry datasets, with scrambled ions included and with them excluded, were analyzed by Crux, X!Tandem, SpectraST, Lutefisk, and PepNovo. While the algorithms differed in their performance, an increase in the number of correctly identified peptides was generally observed when scrambled ions were removed, with the exception of the SpectraST algorithm. However, the variation of the match scores upon removal was unpredictable. Following these investigations, an interpretation was given of how the scrambled ions affect peptide identification. Lastly, a simulated theoretical mass spectral library derived from the NIST peptide Libraries was constructed and searched by SpectraST to study whether scrambled ions in predicted mass spectra could affect peptide identification. Consistent with the peptide library search results, no significant variations in dot product scores or in peptide identification results were observed when these ions were included in the theoretical MS/MS spectra. Of the five algorithms used, SpectraST and Crux provided the most robust results, whereas X!Tandem, PepNovo, and Lutefisk were sensitive to the presence of the scrambled ions, especially the latter two de novo sequencing algorithms.
