Similar literature
 20 similar articles found (search time: 218 ms)
1.
Massive amounts of tandem mass spectra are produced in high-throughput proteomics studies, and manual interpretation of these spectra is not feasible. Instead, search engines are used to match the tandem mass spectra with sequence information contained in proteomics and genomics databases. Typically, these search engines provide a list of the best-matching peptide sequences for an individual tandem mass spectrum, together with scores that are loosely related to the confidence level of the match. Many peptide tandem mass spectra search engines have been reported. These search engines provide very different results depending on the type of mass spectrometer used and their input parameters. Here we describe a comparative analysis of different search engines using validated test sets of tandem mass spectra. We have defined test sets of MS/MS spectra derived from high-throughput proteomics experiments performed by HPLC-ESI-MS/MS on ion trap (LCQ) mass spectrometers and on tandem quadrupole time-of-flight mass spectrometers with pulsar functionality (Qstar Pulsar). We analyzed the ability of the different search engines to identify the correct peptides, and the cross-validation of the different search engines.

2.
In high-throughput proteomics, the bottom-up approach, which is based on MS/MS analysis of tryptic peptides, has become a widely used method for the identification of proteins. Separation methodologies that use IEF of tryptic peptides have recently been introduced and provide an extra dimension of peptide separation. In addition to its great fractionation capability, tryptic peptide prefractionation by IEF can also increase the protein identification success. The pI information gained for each peptide can be used in a post-database search filtering step. We introduce a filtering algorithm that is based on the comparison of the experimental and theoretical pI values to validate peptide identifications made by MS/MS data search engines.
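The post-search pI filter described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the `PeptideMatch` record, the `filter_by_pi` function, and the 0.5 pH-unit tolerance are all hypothetical choices made for illustration, and the theoretical pI is assumed to have been computed elsewhere.

```python
from dataclasses import dataclass


@dataclass
class PeptideMatch:
    """One search-engine hit (hypothetical record layout)."""
    sequence: str
    score: float
    theoretical_pi: float   # pI predicted from the sequence (computed elsewhere)
    experimental_pi: float  # pI of the IEF fraction the peptide was found in


def filter_by_pi(matches, tolerance=0.5):
    """Keep matches whose theoretical pI lies within `tolerance` pH units
    of the experimental pI. The tolerance value is an assumption."""
    return [
        m for m in matches
        if abs(m.theoretical_pi - m.experimental_pi) <= tolerance
    ]


matches = [
    PeptideMatch("PEPTIDEA", 42.0, theoretical_pi=4.1, experimental_pi=4.3),
    PeptideMatch("PEPTIDEB", 38.0, theoretical_pi=8.9, experimental_pi=4.3),
]
kept = filter_by_pi(matches)
print([m.sequence for m in kept])
```

In practice the tolerance would probably be tied to the pH width of the IEF fractions rather than fixed.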

3.
Recent developments in proteomics have revealed a bottleneck in bioinformatics: high-quality interpretation of acquired MS data. The ability to generate thousands of MS spectra per day, and the demand for this, makes manual methods inadequate for analysis and underlines the need to transfer the advanced capabilities of an expert human user into sophisticated MS interpretation algorithms. The identification rate in current high-throughput proteomics studies is not only a matter of instrumentation. We present software for high-throughput PMF identification, which enables robust and confident protein identification at higher rates. This has been achieved by automated calibration, peak rejection, and use of a meta search approach which employs various PMF search engines. The automatic calibration consists of a dynamic, spectral information-dependent algorithm, which combines various known calibration methods and iteratively establishes an optimised calibration. The peak rejection algorithm filters signals that are unrelated to the analysed protein by use of automatically generated and dataset-dependent exclusion lists. In the "meta search" several known PMF search engines are triggered and their results are merged by use of a meta score. The significance of the meta score was assessed by simulation of PMF identification with 10,000 artificial spectra resembling a data situation close to the measured dataset. By means of this simulation the meta score is linked to expectation values as a statistical measure. The presented software is part of the proteome database ProteinScape which links the information derived from MS data to other relevant proteomics data. We demonstrate the performance of the presented system with MS data from 1891 PMF spectra. 
As a result of automatic calibration and peak rejection the identification rate increased from 6% to 44%.
Abbreviations: 2-DE, two-dimensional gel electrophoresis; MALDI, matrix-assisted laser desorption ionisation; PMF, peptide mass fingerprinting; MS, mass spectrometry; TOF, time of flight.

4.
Database searching is the technique of choice for shotgun proteomics, and to date much research effort has been spent on improving its effectiveness. However, database searching faces a serious efficiency challenge, given the large numbers of mass spectra and the ever-faster growth of peptide databases resulting from genome translations, enzymatic digestions, and post-translational modifications. In this study, we conducted systematic research on speeding up database search engines for protein identification and illustrate the key points with the specific design of the pFind 2.1 search engine as a running example. Firstly, by constructing peptide indexes, pFind achieves a two- to three-fold speedup compared with searching without peptide indexes. Secondly, by constructing indexes for observed precursor and fragment ions, pFind achieves a further two-fold speedup. As a result, pFind compares very favorably with predominant search engines such as Mascot, SEQUEST and X!Tandem. Copyright © 2010 John Wiley & Sons, Ltd.

5.
Liquid chromatography-mass spectrometry (LC-MS) datasets can be compared or combined following chromatographic alignment. Here we describe a simple solution to the specific problem of aligning one LC-MS dataset and one LC-MS/MS dataset, acquired on separate instruments from an enzymatic digest of a protein mixture, using feature extraction and a genetic algorithm. First, the LC-MS dataset is searched within a few ppm of the calculated theoretical masses of peptides confidently identified by LC-MS/MS. A piecewise linear function is then fitted to these matched peptides using a genetic algorithm with a fitness function that is insensitive to incorrect matches but sufficiently flexible to adapt to the discrete shifts common when comparing LC datasets. We demonstrate the utility of this method by aligning ion trap LC-MS/MS data with accurate LC-MS data from an FTICR mass spectrometer and show how hybrid datasets can improve peptide and protein identification by combining the speed of the ion trap with the mass accuracy of the FTICR, similar to using a hybrid ion trap-FTICR instrument. We also show that the high resolving power of FTICR can improve precision and linear dynamic range in quantitative proteomics. The alignment software, msalign, is freely available as open source.

6.
Owing to its labile nature, a new role for cysteine sulfenic acid (–SOH) modification has emerged. This oxidative modification modulates protein function by acting as a redox switch during cellular signaling. The identification of proteins that undergo this modification represents a methodological challenge, and its resolution remains a matter of current interest. The development of strategies to chemically modify cysteinyl-containing peptides for liquid chromatography–tandem mass spectrometry (LC-MS/MS) analysis has increased significantly within the past decade. The method of choice to selectively label sulfenic acid is based on the use of dimedone or its derivatives. For these chemical probes to be effective on a proteome-wide level, their reactivity toward –SOH must be high to ensure reaction completion. In addition, the presence of an adduct should not interfere with electrospray ionization, with the efficiency of induced dissociation in MS/MS experiments, or with the identification of Cys-modified peptides by automated database searching algorithms. Herein, we employ a targeted proteomics approach to study the electrospray ionization and fragmentation effects of different –SOH-specific probes and compare them to commonly used alkylating agents. We then extend our study to a whole proteome extract using shotgun proteomic approaches. These experiments enable us to demonstrate that dimedone adducts neither suppress electrospray ionization nor impede product ion assignment by automated search engines, which detect a +138 Da increase relative to unmodified peptides. Collectively, these results suggest that dimedone can be a powerful tool to identify sulfenic acid modifications by high-throughput shotgun proteomics of a whole proteome. Copyright © 2014 John Wiley & Sons, Ltd.

7.
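Protein quantification is an important means of exploring the onset and progression of disease and of finding new drug targets. In shotgun proteomics, the commonly used quantification methods include methods based on mass spectral peak intensities after isotope labeling and label-free methods. Depending on the type of data used, label-free methods fall into two classes: those based on the number of spectra of identified proteins (spectral counts) and those based on mass spectral peak intensities. This study mainly uses the EM algorithm to improve spectral-count-based quantification, and uses whole-proteome yeast protein abundances obtained by Western blotting to validate the EM-improved quantification. The results show that the correlation between spectral counts and protein abundance is somewhat higher after the improvement than before. In addition, these data were used to compare several of the main spectral-count-based models: the PAI model performed best, followed by the SpS model, while the emPAI model was the least suitable for protein quantification.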

8.
Shotgun proteomics experiments require the collection of thousands of tandem mass spectra; these sets of data will continue to grow as new instruments become available that can scan at even higher rates. Such data contain substantial amounts of redundancy, with spectra from a particular peptide being acquired many times during a single LC-MS/MS experiment. In this article, we present MS2Grouper, an algorithm that detects spectral duplication, assesses groups of related spectra, and replaces these groups with synthetic representative spectra. Errors in detecting spectral similarity are corrected using a paraclique criterion: spectra are only assessed as groups if they are part of a clique of at least three completely interrelated spectra or are subsequently added to such cliques by being similar to all but one of the clique members. A greedy algorithm constructs a representative spectrum for each group by iteratively removing the tallest peaks from the spectral collection and matching to peaks in the other spectra. This strategy is shown to be effective in reducing spectral counts by up to 20% in LC-MS/MS datasets from protein standard mixtures and proteomes, reducing database search times without a concomitant reduction in identified peptides.

9.
Electrospray ionization ion trap mass spectrometry (ESI-ITMS) coupled to a two-dimensional liquid chromatographic separation was applied to the identification of peptides in antimicrobial fractions of the aqueous extracts of nine Italian cheese varieties. In particular, the chromatographic fractions collected during a preliminary fast protein liquid chromatography (FPLC) separation of the cheese extracts were assayed for antimicrobial activity towards Lactobacillus sakei A15. Active fractions were subsequently analyzed by reversed-phase high-performance liquid chromatography electrospray ionization sequential mass spectrometry (HPLC/ESI-ITMSn, with n up to 3). Peptide identification was then performed starting from a conventional proteomics approach based on tandem mass spectrometric (MS/MS) analysis followed by database searching. In many cases this strategy had to be supplemented by a careful correlation between spectral information and predicted peptide fragmentation, in order to reach unambiguous identifications. When even this integrated approach failed, MS3 measurements provided decisive information on the amino acid sequence of some peptides, through fragmentation of pendant groups along the peptide chain. As a result, 45 peptides, all arising from hydrolysis of milk caseins, were identified in nine antimicrobial FPLC fractions of aqueous extracts obtained from five of the nine cheese varieties considered. Many of them corresponded to peptides already known to exhibit biological activity.

10.
It has been observed that a modified peptide and its non-modified counterpart, when analyzed with reverse phase liquid chromatography, usually share a very similar elution property [1–3]. Inasmuch as this property is common to many different types of protein modifications, we propose an informatics-based approach, featuring the generation of segmental average mass spectra (saMS), that is capable of locating different types of modified peptides in two-dimensional liquid chromatography–mass spectrometric (LC–MS) data collected for regular protease digests from proteins in gels or solutions. To enable the localization of these peptides in the LC–MS map, we have implemented a set of computer programs, or the saMS package, that perform the needed functions, including generating a complete set of segmental average mass spectra, compiling the peptide inventory from the Sequest/TurboSequest results, searching modified peptide candidates and annotating a tandem mass spectrum for final verification. Using ROCK2 as an example, our programs were applied to identify multiple types of modified peptides, such as phosphorylated and hexosylated ones, which particularly include those peptides that could have been ignored due to their peculiar fragmentation patterns and consequent low search scores. Hence, we demonstrate that, when complemented with peptide search algorithms, our approach and the entailed computer programs can add the sequence information needed for bolstering the confidence of data interpretation by the present analytical platforms and facilitate the mining of protein modification information out of complicated LC–MS/MS data.

11.
The quantity and variable quality of data that can be generated from liquid chromatography (LC)/mass spectrometry (MS)-based proteomics analyses create many challenges in interpreting the spectra in terms of the actual proteins in a complex sample. In spite of improvements in algorithms that match putative peptide sequences to MS/MS spectra, the assembly of these lists of possible or probable peptides into a 'correct' set of proteins is still problematic. We have observed a trend in a simple relationship, derived from standard database search outputs, which can be useful in assessing the quality of an MS/MS-based protein identification. Specifically, the ratio of the protein score to the number of non-redundant peptides, or average peptide score (APS), can facilitate initial filtering of database search results in addition to providing a useful measure of confidence for the proteins identified. This parameter has been applied to results from the analysis of multi-protein complexes derived from pull-down experiments analyzed using a two-dimensional LC/MS/MS workflow. In particular, the complex list of protein identifications derived from a drug affinity pull-down with immobilized ampicillin and an E. coli lysate was greatly simplified by applying the APS as a filter, allowing for facile identification of the penicillin-binding proteins known to interact with ampicillin. Furthermore, an APS threshold can be used for any data sets derived from electrospray ionization (ESI)- or matrix-assisted laser desorption/ionization (MALDI)-MS/MS experiments and is also not specific to any database search program.
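The APS defined in this abstract is a plain ratio, so a short Python sketch can make the filtering step concrete. The function names, the `(name, protein_score, n_peptides)` tuple layout, and the threshold of 40 are illustrative assumptions; the abstract does not state which threshold the authors used.

```python
def average_peptide_score(protein_score, n_nonredundant_peptides):
    """APS = protein score / number of non-redundant peptides."""
    if n_nonredundant_peptides <= 0:
        raise ValueError("a protein hit needs at least one peptide")
    return protein_score / n_nonredundant_peptides


def filter_by_aps(hits, threshold):
    """Keep (name, protein_score, n_peptides) hits whose APS meets the
    threshold. The threshold is a free parameter chosen by the user."""
    return [
        hit for hit in hits
        if average_peptide_score(hit[1], hit[2]) >= threshold
    ]


# Hypothetical search output: (protein name, protein score, non-redundant peptides)
hits = [("protA", 300.0, 5), ("protB", 90.0, 6)]
confident = filter_by_aps(hits, threshold=40.0)
print(confident)
```

Here protA has an APS of 60 and passes, while protB, with many weak peptides and an APS of 15, is filtered out.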

12.
Gel-based matrix-assisted laser desorption ionization-time of flight tandem mass spectrometry (MALDI TOF/TOF MS) is one of the dominant methods of current proteomics, utilizing both peptide mass fingerprinting (PMF) and peptide fragment fingerprinting (PFF) for protein identification on a spot-to-spot basis. However, the unique impact of the quality of the corresponding mass spectra remains largely unreported, and has motivated the development and use of an automatic spectra-assessment method. In this study, a multivariate regression approach has been utilized to assess spectral quality for both PMF and PFF spectra obtained from MALDI TOF/TOF MS. The assessment index has been applied to investigations of MASCOT search results. Systematic examination of two large-scale sets of human liver tissue data has shown that spectral quality was a key factor in significant matching. Based on large-scale investigations of individual PMF searches, individual PFF searches and their combination, the filtering of poor-quality spectra or spots proves to be an efficient way to improve the search efficiency of all search modes in MASCOT. Meanwhile, a validation method based on score differences between normal and decoy (reverse or random) database searches is proposed to precisely define the positive matches. Further analysis showed that spectral quality assessment was also efficient in representing the quality of 2-DE gel spots and promoted the discovery of potential post-translational modifications.

13.
Nine replicate samples of peptides from soybean leaves, each spiked with a different concentration of bovine apotransferrin peptides, were analyzed on a mass spectrometer using multidimensional protein identification technology (MudPIT). Proteins were detected from the peptide tandem mass spectra, and the numbers of spectra were statistically evaluated for variation between samples. The results corroborate prior knowledge that combining spectra from replicate samples increases the number of identifiable proteins and that a summed spectral count for a protein increases linearly with increasing molar amounts of protein. Furthermore, statistical analysis of spectral counts for proteins in two- and three-way comparisons between replicates and combined replicates revealed little significant variation arising from run-to-run differences or data-dependent instrument ion sampling that might falsely suggest differential protein accumulation. In these experiments, spectral counting was enabled by PANORAMICS, probability-based software that predicts proteins detected by sets of observed peptides. Three alternative approaches to counting spectra were also evaluated by comparison. As the counting thresholds were changed from weaker to more stringent, the accuracy of ratio determination also changed. These results suggest that thresholds for counting can be empirically set to improve relative quantitation. Altogether, the data confirm the accuracy and reliability of label-free spectral counting in the relative, quantitative analysis of proteins between samples.

14.
Mass spectrometry-based proteomic experiments have advanced considerably over the past decade, with high-resolution and high-mass-accuracy tandem mass spectrometry (MS/MS) capabilities now allowing routine interrogation of large peptides and proteins. Often a major bottleneck to 'top-down' proteomics, however, is the ability to identify and characterize the complex peptides or proteins based on the acquired high-resolution MS/MS spectra. For biological samples containing proteins with multiple unpredicted processing events, unsupervised identifications can be particularly challenging. Described here is a newly created search algorithm (MAR) designed for the identification of experimentally detected peptides or proteins. This algorithm relies only on a predefined list of 'differential' modifications (e.g. phosphorylation) and a FASTA-formatted protein database, and is not constrained to full-length proteins for identification. The algorithm is further powered by the ability to leverage identified mass differences between chromatographically separated ions within full-scan MS spectra to automatically generate a list of likely 'differential' modifications to be searched. The utility of the algorithm is demonstrated with the identification of 54 unique polypeptides from human apolipoprotein enriched from the high-density lipoprotein particle (HDL), and searching time benchmarks demonstrate scalability (12 high-resolution MS/MS scans searched per minute with modifications considered). This parallelizable algorithm provides an additional solution for converting high-quality MS/MS data of multiply processed proteins into reliable identifications.

15.
Peptide acetylation and dimethylation have been widely used to derivatize primary amino groups (peptide N-termini and the ε-amino group of lysines) for chemical isotope labeling of quantitative proteomics or for affinity tag labeling for selection and enrichment of labeled peptides. However, peptide acetylation results in signal suppression during electrospray ionization (ESI) due to charge neutralization. In contrast, dimethylated peptides show increased ionization efficiency after derivatization, since dimethylation increases hydrophobicity and maintains a positive charge on the peptide under common LC conditions. In this study, we quantitatively compared the ESI efficiencies of acetylated and dimethylated model peptides and tryptic peptides of BSA. Dimethylated peptides showed higher ionization efficiency than acetylated peptides for both model peptides and tryptic BSA peptides. At the proteome level, peptide dimethylation led to better protein identification than peptide acetylation when tryptic peptides of mouse brain lysate were analyzed with LC-ESI-MS/MS. These results demonstrate that dimethylation of tryptic peptides enhanced ESI efficiency and provided up to two-fold improved protein identification sensitivity in comparison with acetylation. Copyright © 2016 John Wiley & Sons, Ltd.

16.
Balgley BM, Wang W, Song T, Fang X, Yang L, Lee CS. Electrophoresis 2008, 29(14): 3047-3054
Multidimensional separation of the peptides resulting from enzymatic digestion of complex protein mixtures prior to MS/MS, namely shotgun proteomics, is increasingly utilized for large-scale identification and quantitation of proteins. Inherent to the performance of proteomic measurements is the resolving power of each of the separations, both separately and in combination. By simply raising the number of CIEF fractions, the resulting enhancement in the overall peak capacity of combined CIEF/nano-RPLC separations greatly reduces the complexity of eluted peptides prior to MS detection and sequencing and increases the proteome coverage. The capabilities of the CIEF-based proteome platform coupled with the spectral counting approach to confidently and reproducibly quantify proteins and changes in protein expression levels among samples are evaluated. Analytical reproducibility of relative protein abundance is determined to exhibit a Pearson R² value greater than 0.99 and a CV of 14.1%. The platform is demonstrated to be capable of measuring changes in protein expression as low as 1.5-fold, with confidence following multiple testing adjustment.

17.
The extent and effects of sequence scrambling in peptide ions during tandem mass spectrometry (MS/MS) have been examined using tryptic peptides from model proteins. Sequence-scrambled b ions appeared in about 35% of 43 tryptic peptides examined under MS/MS conditions. In general, these ions had relatively low abundances with averages of 8% and 16%, depending on the instrumentation used. A few tryptic peptides gave abundant scrambled b ions in MS/MS. However, peptide and protein identifications under proteomic conditions with Mascot were not affected, even for those peptides in which scrambling was prominent. From the 43 tryptic peptides that have been investigated, the conclusion is that sequence scrambling is unlikely to impact negatively on the accuracy of automated peptide and protein identifications in proteomics.

18.
Recently various methods for the N-terminal sulfonation of peptides have been developed for the mass spectrometric analyses of proteomic samples to facilitate de novo sequencing of the peptides produced. This paper describes the isotope-coded N-terminal sulfonation (ICenS) of peptides; this procedure allows both de novo peptide sequencing and quantitative proteomics to be studied simultaneously. As N-terminal sulfonation reagents, 13C-labeled 4-sulfophenyl[13C6]isothiocyanate (13C-SPITC) and unlabeled 4-sulfophenyl isothiocyanate (12C-SPITC) were synthesized. The experimental and reference peptide mixtures were derivatized independently using 13C-SPITC and 12C-SPITC and then combined to generate an isotopically labeled peptide mixture in which each isotopic pair differs in mass by 6 Da. Capillary reverse-phase liquid chromatography/tandem mass spectrometry experiments on the resulting peptide mixtures revealed several immediate advantages of ICenS in addition to the de novo sequencing capability of N-terminal sulfonation, namely, differentiation between N-terminal sulfonated peptides and unmodified peptides in mass spectra, differentiation between N- and C-terminal fragments in tandem mass spectra of multiply protonated peptides by comparing fragmentations of the isotopic pairs, and relative peptide quantification between proteome samples. We demonstrate that the combination of N-terminal sulfonation and isotope coding in the mass spectrometric analysis of proteomic samples is a viable method that overcomes many problems associated with current N-terminal sulfonation methods.
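Because each labeled/unlabeled pair in this scheme is separated by a fixed mass, pair detection can be sketched in a few lines of Python. This is an illustration only, not the authors' software: the function name, the tolerance, and the use of neutral monoisotopic masses are assumptions. The default spacing of 6.0201 Da is six times the 13C/12C mass difference (about 1.00335 Da), which corresponds to the nominal 6 Da quoted in the abstract.

```python
def find_isotopic_pairs(masses, delta=6.0201, tol=0.01):
    """Return (light, heavy) pairs of neutral masses separated by `delta` Da.

    `delta` and `tol` are illustrative defaults, not values from the paper.
    """
    ordered = sorted(masses)
    pairs = []
    for i, light in enumerate(ordered):
        for heavy in ordered[i + 1:]:
            diff = heavy - light
            if diff > delta + tol:
                break  # the list is sorted, so later masses are even further away
            if abs(diff - delta) <= tol:
                pairs.append((light, heavy))
    return pairs


# Hypothetical neutral peptide masses (Da) from one combined sample
masses = [1000.50, 1006.52, 1200.70, 1203.00]
print(find_isotopic_pairs(masses))
```

For multiply charged ions observed as m/z values, the expected spacing would be `delta` divided by the charge state, so charge assignment would have to precede pairing.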

19.
This paper describes an algorithm to apply proteotypic peptide sequence libraries to protein identifications performed using tandem mass spectrometry (MS/MS). Proteotypic peptides are those peptides in a protein sequence that are most likely to be confidently observed by current MS-based proteomics methods. Libraries of proteotypic peptide sequences were compiled from the Global Proteome Machine Database for Homo sapiens and Saccharomyces cerevisiae model species proteomes. These libraries were used to scan through collections of tandem mass spectra to discover which proteins were represented by the data sets, followed by detailed analysis of the spectra with the full protein sequences corresponding to the discovered proteotypic peptides. This algorithm (Proteotypic Peptide Profiling, or P3) resulted in sequence-to-spectrum matches comparable to those obtained by conventional protein identification algorithms using only full protein sequences, with a 20-fold reduction in the time required to perform the identification calculations. The proteotypic peptide libraries, the open source code for the implementation of the search algorithm and a website for using the software have been made freely available. Approximately 4% of the residues in the H. sapiens proteome were required in the proteotypic peptide library to successfully identify proteins.

20.
Many laboratories identify proteins by searching tandem mass spectrometry data against genomic or protein sequence databases. These database searches typically use the measured peptide masses or the derived peptide sequence and, in this paper, we focus on the latter. We study the minimum peptide sequence data requirements for definitive protein identification from protein sequence databases. Accurate mass measurements are not needed for definitive protein identification, even when a limited amount of sequence data is available for searching. This information has implications for mass spectrometry performance (and cost), database search strategies and proteomics research.
