Abstract: | In recent years, a growing number of algorithms and software tools have appeared for reconstructing partial or complete amino acid sequences from the mass spectra of peptides. However, with rare exceptions, such de novo sequences contain errors, caused for example by chemical noise in the spectrum or incomplete fragmentation. Posttranslational modifications of proteins cause additional difficulties. In this paper, we propose PepTiger, an algorithm that can correctly identify peptides in a database from de novo sequences that contain errors. The algorithm is based on approximate string matching and a specially developed scoring system that takes into account the string distance between the de novo sequence and the sequence of the candidate peptide in the database, the difference between their masses, and the similarity between the experimental mass spectrum and the theoretical spectrum of the candidate peptide. PepTiger correctly identifies a larger number of de novo sequences than other algorithms for identifying peptides from their de novo sequences. |
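The general idea of ranking database peptides against an error-containing de novo sequence can be illustrated with a minimal sketch. This is not the PepTiger scoring system, whose details are not given in the abstract: the function names, the plain Levenshtein distance, and the weights `w_dist` and `w_mass` are all assumptions chosen for illustration, and the spectrum-similarity term is omitted entirely. Only the standard monoisotopic residue masses are taken as given.

```python
from typing import List, Tuple

# Standard monoisotopic residue masses in daltons.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # mass of H2O added to the residue sum


def peptide_mass(seq: str) -> float:
    """Monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in seq) + WATER


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]


def candidate_score(de_novo: str, candidate: str,
                    w_dist: float = 1.0, w_mass: float = 0.1) -> float:
    """Hypothetical combined score; higher is better.

    The weights are illustrative assumptions, and no spectrum
    similarity term is included.
    """
    dist = edit_distance(de_novo, candidate)
    mass_diff = abs(peptide_mass(de_novo) - peptide_mass(candidate))
    return -(w_dist * dist + w_mass * mass_diff)


def rank_candidates(de_novo: str,
                    database: List[str]) -> List[Tuple[float, str]]:
    """Return (score, peptide) pairs sorted from best to worst."""
    scored = [(candidate_score(de_novo, pep), pep) for pep in database]
    return sorted(scored, reverse=True)
```

For example, `rank_candidates("PEPTLDE", ["PEPTIDE", "AAAAAAA", "PEPTIDEK"])` places `PEPTIDE` first: it differs from the de novo sequence by a single substitution, and because leucine and isoleucine are isobaric the mass difference is zero. A fuller treatment would also compare the experimental spectrum with each candidate's theoretical spectrum, as the abstract describes.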