Extreme value theory in some statistical analysis of genomic sequences |
| |
Authors: | Lily Wang, Pranab K. Sen |
| |
Institution: | (1) Department of Biostatistics, Vanderbilt University, S-2323 Medical Center North, Nashville, TN 37232-2158, USA; (2) Department of Biostatistics, University of North Carolina, Chapel Hill, NC 27599, USA |
| |
Abstract: | Because similarities in biological sequences often suggest similarities in structure and function, profile searches based on multiple alignments of families of related biological sequences provide useful starting points for experimental investigations in molecular biology. Strategies are formulated for determining the statistical significance of scores obtained by searching multiple-alignment profiles against databanks, while accommodating gaps in the profile. The methodology is validated by deriving the asymptotic distribution of the maximum of profile scores, even under weak dependence conditions. Simulation studies show that the proposed method is adequate for moderate sample sizes. The methodology is illustrated with an example study of an immunoglobulin protein domain. |
| |
Keywords: | Maximum profile scores; Protein profile; Sequence alignment; Statistical significance; Weakly dependent |
This article has been indexed by SpringerLink and other databases. |
|