Identification of family-specific residue packing motifs and their use for structure-based protein function prediction: I. Method development |
| |
Authors: | Deepak Bandyopadhyay, Jun Huan, Jan Prins, Jack Snoeyink, Wei Wang, Alexander Tropsha |
| |
Affiliation: | 1. GlaxoSmithKline, 1250 S. Collegeville Rd, Mail Stop UP12-210, Collegeville, PA, USA; 2. Department of Electrical Engineering and Computer Science, University of Kansas, Lawrence, KS, USA; 3. Department of Computer Science, University of North Carolina, CB#3175 Sitterson Hall, Chapel Hill, NC, USA; 4. School of Pharmacy, University of North Carolina, CB#7360 Beard Hall, Chapel Hill, NC, USA |
| |
Abstract: | Protein function prediction is one of the central problems in computational biology. We present a novel automated method for structure-based protein function prediction that uses libraries of local residue packing patterns common to most proteins in a known functional family. Critical to this approach is the representation of a protein structure as a graph in which vertices are residues (labeled by residue name) and edges connect residues that are in geometrical proximity. The approach has two steps. First, it uses a fast subgraph mining algorithm to find all occurrences of family-specific labeled subgraphs for all well-characterized protein structural and functional families. Second, it searches a new structure for occurrences of the set of motifs characteristic of a known family, using a graph index to speed up Ullmann's subgraph isomorphism algorithm. The confidence of function inference from structure depends on the number of family-specific motifs found in the query structure, compared with their distribution in a large non-redundant database of proteins. This method can assign a new structure to a specific functional family in cases where sequence alignments, sequence patterns, structural superposition, and active site templates fail to provide accurate annotation. |
| |
This article is indexed in SpringerLink and other databases. |
|
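The two steps summarized in the abstract (represent a structure as a residue-labeled graph, then count how many family motifs occur in it) can be illustrated with a minimal Python sketch. It is not the authors' implementation. Several things here are assumptions made only for illustration: edges come from a plain distance cutoff (the 8.0 value is arbitrary) rather than whatever proximity definition the paper uses, motif matching is naive backtracking rather than index-accelerated Ullmann search, and the function names `contact_graph`, `has_motif`, and `count_motifs` are hypothetical.

```python
from itertools import combinations
from math import dist


def contact_graph(residues, cutoff=8.0):
    """Build a labeled graph from (residue_name, (x, y, z)) tuples.

    Assumption: two residues share an edge when their representative
    points lie within `cutoff` of each other.
    """
    labels = [name for name, _ in residues]
    adj = {i: set() for i in range(len(residues))}
    for i, j in combinations(range(len(residues)), 2):
        if dist(residues[i][1], residues[j][1]) <= cutoff:
            adj[i].add(j)
            adj[j].add(i)
    return labels, adj


def has_motif(motif_labels, motif_edges, labels, adj):
    """Return True if the labeled motif occurs as a subgraph of the graph.

    Naive backtracking: motif vertices are mapped one at a time to
    distinct graph vertices with the same label, and every motif edge
    to an already-mapped vertex must also exist in the graph.
    """
    edges = {frozenset(e) for e in motif_edges}
    n = len(motif_labels)

    def extend(mapping):
        k = len(mapping)  # index of the next motif vertex to place
        if k == n:
            return True
        for v in range(len(labels)):
            if v in mapping or labels[v] != motif_labels[k]:
                continue
            if all(mapping[u] in adj[v]
                   for u in range(k) if frozenset((u, k)) in edges):
                if extend(mapping + [v]):
                    return True
        return False

    return extend([])


def count_motifs(motifs, labels, adj):
    """Count how many (labels, edges) motifs occur in the query graph."""
    return sum(has_motif(ml, me, labels, adj) for ml, me in motifs)


# Toy query structure with made-up coordinates.
query = [("HIS", (0, 0, 0)), ("ASP", (5, 0, 0)),
         ("SER", (0, 5, 0)), ("GLY", (30, 0, 0))]
labels, adj = contact_graph(query)

# Two hypothetical family motifs: a HIS-ASP-SER triangle and a HIS-GLY pair.
motifs = [
    (["HIS", "ASP", "SER"], [(0, 1), (1, 2), (0, 2)]),
    (["HIS", "GLY"], [(0, 1)]),
]
print(count_motifs(motifs, labels, adj))
```

In the method described by the abstract, a count like this would then be compared with the distribution of motif counts over a large non-redundant protein database to judge how confident the family assignment is; that background comparison is not modeled here.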