Investigations of optimal residue contact definitions for template-based protein structure prediction

Chao Yuan, Purdue University

Abstract

Optimal residue contact definitions: Effective encoding of residue contact information is crucial for protein structure prediction because, unlike other commonly used scoring terms, it captures long-range residue interactions. Residue contact information can be incorporated into structure prediction in several ways, including as statistical residue contact potentials, or as predicted residue contacts used as restraints or as an additional term in the scoring function. To find the most effective definition of residue contacts for template-based protein structure prediction, we evaluated thirty contact definitions, varying the basis on which contacts are defined and the distance cutoff, in terms of their ability to identify proteins of the same fold. We found that, overall, the residue contact pattern distinguishes protein folds best when contacts are defined for residue pairs whose Cβ atoms are within 6.5 Å of each other. Fold recognition accuracy was lower when inaccurate threading alignments were used to identify common residue contacts between protein pairs. In threading, alignment accuracy strongly influences the fraction of common contacts identified among proteins of the same fold, which in turn affects fold recognition accuracy. The largest deterioration in fold recognition was observed for β-class proteins when threading methods were used, because the average alignment accuracy was worst for this class. When fold recognition results were examined for individual proteins, we found that the most effective contact definition depends on the fold of the protein. A larger distance cutoff is often advantageous for capturing the spatial arrangement of secondary structures that are not physically in contact. For capturing contacts between neighboring β strands, the distance between Cα atoms is better than the Cβ-based distance because the side chains of interacting residues on β strands sometimes point in opposite directions.

Participation in CASP9: We participated in the 9th Critical Assessment of Techniques for Protein Structure Prediction experiment (CASP9) using the template-based protein structure prediction method SUPRB. Structure models were evaluated against experimentally solved structures in terms of RMSD values and GDT-TS scores. The results suggested reasonable performance on some targets but room for improvement in the SUPRB method.
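The distance-cutoff contact definition and the notion of common contacts under an alignment, both described in the first part of the abstract, can be illustrated with a minimal sketch. This is not the thesis implementation. The function names, the `min_separation` filter on sequence neighbors, and the choice to normalize by the number of contacts in the first protein are all assumptions made for illustration. Only the 6.5 Å default cutoff comes from the abstract. Coordinates are taken to be one representative atom per residue (for example Cβ or Cα).

```python
import math
from typing import Dict, Sequence, Set, Tuple

Coord = Tuple[float, float, float]
Contact = Tuple[int, int]


def contact_set(
    coords: Sequence[Coord],
    cutoff: float = 6.5,       # angstroms; the Cb cutoff reported as best overall
    min_separation: int = 3,   # assumption: skip near neighbors along the chain
) -> Set[Contact]:
    """Return residue index pairs (i, j), i < j, within `cutoff` of each other."""
    contacts: Set[Contact] = set()
    n = len(coords)
    for i in range(n):
        for j in range(i + min_separation, n):
            if math.dist(coords[i], coords[j]) <= cutoff:
                contacts.add((i, j))
    return contacts


def common_contact_fraction(
    contacts_a: Set[Contact],
    contacts_b: Set[Contact],
    alignment: Dict[int, int],
) -> float:
    """Fraction of contacts in A that map onto a contact in B.

    `alignment` maps residue indices of A to residue indices of B.
    Normalizing by len(contacts_a) is an illustrative choice only.
    """
    if not contacts_a:
        return 0.0
    shared = 0
    for i, j in contacts_a:
        if i in alignment and j in alignment:
            a, b = alignment[i], alignment[j]
            if (min(a, b), max(a, b)) in contacts_b:
                shared += 1
    return shared / len(contacts_a)
```

Running `contact_set` on the same structure with Cα coordinates instead of Cβ coordinates, or with a larger `cutoff`, gives the kind of alternative definitions the abstract compares. A less accurate `alignment` lowers the value returned by `common_contact_fraction`, which is the effect the abstract reports for threading.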

Degree

M.S.

Advisors

Kihara, Purdue University.

Subject Area

Bioinformatics|Biophysics
