Quasi-three body coarse grain models of protein structure
Knowledge-based methods for analyzing protein structures, such as statistical potentials, primarily consider the distances between pairs of bodies (atoms or groups of atoms). Terms that consider several bodies simultaneously are generally used to characterize bonded structural elements or elements in close contact with each other; historically, they have not considered atoms that are not in direct contact. In this report, an information-theoretic method is introduced for detecting and quantifying distance-dependent, through-space multibody relationships between the side chains of three residues. The technique produces convergent and consistent results when applied to a sufficiently large database of randomly chosen, experimentally solved protein structures. The results reproduce the established physico-chemical properties of residues as well as more recently discovered properties and interactions. A second study was performed to verify the results of the information-theoretic study. A new measure was developed with the intent of detecting joint dependency between two variables while excluding contributions from hidden variables. When used on a high-diversity database of protein structure files, the new measure produced results in strong agreement with the information-theoretic study. Five residues were found to be highest in quasi-three body signal in both studies, resulting in a relatively small group of triplets deemed to be important. A handful of these triplets were selected for an in-depth clustering study. The clustering study revealed spatial arrangements of residues that were grouped roughly into a number of structural motifs. These spatial arrangements were explainable in terms of the physico-chemical properties of each residue belonging to the triplet.
The results of this research provide insight into recent work regarding the physical chemistry of amino acids and their role in the structure, function and evolution of proteins. The techniques and insights presented in this work should be useful in the future development of knowledge-based tools for the evaluation of protein structure.
Lill, Purdue University.