Abstract

Generation of active protein chimeras is a valuable tool to probe the functional space of proteins. Statistical modeling is the next logical step, allowing us to build a model of gene fragment replaceability between species. In this thesis I begin to develop the statistical tools that are needed to systematically describe combinatorial protein libraries. I present three sets of diverse chimeric protein libraries developed using sequence information. The statistical model of the human N-Ras and human K-Ras-4B genes reveals a set of previously unidentified surface residues on the N-Ras G-domain that may be involved in cellular localization. Statistical modeling of a library of chimeric proteins between A. thaliana cinnamate 4-hydroxylase (AtC4H) and S. moellendorffii cinnamate 4-hydroxylase (SmC4H) reveals a possible stabilizing effect of the N-terminal amino acids from SmC4H, and catalytic domains that cannot be replaced between AtC4H and SmC4H. I also show gene fragment replaceability on a small scale between the functionally divergent AtC4H and A. thaliana ferulate 5-hydroxylase proteins. Finally, I show that commonly occurring residue pairs in the sequence record are effective covariates when modeling activity in the AtC4H-SmC4H chimeric library.

Degree Type

Dissertation

Degree Name

Doctor of Philosophy (PhD)

Department

Biological Science

Committee Chair

Daisuke Kihara

Date of Award

Fall 2013

First Advisor

Alan M. Friedman

Committee Member 1

Alan M. Friedman

Committee Member 2

Cynthia Stauffacher

Committee Member 3

Clinton Chapple
