RNA-protein interactions: Analysis of binding interfaces and prediction of protein binding sites in RNA

Aditi Gupta, Purdue University

Abstract

RNA-protein interactions are vital to many biological processes such as translation and splicing. Analysis of the binding interfaces in RNA-protein complexes obtained from the Protein Data Bank reveals molecular properties of RNA and protein that are statistically favored in binding regions as opposed to non-binding regions. For example, although the nucleotide guanine is preferred when RNA bases form hydrogen bonds with the protein, it is disfavored when the RNA backbone interacts with the protein. Protein binding is favored in RNA loop regions over regions that form Watson-Crick base pairs. For proteins, positively charged amino acids such as arginine are frequently observed interacting with the negatively charged RNA backbone. Aromatic protein residues are also seen stacking with the nucleotides. Such insights into the recognition principles governing RNA-protein interactions can be translated into computational prediction of binding sites in participating RNAs and proteins, thus aiding in their functional annotation. Because the statistical analysis revealed that RNA has distinctive sequence and structure at protein binding and non-binding sites, computational prediction of protein binding sites in RNA is possible. We developed an information-theoretic model that predicts protein binding sites in RNA with 60% accuracy. By using a conditional random field model, we identified the sequence and structural characteristics that are indicative of protein binding in RNA. We find that RNA structure is much more informative than RNA sequence in distinguishing protein binding from non-binding sites. Since experimentally determined structural information is not available for several RNAs, we developed a heuristic approach to identify a comprehensive set of base-paired regions in RNA from suboptimal structure predictions.
Development of tools to predict RNA-protein interaction partners is a future research direction that will allow computational construction of the RNA-protein interaction network for a biological process or a system.
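
The abstract does not state which statistic was used to decide that a nucleotide is favored or disfavored at binding sites. As a minimal sketch of one common way to express such a preference, the Python below computes a log-odds propensity for each nucleotide at binding versus non-binding positions. The function name `binding_propensity`, the pseudocount, and the toy input are hypothetical illustrations and are not taken from the dissertation.

```python
import math
from collections import Counter


def binding_propensity(sequence, binding_mask, pseudocount=1.0):
    """Log2 odds of each nucleotide at binding vs non-binding positions.

    Hypothetical helper for illustration only; not the dissertation's method.
    sequence:     RNA string over A, C, G, U
    binding_mask: one truthy/falsy flag per position (True = contacts protein)
    """
    if len(sequence) != len(binding_mask):
        raise ValueError("sequence and binding_mask must have equal length")

    alphabet = "ACGU"
    bound, unbound = Counter(), Counter()
    for nt, is_bound in zip(sequence, binding_mask):
        (bound if is_bound else unbound)[nt] += 1

    # Pseudocounts keep the ratio finite when a nucleotide is never observed.
    n_bound = sum(bound[nt] for nt in alphabet) + pseudocount * len(alphabet)
    n_unbound = sum(unbound[nt] for nt in alphabet) + pseudocount * len(alphabet)

    scores = {}
    for nt in alphabet:
        p_bound = (bound[nt] + pseudocount) / n_bound
        p_unbound = (unbound[nt] + pseudocount) / n_unbound
        scores[nt] = math.log2(p_bound / p_unbound)
    return scores


# Toy input (made up): G occurs only at binding positions, A only elsewhere.
scores = binding_propensity("GGGGAAAA", [True] * 4 + [False] * 4)
```

A positive score means the nucleotide is enriched at binding positions and a negative score means it is depleted. Real use would aggregate counts over many complexes and would need a significance test before calling a preference statistically favored.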

Degree

Ph.D.

Advisors

Kihara, Purdue University.

Subject Area

Molecular biology|Biochemistry|Bioinformatics

