A graph theoretic approach for identifying RNA structure and function relationships

Kejie Li, Purdue University

Abstract

Understanding the mapping between structure and function is crucial to the study of biopolymers. This mapping can be used to help predict molecular function from structural topological patterns. This study presents a graph-theoretic approach to understanding RNA structural topological features and to revealing how those features map to biological functions. I have built a package that represents ensembles of suboptimal RNA structures as a graph, the XIOS graph, so that structures can be compared and analyzed easily with an extended version of the gSpan algorithm. To detect structural similarities, the Neighbor Indexing algorithm has been extended with additional RNA structure-specific information and with the concept of an RNA structural fingerprint, a structural descriptor that represents the topological information of an ensemble of RNA structures. Building on the cIndex feature selection strategy, I have developed and applied a new feature selection approach for RNA structures. It reveals important structural topological patterns that carry specific information about the functional class of an RNA, and this information can be used to relate RNA structural patterns to function. In addition, I have developed a novel structure indexing and database searching method for finding RNAs with similar characteristics (topological modules).

Remarkably, RNA structures can be classified into the correct classes even without using RNA primary sequence information. By combining information from both sequence and topology, unclassified or misclassified RNAs can be correctly classified and categorized with high confidence. The structure-based classification described here is significantly better than sequence-based classification using BLAST (Kolmogorov-Smirnov test).
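To make the idea of representing RNA structure as a graph concrete, the following is a minimal sketch and not the dissertation's package. It assumes that each stem is a vertex and that each pair of stems is joined by an edge labeled with their topological relationship: serial (s), included (i), overlapping (o), or exclusive (x). This reading of the XIOS name, the `Stem` coordinates, and the helper names `relation` and `build_graph` are all assumptions made for illustration; the abstract itself does not define them.

```python
from itertools import combinations
from typing import NamedTuple


class Stem(NamedTuple):
    """A stem as two paired base ranges (hypothetical representation)."""
    left_start: int
    left_end: int
    right_start: int
    right_end: int

    def bases(self) -> set:
        """All base positions that take part in this stem."""
        left = range(self.left_start, self.left_end + 1)
        right = range(self.right_start, self.right_end + 1)
        return set(left) | set(right)


def relation(a: Stem, b: Stem) -> str:
    """Label the relationship between two stems (assumed labels: x, i, o, s)."""
    # Order the pair so that `a` is the stem that starts first.
    if b.left_start < a.left_start:
        a, b = b, a
    # Stems that share bases cannot occur in the same structure.
    if a.bases() & b.bases():
        return "x"
    # `b` begins after `a` has completely ended: side by side.
    if a.right_end < b.left_start:
        return "s"
    # Both halves of `b` sit inside the loop enclosed by `a`: nested.
    if b.right_end < a.right_start:
        return "i"
    # Otherwise `b` opens inside `a` and closes after it: pseudoknot-like.
    return "o"


def build_graph(stems: list) -> dict:
    """Return labeled edges {(i, j): label} for every pair of stems."""
    return {
        (i, j): relation(stems[i], stems[j])
        for i, j in combinations(range(len(stems)), 2)
    }


# Illustrative coordinates only; not taken from any real RNA.
stems = [
    Stem(1, 4, 20, 23),    # outer stem
    Stem(6, 8, 14, 16),    # nested inside the outer stem
    Stem(30, 33, 40, 43),  # downstream of everything else
    Stem(10, 12, 25, 27),  # crosses the outer stem
    Stem(3, 5, 18, 21),    # shares bases with the outer stem
]
graph = build_graph(stems)
for pair, label in sorted(graph.items()):
    print(pair, label)
```

In a sketch like this, the exclusive label is what would let stems from many alternative suboptimal structures share one graph, since it records which stems cannot coexist. A labeled graph of this kind is also the sort of input that a subgraph mining method such as gSpan operates on.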

Degree

Ph.D.

Advisors

Michael Gribskov, Purdue University.

Subject Area

Biology, Molecular|Chemistry, Biochemistry|Biology, Bioinformatics
