An inferential approach to protein backbone nuclear magnetic resonance assignment
Nuclear Magnetic Resonance (NMR) spectroscopy is a key data source for genome-wide studies of the three-dimensional structures of proteins. Like many experiments in molecular biology, NMR spectroscopy generates noisy and incomplete data, and the existing analysis tools are error-prone and lack scalability. These problems can be addressed by developing methods of statistical inference specifically designed for NMR data, combined with new algorithms for carrying out the inference in these noisy data sets.
This dissertation introduces a Bayesian approach to one step of NMR-based structure determination: backbone resonance assignment. The approach is based on a Gaussian graphical model in which informative priors are derived from existing NMR databases. A difficulty lies in exploring the combinatorially large and jagged posterior landscape of candidate graphs. We develop an algorithm that, instead of examining one candidate graph at a time, recursively partitions the graph space into smaller subspaces. The resulting tree structure is searched using a complete algorithm in which the importance of the branches is learned from previously visited nodes. We demonstrate the accuracy and scalability of the inferential procedure using a range of simulated and experimental data.
Major Professors: Chris Bailey-Kellogg, Purdue University; Bruce A. Craig, Purdue University.