One and three-dimensional analysis of solution scattering data: Improved radius of gyration estimation and gold cluster labeling for inter-residue distance approximation and residue localization

Hyun Chul Lee, Purdue University

Abstract

This thesis presents improvements in the analysis of solution scattering data to obtain one-dimensional (1-D, distance) and three-dimensional (3-D, spatial) information. The traditional Guinier approach for analyzing solution scattering data in 1-D relies on an approximately linear relationship between the logarithm of the scattering intensity, I(q), and the square of the scattering vector magnitude, q. The radius of gyration, Rg, is estimated from the slope of this line and the forward scattering, I(0), from its intercept. The linear relationship results from successive approximations applied to the underlying Debye formula relating I(q) to the pair distance distribution, P(r). Here, we developed an alternative approach based on polynomial fitting to an expansion of the Debye formula. We show that polynomial fitting performs well relative to the Guinier approach in most simulated situations and apply it successfully to experimental data. Furthermore, polynomial fitting directly estimates higher moments of the P(r) distribution, which allows rapid evaluation of protein elongation and of the shape of P(r) without the complications of indirect Fourier transformation.

The spherical averaging inherent in solution scattering limits the 1-D and 3-D structural information that can be obtained. To overcome this restriction, we have simulated and experimentally tested labeling specific amino acids with gold cluster reagents. We first present the expected scattering properties of gold cluster labeled proteins and show that they match experimental data. Inter-cluster distances (1-D information) are estimated by three methods: difference curve analysis, grid search under a spherical scattering approximation, and rigid body modeling. Of the three methods, difference curve analysis estimates the distance between gold clusters (ABdist) most accurately (0–0.3 Å error). Grid search under a spherical approximation gives a larger error for ABdist (0.3–2.1 Å error). However, this method requires fewer data sets and also estimates the distances between the protein and each gold cluster (PAdist and PBdist) (0.1–7.7 Å error). Although computationally intensive, rigid body modeling (RBM) performs well for estimating all three inter-body distances (0–0.5 Å error for ABdist, 0–0.6 Å error for PAdist or PBdist). Beyond 1-D analysis, we developed RBM tools to predict gold cluster positions on a protein. Simulated scattering curves of templates are compared to those of trial structures, which are generated by exhaustive rigid body searches for a fixed protein with mobile gold cluster(s). Gold center positions were predicted from distance-corrected angular grid RBM, in which the inter-body distances of each trial structure were corrected by least squares minimization. In the analysis, double-labeled data were combined with single-labeled data using the best performing scoring function, which improves the prediction accuracy of gold center positions (0.7–1.5 Å error for angular grid RBM with 3 Å average grid spacing and no rotation). In addition, RBM with a bead model of the protein was applied to experimental data from a protein of unknown structure.
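The contrast between the Guinier line and a polynomial fit can be illustrated with a minimal sketch. It is not the method developed in the thesis, whose exact polynomial form is not given in this abstract. It assumes the standard small-q expansion of the Debye formula, I(q) ≈ I(0)[1 − q²Rg²/3 + ...], and uses a noiseless homogeneous sphere of radius 20 Å as hypothetical test data. The function names, the polynomial degree, and the fitting range q·Rg < 1.3 are all assumptions made for illustration.

```python
import numpy as np


def sphere_intensity(q, radius):
    """Normalized intensity of a homogeneous sphere, with I(0) = 1."""
    x = q * radius
    return (3.0 * (np.sin(x) - x * np.cos(x)) / x**3) ** 2


def guinier_fit(q, intensity):
    """Straight-line fit of ln I(q) against q^2. Returns (Rg, I0)."""
    slope, intercept = np.polyfit(q**2, np.log(intensity), 1)
    return np.sqrt(-3.0 * slope), np.exp(intercept)


def polynomial_fit(q, intensity, degree=3):
    """Fit I(q) as a polynomial in q^2 (hypothetical helper). Returns (Rg, I0).

    With I(q) = c0 + c1*q^2 + ..., the expansion gives c0 = I(0) and
    c1 = -I(0) * Rg^2 / 3, so Rg^2 = -3 * c1 / c0.
    """
    u = q**2
    scale = u.max()  # rescale q^2 to [0, 1] to keep the fit well conditioned
    coeffs = np.polynomial.polynomial.polyfit(u / scale, intensity, degree)
    c0 = coeffs[0]
    c1 = coeffs[1] / scale  # undo the rescaling for the q^2 coefficient
    return np.sqrt(-3.0 * c1 / c0), c0


# Hypothetical test case: sphere of radius 20 A, for which Rg = sqrt(3/5) * R.
radius = 20.0
rg_true = np.sqrt(3.0 / 5.0) * radius

# The range is chosen from the known Rg purely for illustration.
q = np.linspace(0.005, 1.3 / rg_true, 60)
intensity = sphere_intensity(q, radius)

rg_guinier, i0_guinier = guinier_fit(q, intensity)
rg_poly, i0_poly = polynomial_fit(q, intensity)

print(f"true Rg:        {rg_true:.3f}")
print(f"Guinier Rg:     {rg_guinier:.3f}")
print(f"polynomial Rg:  {rg_poly:.3f}")
```

In this sketch the higher-order coefficients of the polynomial carry the higher even moments of P(r), which is the kind of information the abstract describes using to assess elongation. Real data would also need noise weighting and a data-driven choice of fitting range, neither of which is attempted here.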

Degree

Ph.D.

Advisors

Friedman, Purdue University.

Subject Area

Biology|Biophysics
