Bayesian Nonparametrics for Biophysics

Meysam Tavakoli, Purdue University

Abstract

The main goal of data analysis is to summarize a large amount of data (our observations) with a few numbers that give us some intuition into the process that generated the data. Regardless of the method used, the analysis involves (1) formulating the problem mathematically, (2) collecting data, (3) building a probability model for the data, (4) estimating the parameters of the model, and (5) summarizing the results appropriately, a process called "statistical inference". Recently, the Bayesian approach, and more specifically Bayesian nonparametrics (BNPs), has been shown to have a deep influence on data analysis [1], and in this field BNPs have only just begun to be exploited [2–4]. However, to the best of our knowledge, no single resource is yet available that explains both the concepts and the implementation of BNPs, as would be needed to bring their capacity to bear on data analysis and accelerate their inevitable widespread acceptance. Therefore, in this dissertation, we describe the concepts and implementation of an important computational tool that exploits BNPs in this area, specifically its application in biophysics. The goal is to use BNPs to understand the rules of life (in vivo) at the scale at which life occurs (single molecule) from the fastest acquirable data (single photons). In chapter 1, we give a brief introduction to data analysis in biophysics. This overview is aimed at anyone, from student to established researcher, who wants to understand what can be accomplished with statistical approaches to modeling and where the field of data analysis in biophysics is headed. For someone just getting started, we place special emphasis on the logic, strengths, and shortcomings of data analysis frameworks, with a focus on very recent approaches.
In chapter 2, we provide an overview of data analysis in single-molecule biophysics. We discuss data analysis tools, the model selection problem, and, mainly, the Bayesian approach. We also discuss BNPs and the distinctive characteristics that make them ideal mathematical tools for modeling complex biomolecules: they offer a meaningful and clear physical interpretation and allow full posterior probabilities over molecular-level models to be deduced with minimal subjective choices. In chapter 3, we work on spectroscopic approaches and fluorescence time traces. These traces are used to report on the dynamical features of biomolecules. The fundamental unit of information in these time traces is the single photon. Individual photons carry information from the biomolecule that emitted them to the detector on timescales as fast as microseconds. Therefore, from the viewpoint of confocal microscopy, it is theoretically feasible to monitor biomolecular dynamics at such timescales. In practice, however, signals are stochastic, and to derive dynamical information through traditional means such as fluorescence correlation spectroscopy (FCS) and related methods, fluorescence time trace signals are collected and temporally autocorrelated over many minutes. So far, it has been infeasible to analyze dynamical attributes of biomolecules on timescales near that of data acquisition, as this requires estimating the number of biomolecules emitting photons and their locations within the confocal volume. The mathematical structure of this problem requires us to abandon the normal ("parametric") Bayesian paradigm.

Degree

Ph.D.

Advisors

Pressé, Purdue University.

Subject Area

Applied Mathematics|Biophysics|Mathematics|Optics
