Statistical protein quantification and prioritization in label-free shotgun LC-MS/MS proteomics
Abstract
Mass spectrometry-based proteomics is the current method of choice for identifying and quantifying the proteome of an organism. Because it requires no prior knowledge, the label-free "shotgun" workflow of liquid chromatography coupled with tandem mass spectrometry (LC-MS/MS) is widely used for discovery investigations, a typical goal of which is to identify candidate biomarkers of disease. One challenge in addressing this goal is that mass spectrometry-based experiments generate measurements not on intact proteins but on peptides, which are fragments of proteins. Additionally, the workflow is biased toward the more abundant proteins in a biological sample, leaving potentially interesting lower-abundance proteins undetected. This incomplete representation of the proteome can prevent the discovery of potentially important candidates, severely limiting the usefulness of LC-MS/MS for biomarker discovery. This dissertation introduces probabilistic models for addressing these challenges. Through multiple case studies of proteomic investigations, we show that the models are more sensitive and specific for detecting changes in protein abundance than existing methods, and that they can help interpret these changes at a network level, which, with the aid of efficient computational algorithms, helps to prioritize putative disease-associated genes.
Degree
Ph.D.
Advisors
Vitek, Purdue University.
Subject Area
Statistics|Bioinformatics