PXAlign: A parallel implementation of the XAlign application

Aditi Magikar, Purdue University

Abstract

Proteomics involves the assessment of large numbers of protein molecules, and mass spectrometry is one of the tools used for that assessment. The Proteome Discovery Pipeline (PDP) at Purdue carries out data processing and protein discovery using mass spectrometry-based proteomics. The pipeline is divided into stages, each performing a different computational task, and each stage is currently executed serially. The XAlign stage processes the data and aligns protein peaks across different samples. Because it handles vast amounts of data and runs serially, it is a potential data processing bottleneck: the processors cannot process the data fast enough. This thesis introduces parallelism into the XAlign application code in order to investigate whether it reduces the time needed to process the data. The XAlign code is parallelized using two commonly used parallel programming techniques, MPI and OpenMP. Parallelizing XAlign could reduce the data bottleneck and speed up both the XAlign stage and the overall PDP. This is significant because samples would move through the pipeline faster, so more samples could be processed in a given time frame.

Degree

M.S.

Advisors

Springer, Purdue University.

Subject Area

Bioinformatics|Computer science
