Statistical methods for integrating epigenomic results

Suk-Young Yoo, Purdue University

Abstract

Epigenetics is the study of heritable alterations in gene function that occur without changes to the DNA sequence. Epigenetic modifications such as DNA methylation and histone modifications are known to be highly correlated with the regulation of gene expression. Tiling array technology has become a popular tool for studying genomic and epigenomic phenomena. This work studies the relationship between gene expression and epigenetic modifications by applying two distinct statistical approaches. First, a well-known meta-analytic method, vote counting, is employed: results from individual analyses of differential expression, methylation, and ChIP-chip tiling arrays are combined to identify tiles that are significantly associated via their differential expression, methylation, and chromatin modifications (H3mK4 or H3mK9). Second, a novel two-stage statistical approach is proposed that employs a hidden Markov model and a linear model to assess gene expression as related to DNA methylation. In the first stage, the hidden Markov model estimates the methylation status of each tile by using information from neighboring tiles. In the second stage, the linear model identifies tiles that are significantly differentially expressed given the estimated methylation status. Simulation studies and real data analysis provide an assessment of the statistical power of the proposed two-stage analysis.
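As a rough illustration of the vote-counting idea described above, the following Python sketch counts, for each tile, how many of the individual analyses call it significant, and then keeps the tiles that receive enough votes. It is not the dissertation's implementation. The significance level of 0.05, the minimum number of votes, the function names `vote_count` and `select_tiles`, and the p-values are all assumptions made up for illustration.

```python
from typing import Dict, List


def vote_count(p_values: Dict[str, List[float]], alpha: float = 0.05) -> List[int]:
    """Return, for each tile, the number of analyses with p-value below alpha.

    p_values maps an analysis name to a list of per-tile p-values;
    every analysis is assumed to cover the same tiles in the same order.
    """
    n_tiles = len(next(iter(p_values.values())))
    if any(len(p) != n_tiles for p in p_values.values()):
        raise ValueError("all analyses must cover the same tiles")
    # One vote per analysis in which the tile is significant.
    return [sum(p[i] < alpha for p in p_values.values()) for i in range(n_tiles)]


def select_tiles(votes: List[int], min_votes: int) -> List[int]:
    """Return the indices of tiles that received at least min_votes votes."""
    return [i for i, v in enumerate(votes) if v >= min_votes]


# Hypothetical p-values for five tiles (illustrative only, not real data).
toy_p_values = {
    "expression":  [0.001, 0.20, 0.03, 0.60, 0.04],
    "methylation": [0.010, 0.04, 0.30, 0.70, 0.02],
    "chip_chip":   [0.002, 0.50, 0.01, 0.90, 0.08],
}

votes = vote_count(toy_p_values, alpha=0.05)
selected = select_tiles(votes, min_votes=3)  # require agreement of all three analyses
print(votes, selected)
```

Requiring a vote from every analysis is the strictest choice; lowering `min_votes` trades specificity for sensitivity.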

Degree

Ph.D.

Advisors

Doerge, Purdue University.

Subject Area

Statistics|Bioinformatics
