Date of Award

12-2017

Degree Type

Dissertation

Degree Name

Doctor of Philosophy (PhD)

Department

Statistics

Committee Chair

Rebecca W. Doerge

Committee Co-Chair

Hyonho Chun

Committee Member 1

Heejung Shim

Committee Member 2

Lingsong Zhang

Committee Member 3

Vikki Weake

Abstract

Advanced imaging and scanning technologies are providing an abundance of shape-based data that require novel approaches to understand the factors that generate them. This is especially true in modern agricultural practices, where large quantities of highly detailed morphometric data are collected and used for studying genetic associations. Toward this end, there is an immediate need for efficient algorithms that extract relevant geometric information from images in a form that can be used within a statistical framework. Topological Data Analysis (TDA) has been shown to be both fast and effective for studying complex shapes, and can reduce dimensionality into a simpler topological summary (referred to as a persistence diagram) while capturing essential geometric information about the overall image shape. In agricultural settings and elsewhere, TDA has the capacity to quantify plant morphological images across multiple organs, but its output is not in a form that lends itself to further statistical analysis. Specifically, there is a significant gap in our understanding of methods that adapt TDA summaries so that they can be analyzed using the existing body of statistical theory and methodology. Although kernel methods applied to persistence diagrams can make them amenable to existing statistical methods, very few of these methods retain all of the topological information contained in a persistence diagram. To overcome this, WaveTDA is introduced as a Bayesian approach based on wavelets that lends itself to both regression analysis and hypothesis testing. Further, WaveTDA has the capacity to identify regions of the kernel-transformed persistence diagrams (under the assumption of independence) that are significantly associated with (genetic) covariates.
While this work is motivated and presented in the context of genetic studies, it is general enough to be used in a variety of applications.
