Homological Representatives in Topological Persistence

Tao Hou, Purdue University

Abstract

Harnessing the power of data has been a driving force for computing in recently years. However, the non-vectorized or even non-Euclidean nature of certain data with complex structures also poses new challenges to the data science community. Topological data analysis (TDA) has proven effective in several scenarios for alleviating the challenges, by providing techniques that can reveal hidden structures and high-order connectivity for data. A central technique in TDA is called persistent homology, which provides intervals tracking the birth and death of topological features in a growing sequence of topological spaces. In this dissertation, we study the representativeproblem for persistent homology, motivated by the observation that persistent homology does not pinpoint a specific homology class or cycle born and dying with the persistence intervals. Furthermore, studying the representatives also leads us to new findings for related problems such as persistence computation. First, we look into the representative problem for (standard) persistence homology and term the representatives as persistent cycles. We define persistent cycles as cycles born and dying with given persistence intervals and connect the definition to interval decomposition for persistence modules. We also look into the computation of optimal (minimum) persistent cycles which have guaranteed quality. We prove that it is NP-hard to compute minimum persistent p-cycles for the two types of intervals in persistent homology in general dimensions (p > 1). In view of the NP-hardness results, we then identify a special but important class of inputs called weak (p + 1)-pseudomanifolds whose minimum persistent p-cycles can be computed in polynomial time. The algorithms are based on a reduction to minimum (s, t)- cuts on dual graphs. Second, we propose alternative persistent cycles capturing the dynamic changes of homological features born and dying with persistence intervals, which the previous persistent cycles do not reveal. We focus on persistent homology generated by piecewise linear (PL) functions and base our definition on an extension of persistence called the levelset zigzag persistence. We define a sequence of cycles called levelset persistent cycles containing a cycle between each consecutive critical points within the persistence interval. Due to the NP-harness results proven previously, we propose polynomial-time algorithms computing optimal sequences of levelset persistent p-cycles for weak (p+ 1)-pseudomanifolds. Our algorithms draw upon the idea of relating optimal cycles to min-cuts in a graph that we exploited earlier for standard persistent cycles. Note that levelset zigzag poses non-trivial challenges for the approach because a sequence of optimal cycles instead of a single one needs to be computed in this case. Third, we investigate the computation of zigzag persistence on graph inputs, motivated by the fact that graphs model real-world circumstances in many applications where they may constantly change to capture dynamic behaviors of phenomena. Zigzag persistence, an extension of the standard persistence incorporating both insertions and deletions of simplices, is one appropriate instrument for analyzing such changing graph data. However, unlike standard persistence which admits nearly linear-time algorithms for graphs, such results for the zigzag version improving the general O(mω) time complexity are not known, where ω < 2.37286 is the matrix multiplication exponent.

Degree

Ph.D.

Advisors

Dey, Purdue University.

Subject Area

Computer science

Off-Campus Purdue Users:
To access this dissertation, please log in to our
proxy server
.

Share

COinS