Structural information in strings and graphs

Yong Wook Choi, Purdue University

Abstract

Structures exist everywhere, from tiny molecules to the universe, from physical buildings to abstract social networks. In this study, we aim to understand how much information is embodied in structures. We focus on two fundamental abstractions of objects, strings and graphs, by studying two problems: constrained pattern matching and compression of graphical structures. In strings, structure takes the form of constraints or patterns. We therefore consider the constrained pattern matching problem, in which we ask how many times a given pattern occurs in the so-called (d, k) constrained binary sequences generated by a memoryless binary source. We present simple and precise asymptotics for the mean and variance of the number of pattern occurrences. We also derive asymptotic formulas for the probability that there are r occurrences of a given pattern in a (d, k) sequence, for different ranges of r. In graphs, we find structures by considering unlabeled graphs. We propose a compression algorithm and prove that it is asymptotically optimal in the Erdős-Rényi random graph model. We compute a lower bound for the lossless compression of graphical structures and show that, with high probability, our algorithm achieves this lower bound up to the first two leading terms. We use combinatorial and analytic techniques such as combinatorial calculus, generating functions, and complex asymptotics.
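To make the constrained pattern matching problem concrete, the following minimal Python sketch enumerates short (d, k) sequences by brute force and tallies how often a pattern occurs in each. It is an illustration only, not the analytic method of the dissertation. It assumes the usual run-length-limited definition (every run of 0s between two consecutive 1s has length at least d and at most k), assumes that leading and trailing runs of 0s need only be at most k, and counts overlapping occurrences. The function names are hypothetical.

```python
from itertools import product


def is_dk_sequence(bits, d, k):
    """Return True if every run of 0s between consecutive 1s has length in [d, k].

    Assumption: leading and trailing runs of 0s only need to be at most k.
    """
    ones = [i for i, b in enumerate(bits) if b == 1]
    if not ones:
        return len(bits) <= k
    if ones[0] > k or len(bits) - 1 - ones[-1] > k:
        return False
    return all(d <= j - i - 1 <= k for i, j in zip(ones, ones[1:]))


def count_occurrences(bits, pattern):
    """Count occurrences of pattern in bits, allowing overlaps."""
    m = len(pattern)
    return sum(
        1
        for i in range(len(bits) - m + 1)
        if tuple(bits[i:i + m]) == tuple(pattern)
    )


def occurrence_histogram(n, d, k, pattern):
    """Map r to the number of length-n (d, k) sequences with exactly r occurrences."""
    hist = {}
    for bits in product((0, 1), repeat=n):
        if is_dk_sequence(bits, d, k):
            r = count_occurrences(bits, pattern)
            hist[r] = hist.get(r, 0) + 1
    return hist
```

For example, `occurrence_histogram(3, 1, 2, (1, 0))` enumerates the length-3 sequences satisfying the (1, 2) constraint and groups them by the number of occurrences of the pattern 10. Brute force is feasible only for small n, which is why asymptotic formulas are needed for long sequences.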

Degree

Ph.D.

Advisors

Szpankowski, Purdue University.

Subject Area

Computer science

