Combinatorial Methods for Counting Pattern Occurrences in a Markovian Text

Yucong Zhang, Purdue University

Abstract

In this dissertation, we provide combinatorial methods for obtaining the probabilistic multivariate generating function that counts the occurrences of patterns in a text generated by a Markovian source. The generating function can then be expanded as a Taylor series in which the power of a term gives the size of the text and the coefficient gives the probabilities of all possible pattern occurrence counts for that text size. The analysis is based on the inclusion-exclusion principle applied to pattern counting (Goulden and Jackson, 1979, 1983) and on its application by Bassino et al. (2012), who obtained the generating function for a Bernoulli text source. We follow the notation and concepts introduced by Bassino et al. in the discussion of distinguished patterns and non-reduced pattern sets, with modifications to account for the Markovian dependence. Our result takes the form of a linear matrix equation in which the number of linear equations depends on the size of the alphabet. In addition, we compute the moments of the pattern occurrence counts and discuss how a Markovian text affects these moments compared with the Bernoulli case. The methodology involves the inclusion-exclusion principle, stochastic recurrences, and combinatorics on words, including probabilistic multivariate generating functions and moment generating functions.
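
To make the role of the coefficients concrete, the following minimal sketch computes, by brute-force enumeration, the probability that a single pattern occurs exactly k times in a text of size n drawn from a first-order Markov chain. These probabilities are the quantities that the coefficient of the n-th power in the generating function encodes. This is not the inclusion-exclusion method of the dissertation, only a small-case check. The function names, the two-letter alphabet, and the transition probabilities are assumptions made for illustration, and the sketch handles one pattern rather than a set of patterns.

```python
from itertools import product
from collections import defaultdict


def count_occurrences(text, pattern):
    """Count occurrences of pattern in text, allowing overlaps."""
    m = len(pattern)
    return sum(1 for i in range(len(text) - m + 1) if text[i:i + m] == pattern)


def occurrence_distribution(n, pattern, alphabet, initial, transition):
    """Return {k: P(pattern occurs exactly k times in a text of size n)}.

    initial[a] is the probability that the first symbol is a;
    transition[a][b] is P(next symbol = b | current symbol = a).
    Requires n >= 1. Enumerates all len(alphabet)**n texts, so it is
    only practical for small n.
    """
    dist = defaultdict(float)
    for letters in product(alphabet, repeat=n):
        text = "".join(letters)
        # Probability of this text under the Markov chain.
        prob = initial[text[0]]
        for a, b in zip(text, text[1:]):
            prob *= transition[a][b]
        dist[count_occurrences(text, pattern)] += prob
    return dict(dist)


# Hypothetical two-letter Markov source (values chosen for illustration).
alphabet = "ab"
initial = {"a": 0.5, "b": 0.5}
transition = {"a": {"a": 0.7, "b": 0.3}, "b": {"a": 0.4, "b": 0.6}}

dist = occurrence_distribution(4, "aa", alphabet, initial, transition)
mean = sum(k * p for k, p in dist.items())  # first moment of the count
```

Here `dist` maps each occurrence count k to its probability for texts of size 4, and `mean` is the first moment of the count. Replacing `transition` with rows that do not depend on the current symbol gives the Bernoulli case for comparison.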

Degree

Ph.D.

Advisors

Ward, Purdue University.

Subject Area

Computer science; Operations research
