Profile of tries

Gahyun Park, Purdue University

Abstract

Tries and suffix trees are the most popular data structures on words. Tries were introduced in 1960 by Fredkin as an efficient method for searching and sorting digital data. Since then, a myriad of novel trie applications has been found, such as dynamic hashing, conflict resolution algorithms, leader election algorithms, IP address lookup, coding, polynomial factorization, and Lempel-Ziv compression schemes. Furthermore, various analyses of tries reveal new fundamental properties of strings appearing in those applications. Parameters of interest are the (partial) fillup level (the largest full level of the trie), shortest path, height (longest path), typical depth, and path length (sum of depths). All of these parameters are analyzed here in a unifying manner by studying the external and internal profiles. A profile of a tree at level k is the number of nodes (internal or external) at level k. We derive recurrences for both profiles and solve them asymptotically for various ranges of k when the strings stored in the trie are generated by a memoryless source (extension to a Markov source is possible). In particular, we present asymptotic results for the average profile, the variance, and the limiting distribution. As consequences, we find typical behaviors of the height, shortest path, fillup level, and the depth. These results are derived here by methods of analytic algorithmics such as generating functions, Mellin transform, poissonization and depoissonization, and the saddle point method.
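The definitions in the abstract can be made concrete with a small simulation. The following is a minimal sketch, not code from the dissertation: it assumes a binary alphabet, finite strings long enough to be distinct, and the common convention that the fillup level is the largest k whose internal profile equals 2^k. The names `trie_profiles`, `trie_parameters`, and `random_strings` are hypothetical, chosen for illustration only.

```python
import random
from collections import Counter


def trie_profiles(strings, level=0, external=None, internal=None):
    """Return (external, internal) profiles of the binary trie on `strings`.

    external[k] counts leaves (stored strings) at level k;
    internal[k] counts branching nodes at level k.
    Assumes the strings are distinct and use only '0' and '1'.
    """
    if external is None:
        external, internal = Counter(), Counter()
    if len(strings) == 0:
        return external, internal          # empty subtree: no node at all
    if len(strings) == 1:
        external[level] += 1               # a single string becomes a leaf
        return external, internal
    if any(len(s) <= level for s in strings):
        raise ValueError("strings are too short to be separated")
    internal[level] += 1                   # two or more strings: branch here
    zeros = [s for s in strings if s[level] == "0"]
    ones = [s for s in strings if s[level] == "1"]
    trie_profiles(zeros, level + 1, external, internal)
    trie_profiles(ones, level + 1, external, internal)
    return external, internal


def trie_parameters(external, internal):
    """Derive height, shortest path, path length and fillup level."""
    # Assumed convention: fillup level = largest k with internal[k] == 2**k
    # (it stays at -1 when even the root is not an internal node).
    fillup = -1
    while internal[fillup + 1] == 2 ** (fillup + 1):
        fillup += 1
    return {
        "height": max(external),
        "shortest_path": min(external),
        "path_length": sum(k * count for k, count in external.items()),
        "fillup_level": fillup,
    }


def random_strings(n, length, p_one, seed=0):
    """Draw up to n distinct strings from a memoryless binary source.

    Each symbol is '1' with probability p_one, independently of the others.
    Duplicates (unlikely for large `length`) are dropped.
    """
    rng = random.Random(seed)
    drawn = {
        "".join("1" if rng.random() < p_one else "0" for _ in range(length))
        for _ in range(n)
    }
    return sorted(drawn)


if __name__ == "__main__":
    ext, inn = trie_profiles(["000", "001", "100", "111"])
    print(sorted(ext.items()), sorted(inn.items()))
    print(trie_parameters(ext, inn))
```

For the four strings in the usage example, the root and both of its children are internal nodes, so the sketch reports an external profile of two leaves at level 2 and two at level 3. Replacing the input with `random_strings(n, 64, p)` gives one empirical sample of the profiles whose average, variance, and limiting distribution the dissertation studies asymptotically.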

Degree

Ph.D.

Advisors

Szpankowski, Purdue University.

Subject Area

Computer science
