Broadcasting and blocking large data sets with an index tree

Chuan-Ming Liu, Purdue University

Abstract

This thesis investigates problems arising on tree structures that model scenarios involving large data sets in wireless, distributed, and parallel environments. The problems include efficient client and server algorithms for broadcasting and query processing on a multi-dimensional indexed data set, improving the performance of external searching in static trees, and developing new tree-based clustering heuristics tailored towards structural properties. In a wireless mobile environment, broadcasting data together with an index structure lets a mobile client tune into the broadcast selectively to obtain the desired data, which reduces tuning time (i.e., the time spent listening to the broadcast). This thesis investigates query execution on broadcast multi-dimensional index structures. The solutions minimize tuning time and latency and efficiently handle query execution that starts at an arbitrary time. The proposed protocols differ in the broadcast schedule format, in how a client manages information with limited memory, and in the data structures employed by a client. Experimental work on real and synthetic data shows that these protocols lead to different latencies and tuning times. Solutions presented for a multiple-channel environment achieve optimal cycle length. This thesis also considers the use of data replication to improve external searching in static tree structures. Efficient mappings from the nodes of a tree to blocks are presented for the case in which nodes can be replicated. The amount of node replication is controlled, and block utilization and blocknumber are optimized. The results show that, by increasing the total space by at most a factor of 1.5, one can achieve a blocknumber proportional to the optimal blocknumber. The results also show that when each node can be replicated a constant number of times, no significant reduction in the blocknumber may be possible. Tree clustering partitions the nodes of a tree into disjoint sets, called clusters, subject to balancing the number of nodes and minimizing the number of subtrees in each cluster. We show that such a clustering problem is NP-complete.
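
The two quantities named in the tree clustering problem can be illustrated with a small sketch. It is not taken from the thesis. It assumes that the "number of subtrees" in a cluster means the number of connected pieces the cluster forms within the tree, and the function name `cluster_stats`, the 7-node example tree, and both example clusterings are hypothetical.

```python
from collections import defaultdict


def cluster_stats(parent, cluster_of):
    """Return {cluster id: (node count, subtree count)} for a clustering.

    parent:     dict mapping each node to its parent (the root maps to None).
    cluster_of: dict mapping each node to its cluster id.

    Assumption (not from the thesis): a "subtree" of a cluster is a maximal
    connected piece of the tree whose nodes all lie in that cluster.
    """
    sizes = defaultdict(int)
    subtrees = defaultdict(int)
    for node, par in parent.items():
        c = cluster_of[node]
        sizes[c] += 1
        # Each connected piece has exactly one topmost node: a node whose
        # parent is missing or belongs to a different cluster.
        if par is None or cluster_of[par] != c:
            subtrees[c] += 1
    return {c: (sizes[c], subtrees[c]) for c in sizes}


# Hypothetical complete binary tree on 7 nodes, rooted at node 0.
parent = {0: None, 1: 0, 2: 0, 3: 1, 4: 1, 5: 2, 6: 2}

# Clustering by level: the top three nodes in "a", the four leaves in "b".
by_level = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b", 6: "b"}

# Clustering by subtree: the root with its left subtree in "a", the right subtree in "b".
by_subtree = {0: "a", 1: "a", 3: "a", 4: "a", 2: "b", 5: "b", 6: "b"}

print(cluster_stats(parent, by_level))
print(cluster_stats(parent, by_subtree))
```

Both clusterings are equally balanced, with cluster sizes 3 and 4, but the by-level clustering breaks cluster "b" into four single-node subtrees, while the by-subtree clustering keeps every cluster in one piece. The sketch only evaluates a given clustering; it does not search for a good one.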

Degree

Ph.D.

Advisors

Hambrusch, Purdue University.

Subject Area

Computer science
