HIGHLY PARALLEL PROCESSING OF RELATIONAL DATABASES

CHING-CHIH HSIAO, Purdue University

Abstract

New computer architectures are feasible because of advances in VLSI design and fabrication technologies. Among them, highly parallel structures coordinate hundreds of thousands of processing elements that function cooperatively. These structures are especially useful for solving computationally intensive problems. This thesis applies the highly parallel approach to improve the efficiency of processing relational database queries. High-performance algorithms for basic relational operations are explored. Efficient composition of these algorithms to process whole queries is also investigated. Regularity and uniformity are necessary to make highly parallel computing cost-effective. An efficient primitive, called POP-SORT, is proposed to unify relational operations such as sorting, duplicate removal, union, intersection, and difference. The latter three operations may even take multisets as operands. POP-SORT is based on a simple scheme that adapts any highly parallel and regular sorting algorithm to perform all of these database operations. The primitive compares favorably, in terms of time complexity, with existing algorithms for the five operations. The optimality of POP-SORT is also proved for a restricted but reasonable type of parallel computation. Furthermore, sublinear time performance is possible for join operations if the argument relations are preconditioned by POP-SORT. For processing a whole query, the operation tree parsed from the query can be executed by composing the individual algorithms for its operations. The Configurable, Highly Parallel (CHiP) computers have the flexibility to provide programmable processor interconnections for composing algorithms. Query embedding is a method of executing whole operation trees to exploit maximum parallelism on the CHiP computers. It involves processor allocation and the embedding of appropriate interconnections. With bitonic POP-SORT, which is a generalization of Batcher's bitonic merge sort, query embedding can be simplified significantly.
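
The abstract does not spell out how POP-SORT works, so the following is only a rough, sequential Python sketch of the general idea it describes: run a regular sorting algorithm (here Batcher's bitonic merge sort) over tagged tuples, then derive duplicate removal, union, intersection, and difference from runs of equal neighbors. All function names are hypothetical, and the multiset semantics (union keeps the larger multiplicity, intersection the smaller, difference subtracts multiplicities) are an assumption rather than the thesis's definition.

```python
from itertools import groupby


def _bitonic_merge(seq, ascending):
    """Merge a bitonic sequence (length a power of two) into sorted order."""
    n = len(seq)
    if n <= 1:
        return list(seq)
    seq = list(seq)
    half = n // 2
    # One compare-exchange stage: every pair (i, i + half) is independent,
    # which is what makes the network regular and parallelizable.
    for i in range(half):
        if (seq[i] > seq[i + half]) == ascending:
            seq[i], seq[i + half] = seq[i + half], seq[i]
    return (_bitonic_merge(seq[:half], ascending)
            + _bitonic_merge(seq[half:], ascending))


def bitonic_sort(items, ascending=True):
    """Batcher's bitonic merge sort; len(items) must be a power of two."""
    n = len(items)
    if n <= 1:
        return list(items)
    half = n // 2
    first = bitonic_sort(items[:half], True)    # ascending half
    second = bitonic_sort(items[half:], False)  # descending half
    return _bitonic_merge(first + second, ascending)


def sort_padded(rows):
    """Sort any number of rows by padding up to a power of two (assumption)."""
    n = 1
    while n < len(rows):
        n *= 2
    # Flag 0 marks real rows, flag 1 marks padding that sorts to the end.
    padded = [(0, r) for r in rows] + [(1, None)] * (n - len(rows))
    return [r for flag, r in bitonic_sort(padded) if flag == 0]


def remove_duplicates(rows):
    """After sorting, duplicates are adjacent; keep one row per run."""
    return [key for key, _ in groupby(sort_padded(list(rows)))]


def multiset_op(a, b, op):
    """Union, intersection, or difference of two multisets via one sort."""
    # Tag each row with its source operand so runs can be counted per side.
    tagged = [(v, 0) for v in a] + [(v, 1) for v in b]
    out = []
    for value, run in groupby(sort_padded(tagged), key=lambda t: t[0]):
        tags = [tag for _, tag in run]
        na, nb = tags.count(0), tags.count(1)
        count = {
            "union": max(na, nb),
            "intersection": min(na, nb),
            "difference": max(na - nb, 0),
        }[op]
        out.extend([value] * count)
    return out
```

In this sketch the only data-dependent work after sorting is a comparison between neighboring rows, which suggests why a single sorting primitive could serve all five operations. The parallel algorithm, its complexity bounds, and its CHiP embedding are the subject of the thesis itself and are not reproduced here.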

Degree

Ph.D.

Subject Area

Computer science
