Data dependence speculation allows a compiler to relax the requirement that tasks be data-independent before they are issued in parallel, increasing the potential for automatic extraction of parallelism from sequential programs. This paper proposes hardware mechanisms that support a data-dependence speculative distributed shared-memory (DDSM) architecture, enabling speculative parallelization of programs with irregular data structures and inherent coarse-grain parallelism. Efficient support for coarse-grain tasks requires large buffers for speculative data; DDSM leverages cache and directory structures to provide large buffers that are managed transparently to applications. The proposed cache and directory extensions provide support for distributed speculative versions of cache blocks, run-time detection of dependence violations, and program-order reconciliation of cache blocks. This paper describes the DDSM architecture and presents a simulation-based evaluation of its performance on five benchmarks chosen from the SPEC95 and Olden suites. The proposed system yields simulated speedups of up to 12.5x on a 16-node configuration for programs with coarse-grain speculative windows (millions of instructions and hundreds of KBytes of speculative data).
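
The three mechanisms named above (speculative versions, run-time violation detection, and program-order reconciliation) can be illustrated with a small software model. This is only a sketch under stated assumptions, not the DDSM hardware protocol: the names `SpecTask`, `SpecMemory`, `load`, `store`, and `commit_in_order` are hypothetical, memory is modeled as a dictionary of addresses rather than cache blocks, and violation detection is conservative (any store by an earlier task to an address a later task has already read squashes the later task).

```python
from dataclasses import dataclass, field


@dataclass
class SpecTask:
    """One speculative task (hypothetical model, not the paper's hardware)."""
    order: int                                   # position in sequential program order
    writes: dict = field(default_factory=dict)   # buffered speculative versions: addr -> value
    reads: set = field(default_factory=set)      # addresses read before being written locally
    squashed: bool = False                       # set when a dependence violation is detected


class SpecMemory:
    """Toy shared memory holding committed state plus per-task speculative versions."""

    def __init__(self, initial):
        self.committed = dict(initial)
        self.tasks = {}

    def start(self, order):
        task = SpecTask(order)
        self.tasks[order] = task
        return task

    def load(self, task, addr):
        # A task always sees its own speculative version first.
        if addr in task.writes:
            return task.writes[addr]
        # Otherwise record the speculative read so later stores can be checked against it.
        task.reads.add(addr)
        # Forward the version from the nearest earlier task that wrote this address.
        for order in sorted((o for o in self.tasks if o < task.order), reverse=True):
            earlier = self.tasks[order]
            if addr in earlier.writes:
                return earlier.writes[addr]
        return self.committed[addr]

    def store(self, task, addr, value):
        # Stores are buffered, never written to committed state directly.
        task.writes[addr] = value
        # Conservative violation check: a later task already read this address,
        # so it may have consumed a stale value and must be re-executed.
        for other in self.tasks.values():
            if other.order > task.order and addr in other.reads:
                other.squashed = True

    def commit_in_order(self):
        """Reconcile versions in program order; return the task orders to re-execute."""
        orders = sorted(self.tasks)
        redo = []
        for i, order in enumerate(orders):
            task = self.tasks[order]
            if task.squashed:
                # The squashed task and every later task are discarded and retried.
                redo = orders[i:]
                break
            self.committed.update(task.writes)
        self.tasks.clear()
        return redo
```

In this model, if task 1 loads `x` and task 0 later stores to `x`, task 1 is marked squashed, and `commit_in_order` commits only task 0's version and reports task 1 for re-execution. Buffering stores per task is what keeps a mis-speculated task from corrupting committed state.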

Date of this Version

May 2000