Compiler techniques for speculative execution

Seon Wook Kim, Purdue University

Abstract

The major specific contributions are:

(1) We introduce a new compiler analysis to identify memory accesses that do not need to be speculatively buffered, and we develop an algorithm and a compiler infrastructure for instruction labeling that eliminate unnecessary buffering of speculative data. Our results show that, for our benchmarks, over 60% of the references in non-parallelizable code sections do not need to be speculatively buffered.

(2) We propose a compiler-assisted speculative execution (CASE) model that allows the compiler to communicate the idempotency property of code regions to the hardware, minimizing the speculative state that must be maintained to preserve the sequential semantics of the program.

(3) We introduce code transformation and generation techniques that exploit both implicit and explicit thread-level parallelism within a single application on our initial research prototype speculative microarchitecture, Multiplex. In this architecture, the compiler explicitly identifies code sections that it can prove to be parallel; the remaining sections are treated as speculative regions, because the compiler cannot guarantee that they can execute in parallel. The speculative hardware then determines at runtime whether parallelism among these regions can be exploited implicitly. We develop a compiler heuristic that decides the thread execution mode at compile time and at runtime in order to reduce explicit threading overhead. As a result, we improve speedup by up to 155.0%, and by 54.8% on average, over an implicit-only architecture.

(4) We describe the compiler infrastructure that integrates the Polaris parallelizing compiler with a code generator and passes the compiler's program analysis information to the hardware. The Polaris parallelizing compiler is used for thread selection and generation; the code generator conveys information from Polaris to the hardware, such as thread and access attributes, to reduce the intrinsic overhead of speculative execution. Each compiler component is tightly integrated into the compilation process, so that thread-level parallelism is exploited and access attributes are labeled in a fully automated manner, whereas related projects evaluate performance on only parts of applications. (Abstract shortened by UMI.)
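
Contribution (1) hinges on classifying each memory reference inside a speculative region as needing hardware buffering or not. The C++ sketch below illustrates the general shape of such a classification; the MemRef structure, the attribute names, and the analysis predicates are hypothetical stand-ins for the dissertation's Polaris-based analysis, not its actual interface.

    #include <iostream>
    #include <string>
    #include <vector>

    // Hypothetical summary of what a compiler analysis might know about one
    // memory reference inside a speculative (non-parallelizable) region.
    struct MemRef {
        std::string name;          // symbolic name of the accessed variable
        bool is_read_only;         // never written inside the region
        bool is_thread_private;    // written before read on every path (privatizable)
        bool provably_independent; // subscripts proven non-overlapping across threads
    };

    // Access attributes that a code generator could attach to each load/store.
    enum class AccessAttr {
        NonSpeculative, // safe: bypass the speculative buffer
        Speculative     // must be buffered until the thread commits
    };

    // Minimal sketch of the classification rule: a reference needs no speculative
    // buffering if it can never cause a cross-thread dependence violation or
    // require rollback of memory state.
    AccessAttr classify(const MemRef& r) {
        if (r.is_read_only || r.is_thread_private || r.provably_independent)
            return AccessAttr::NonSpeculative;
        return AccessAttr::Speculative;
    }

    int main() {
        std::vector<MemRef> refs = {
            {"coef", true,  false, false},  // read-only table
            {"tmp",  false, true,  false},  // per-iteration scratch variable
            {"a[i]", false, false, false},  // possibly dependent access
        };
        for (const auto& r : refs)
            std::cout << r.name << " -> "
                      << (classify(r) == AccessAttr::NonSpeculative
                              ? "non-speculative" : "speculative")
                      << '\n';
    }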
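Contribution (2) relies on the notion of idempotency: a region that can safely be re-executed from its entry does not need its memory state buffered for rollback. Below is a minimal sketch of a conservative idempotency check, assuming the analysis can compute per-region read and write sets; the region representation and set contents are invented for illustration.

    #include <algorithm>
    #include <iostream>
    #include <iterator>
    #include <set>
    #include <string>

    // Hypothetical per-region summary sets that a compiler analysis could provide.
    struct RegionSummary {
        std::set<std::string> exposed_reads; // locations read before any write in the region
        std::set<std::string> writes;        // locations written in the region
    };

    // A region is idempotent (safe to re-execute from its entry) if it never
    // overwrites a location whose incoming value it consumes. A conservative
    // sufficient condition: writes and upward-exposed reads are disjoint.
    bool is_idempotent(const RegionSummary& r) {
        std::set<std::string> overlap;
        std::set_intersection(r.writes.begin(), r.writes.end(),
                              r.exposed_reads.begin(), r.exposed_reads.end(),
                              std::inserter(overlap, overlap.begin()));
        return overlap.empty();
    }

    int main() {
        RegionSummary r1{{"x", "y"}, {"z"}};      // reads x,y; writes z -> idempotent
        RegionSummary r2{{"x"},      {"x", "z"}}; // overwrites its own input x -> not
        std::cout << std::boolalpha
                  << is_idempotent(r1) << ' ' << is_idempotent(r2) << '\n';
    }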
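Contribution (3) mentions a compile-time/runtime heuristic that chooses between explicit (compiler-parallelized) and implicit (speculative) threading so that small loops do not pay explicit-threading overhead. The sketch below shows the general form such a decision could take; the threshold value and cost model are illustrative assumptions, not the dissertation's actual heuristic.

    #include <cstdint>
    #include <iostream>

    enum class ThreadMode { Explicit, Implicit };

    // Hypothetical cost parameter: per-invocation overhead of spawning
    // explicit threads, in cycles.
    constexpr std::uint64_t kExplicitOverheadCycles = 2000;

    // Decision sketch: run a parallel loop as explicit threads only if the
    // estimated work per thread amortizes the threading overhead; otherwise
    // fall back to implicit speculative execution.
    ThreadMode choose_mode(std::uint64_t trip_count,
                           std::uint64_t cycles_per_iteration,
                           unsigned num_threads) {
        std::uint64_t work_per_thread =
            (trip_count * cycles_per_iteration) / num_threads;
        return work_per_thread > kExplicitOverheadCycles ? ThreadMode::Explicit
                                                         : ThreadMode::Implicit;
    }

    int main() {
        // The trip count may only be known at runtime, so the same predicate
        // can also be emitted as a guard around the explicitly threaded loop.
        std::cout << (choose_mode(100000, 10, 4) == ThreadMode::Explicit) << ' '
                  << (choose_mode(64, 10, 4) == ThreadMode::Explicit) << '\n';
    }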
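Contribution (4) has the code generator pass thread and access attributes from Polaris to the hardware. One plausible way to picture this is region markers emitted around each selected code section; the pseudo-op mnemonics below are invented for illustration and do not correspond to Multiplex's real instruction encoding.

    #include <cstdio>

    // Hypothetical thread attribute attached to a region by the code generator.
    enum class ThreadAttr { Parallel, Speculative };

    // Sketch: bracket each selected region with marker pseudo-ops so the
    // hardware knows whether to dispatch it as an explicit parallel thread
    // or as a speculative thread.
    void emit_region(int region_id, ThreadAttr attr) {
        std::printf("  fork.%s   R%d   ; begin region %d\n",
                    attr == ThreadAttr::Parallel ? "par" : "spec",
                    region_id, region_id);
        std::printf("  ...loop body...\n");
        std::printf("  join      R%d   ; end region %d\n", region_id, region_id);
    }

    int main() {
        emit_region(1, ThreadAttr::Parallel);    // compiler-proven parallel loop
        emit_region(2, ThreadAttr::Speculative); // non-analyzable code section
    }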

Degree

Ph.D.

Advisors

Rudolf Eigenmann, Purdue University.

Subject Area

Computer science; Electrical engineering
