Reuse distance-based shared-memory parallel program analysis

Derek L Schuff, Purdue University

Abstract

As multicore processors implementing shared-memory programming models have become commonplace, analysis tools for shared-memory programs have become increasingly important. One common general-purpose analysis method is reuse distance, which measures locality in application memory reference behavior and is used for predicting cache performance, driving compiler-based optimization, and supporting visualization and manual optimization of programs. This thesis presents and validates methods to extend reuse distance analysis of application locality characteristics to shared-memory multicore platforms by accounting for invalidation-based cache coherence and inter-core cache sharing. Existing reuse distance analysis methods track the number of distinct addresses referenced between reuses of the same address by a given thread, but do not model the effects of data references by other threads. This thesis shows several methods to keep reuse stacks consistent so that they account for invalidations and cache sharing, either as references arise in a simulated execution or at synchronization points. To decrease the performance impact of analysis and allow analysis of long-running programs, this thesis presents a sampled, parallelized method of measuring reuse distance profiles for multithreaded programs, modeling private and shared cache configurations. The sampling technique allows the analysis to spend much of its execution in a fast, low-overhead mode, and allows the use of a new measurement method, since sampled analysis does not need to consider the full state of the reuse stack. Finally, the analysis is extended to model the effect of hardware prefetchers, which do not exploit program locality but have a significant impact on cache performance.
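
As an illustration of the single-thread baseline that the abstract describes (the number of distinct addresses referenced between reuses of the same address), the following is a minimal Python sketch. It is not the thesis's implementation: the function names reuse_distances and predicted_misses, the list-based trace format, and the simple linear-scan LRU stack are assumptions made for this example. It also does not model invalidations, cache sharing, sampling, or prefetching. The miss estimate relies on the standard property that a fully associative LRU cache holding C lines hits exactly when the reuse distance is less than C.

```python
from collections import OrderedDict


def reuse_distances(trace):
    """Return the reuse distance of each reference in a single-thread trace.

    The distance is the number of distinct addresses referenced since the
    previous reference to the same address. First-time references get None.
    (Hypothetical helper for illustration only.)
    """
    stack = OrderedDict()  # keys ordered from least to most recently used
    distances = []
    for addr in trace:
        if addr in stack:
            keys = list(stack)
            # Every address after addr in LRU order was touched since its last use.
            distances.append(len(keys) - 1 - keys.index(addr))
            stack.move_to_end(addr)  # addr becomes the most recently used
        else:
            distances.append(None)  # cold (first-time) reference
            stack[addr] = True
    return distances


def predicted_misses(distances, capacity):
    """Count misses for a fully associative LRU cache with `capacity` lines.

    A reference misses if it is cold or if its reuse distance is at least
    the cache capacity. (Hypothetical helper for illustration only.)
    """
    return sum(1 for d in distances if d is None or d >= capacity)


if __name__ == "__main__":
    trace = ["a", "b", "c", "a", "b", "b"]
    dists = reuse_distances(trace)
    print(dists)
    print(predicted_misses(dists, capacity=2))
    print(predicted_misses(dists, capacity=3))
```

The linear scan costs time proportional to the stack depth per reference, which keeps the sketch short; practical tools typically use tree-based structures to reduce that cost.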

Degree

Ph.D.

Advisors

Pai, Purdue University.

Subject Area

Computer Engineering
