Virtualize or Keep It Real? A Study of GPU Instruction Set Architectures and Their Effects in Simulations

Akshay Jain, Purdue University

Abstract

Modern Graphics Processing Units (GPUs) are used for accelerating highly parallel compute workloads. They are widely used in both industry and academia for accelerating regular applications like those often found in machine learning. These programs employ a two-step compilation process, where the high-level language is first converted into a vendor-defined intermediate representation called a Virtual Instruction Set Architecture (vISA). Once the hardware configuration is known, the vISA is compiled into the specific machine ISA (mISA) so that it can run on the GPU hardware. The vISA acts as a level of indirection between statically compiled binaries and the real mISA that runs on hardware. For the last decade, researchers in academia and industry have used cycle-level GPU architecture simulators to evaluate future designs. Open-source simulators, like GPGPU-Sim, support running the vISA and have limited support for mISAs. The community has generally accepted the results of vISA simulation, due to the relatively high correlation numbers reported on published simulators, but no in-depth analysis of either vISA or mISA simulation has been performed. This thesis analyzes numerous aspects of the architecture, validating the simulation results against real hardware using both the vISA and mISA. Importantly, this thesis asks and answers several key questions: (1) What are the measurable differences between vISA and mISA simulation? (2) What impact will the differences in these ISAs have on published architecture research? (3) How much does the currently accepted correlation of a single architecture against multiple workloads indicate the accuracy of a simulator meant to model different architectures? We study the virtual and machine ISA implementations from Nvidia on GPGPU-Sim, the most popular open-source GPGPU simulator used in academia.
We examine several important architectural characteristics in both vISA and mISA simulation, correlating these values against real hardware. We then perform a case study with a particular work from the GPU architecture domain, Cache Conscious Warp Scheduling (CCWS), demonstrating that the choice of ISA would have had little impact on the end results of that work. Finally, we draw conclusions on the current state of GPU simulation methodology from our data. We demonstrate that while there are a number of critical differences between the vISA and mISA that can impact proposed hardware changes, a subset of work will be equally valid in both vISA and mISA simulation. We also make the surprising observation that, when using currently accepted GPU simulation methodology, there are instances where the vISA can more accurately model the baseline hardware than the mISA.

Degree

M.S.E.C.E.

Advisors

Rogers, Purdue University.

Subject Area

Computer Engineering
