Date of Award
Spring 2015
Degree Type
Thesis
Degree Name
Master of Science in Electrical and Computer Engineering (MSECE)
Department
Electrical and Computer Engineering
First Advisor
Mithuna S. Thottethodi
Committee Chair
Mithuna S. Thottethodi
Committee Member 1
Anand Raghunathan
Committee Member 2
Milind Kulkarni
Abstract
Graphics Processing Units (GPUs) are increasingly popular as general-purpose compute accelerators. GPUs are best suited for applications with abundant data parallelism, in which the computation expressed as a single thread can be applied over a large set of data items. One key constraint that affects application performance on GPUs is that the underlying hardware is single-instruction, multiple-data (SIMD) hardware, which requires parallel instructions from the multiple threads to execute in lock-step. The benefits of lock-step execution can be seriously degraded if the threads diverge (because of memory accesses or branches). Specifically, in the case of memory, the addresses from each thread in a SIMD "wavefront/warp" must be coalesced to enable parallel memory access and minimize divergence.

General-purpose coalescing assumes an arbitrary address distribution, which can make it slow. This thesis aims to exploit intra-warp address monotonicity (as measured in a recent study by Holic) to achieve fast memory coalescing. Holic's study reveals that intra-warp addresses are monotonically increasing or decreasing in the common case. The key contributions of this thesis are twofold. First, I design novel hardware coalescing mechanisms to achieve fast coalescing and quantify the area and delay of my coalescing designs. Second, I quantify the impact of fast coalescing on overall GPU performance for a suite of GPU benchmarks.
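To illustrate why monotonic addresses can make coalescing cheaper, the following is a minimal software sketch and not the hardware mechanism designed in the thesis. It assumes a 128-byte memory line, and the function names (`is_monotonic`, `coalesce_general`, `coalesce_monotonic`, `coalesce`) are hypothetical. When a warp's addresses are monotonic, lanes that touch the same line are adjacent, so each lane only needs to be compared with its predecessor rather than with every line seen so far.

```python
from typing import List

LINE_BYTES = 128  # assumed memory-line size; not taken from the thesis


def is_monotonic(addrs: List[int]) -> bool:
    """True if the per-lane addresses never decrease or never increase."""
    pairs = list(zip(addrs, addrs[1:]))
    return all(a <= b for a, b in pairs) or all(a >= b for a, b in pairs)


def coalesce_general(addrs: List[int]) -> List[int]:
    """Arbitrary order: search all lines collected so far for each lane."""
    lines: List[int] = []
    for addr in addrs:
        line = addr // LINE_BYTES
        if line not in lines:  # linear search per lane
            lines.append(line)
    return lines


def coalesce_monotonic(addrs: List[int]) -> List[int]:
    """Monotonic order: equal lines are adjacent, so compare with the previous lane only."""
    lines: List[int] = []
    for addr in addrs:
        line = addr // LINE_BYTES
        if not lines or line != lines[-1]:
            lines.append(line)
    return lines


def coalesce(addrs: List[int]) -> List[int]:
    """Take the fast path when the warp's addresses are monotonic, else fall back."""
    if is_monotonic(addrs):
        return coalesce_monotonic(addrs)
    return coalesce_general(addrs)
```

For example, a 32-lane warp reading consecutive 4-byte words (`[4 * i for i in range(32)]`) collapses to a single line request under these assumptions, while a warp striding by 128 bytes needs 32 separate requests. Both paths return the same set of lines for monotonic input; only the amount of comparison work differs.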
Recommended Citation
Rodriguez-Simmonds, Hector, "Exploiting intra-warp address monotonicity for fast memory coalescing in GPUs" (2015). Open Access Theses. 602.
https://docs.lib.purdue.edu/open_access_theses/602