Abstract

Graphics Processing Units (GPUs) are growing increasingly popular as general-purpose compute accelerators. GPUs are best suited for applications with abundant data parallelism, in which the computation expressed as a single thread can be applied over a large set of data items. One key constraint that affects application performance on GPUs is that the underlying hardware is single-instruction, multiple-data (SIMD) hardware, which requires parallel instructions from the multiple threads to execute in lock-step. The benefits of lock-step execution can be seriously degraded if the threads diverge (because of memory accesses or branches). Specifically, in the case of memory, the addresses from each thread in a SIMD "wavefront/warp" must be coalesced to enable parallel memory access and minimize divergence. The general coalescing problem assumes an arbitrary address distribution, which can make coalescing slow. This thesis aims to exploit intra-warp address monotonicity (as measured in a recent study by Holic) to achieve fast memory coalescing. Holic's study reveals that intra-warp addresses are monotonically increasing or decreasing in the common case. The key contributions of this thesis are twofold. First, I design novel hardware coalescing mechanisms to achieve fast coalescing and quantify the area and delay of my coalescing designs. Second, I quantify the impact of fast coalescing on overall GPU performance for a suite of GPU benchmarks.
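To illustrate why monotonicity can help, the following is a minimal software sketch, not the hardware mechanism designed in the thesis. It assumes a hypothetical 128-byte memory segment (`LINE_BYTES`) and hypothetical function names. With arbitrary addresses, each lane's segment has to be checked against every segment collected so far; when the lane addresses are monotonic, lanes that share a segment are adjacent, so comparing each lane only with its predecessor is enough.

```python
from typing import List

LINE_BYTES = 128  # assumed segment size for illustration; not taken from the thesis


def is_monotonic(addrs: List[int]) -> bool:
    """Return True if lane addresses are non-decreasing or non-increasing."""
    pairs = list(zip(addrs, addrs[1:]))
    return all(a <= b for a, b in pairs) or all(a >= b for a, b in pairs)


def coalesce_general(addrs: List[int]) -> List[int]:
    """Arbitrary order: match each lane against all segments seen so far."""
    segments: List[int] = []
    for addr in addrs:
        seg = addr // LINE_BYTES
        if seg not in segments:  # search over every collected segment
            segments.append(seg)
    return segments


def coalesce_monotonic(addrs: List[int]) -> List[int]:
    """Monotonic order: compare each lane only with the previous segment."""
    segments: List[int] = []
    for addr in addrs:
        seg = addr // LINE_BYTES
        if not segments or segments[-1] != seg:  # single neighbor comparison
            segments.append(seg)
    return segments


# A 32-lane warp reading 4-byte words with unit stride touches one segment.
unit_stride = [4 * lane for lane in range(32)]
# A 64-byte stride spreads the same warp over 16 segments.
wide_stride = [64 * lane for lane in range(32)]
```

The neighbor-only pass is valid only when `is_monotonic` holds; for a non-monotonic warp such as `[0, 256, 4]` it would report segment 0 twice, so a design along these lines would still need a fallback to the general path.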

Degree Type

Thesis

Degree Name

Master of Science in Electrical and Computer Engineering (MSECE)

Department

Electrical and Computer Engineering

Committee Chair

Mithuna S. Thottethodi

Date of Award

Spring 2015

Recommended Citation

Rodriguez-Simmonds, Hector, "Exploiting intra-warp address monotonicity for fast memory coalescing in GPUs" (2015). Open Access Theses. 602.
https://docs.lib.purdue.edu/open_access_theses/602

First Advisor

Mithuna S. Thottethodi

Committee Member 1

Anand Raghunathan

Committee Member 2

Milind Kulkarni

Included in

Computer Engineering Commons
