Abstract
Graphics Processing Units (GPUs) offer massive, highly efficient parallelism, making them an attractive target for computation-intensive applications. However, GPUs have a separate memory space, which introduces the complexity of manually handling explicit data movement between the GPU and CPU memory spaces. Although GPU kernels and libraries have made it easy to improve application performance by offloading computation to GPUs, it remains very difficult to manually optimize CPU-GPU communication across multiple kernel invocations and avoid redundant transfers when these kernels are used in complex applications.

In this thesis, we introduce SemCache, a semantics-aware GPU cache that automatically manages CPU-GPU communication and optimizes it by using caching to eliminate redundant transfers. It uses library semantics to determine the appropriate caching granularity for a given offloaded library (e.g., matrices). Our caching technique is efficient; it tracks only matrices instead of tracking every memory access at fine granularity. We applied SemCache to the Basic Linear Algebra Subprograms (BLAS) library to provide a GPU drop-in replacement library that requires no program rewriting or annotations.

SemCache++ extends SemCache to support offloading to multiple GPUs. SemCache++ is used to build the first multi-GPU drop-in replacement library that (a) uses virtual memory to automatically manage and optimize multi-GPU communication and (b) requires no program rewriting or annotations. SemCache++ also enables new features such as asynchronous transfers, parallel execution, and overlapping communication with computation. Experimental results show that our system can dramatically reduce redundant communication for real-world computational science applications and deliver significant performance improvements, beating GPU-based implementations such as MAGMA, CULA, CUBLAS, StarPU, and CUBLASXT.
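
The idea of caching at matrix granularity can be illustrated with a small toy model. The sketch below is not the thesis's implementation: the class `MatrixCache`, its three-state `Where` enum, and the explicit `cpu_read`/`cpu_write` hooks are hypothetical names introduced here for illustration, and transfers are only counted, not performed. It assumes each matrix carries a single validity state and that a copy is issued only when the side about to use the matrix holds a stale version.

```python
from enum import Enum


class Where(Enum):
    CPU = "cpu"      # only the host copy is up to date
    GPU = "gpu"      # only the device copy is up to date
    BOTH = "both"    # host and device copies agree


class MatrixCache:
    """Toy model (hypothetical): one validity state per matrix."""

    def __init__(self):
        self.state = {}      # matrix name -> Where
        self.transfers = 0   # number of simulated CPU<->GPU copies

    def _to_gpu(self, name):
        # Copy host -> device only if the device copy is stale or missing.
        if self.state.get(name, Where.CPU) is Where.CPU:
            self.transfers += 1
            self.state[name] = Where.BOTH

    def gpu_call(self, reads, writes):
        """Model an offloaded library call, e.g. C = A * B."""
        for name in reads:
            self._to_gpu(name)
        for name in writes:
            self.state[name] = Where.GPU   # result now lives on the device

    def cpu_read(self, name):
        # Copy device -> host only if the host copy is stale.
        if self.state.get(name, Where.CPU) is Where.GPU:
            self.transfers += 1
            self.state[name] = Where.BOTH

    def cpu_write(self, name):
        self.state[name] = Where.CPU       # device copy is now stale


cache = MatrixCache()
cache.gpu_call(reads=["A", "B"], writes=["C"])  # A and B are copied in
cache.gpu_call(reads=["A", "C"], writes=["D"])  # A is cached, C is already on the GPU
cache.cpu_read("D")                             # D is copied back on demand
print(cache.transfers)
```

In this toy run, a scheme that copies every input in and every output back on each call would issue six transfers for the two calls, while the per-matrix state issues three. How the real system detects CPU-side accesses is not modeled here; the explicit hooks stand in for that mechanism.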
Degree Type
Dissertation
Degree Name
Doctor of Philosophy (PhD)
Department
Computer Engineering
Committee Chair
MILIND KULKARNI
Date of Award
Spring 2015
Recommended Citation
Al-Saber, Nabeel, "SemCache: Semantics-Aware Caching for Efficient GPU Offloading" (2015). Open Access Dissertations. 413.
https://docs.lib.purdue.edu/open_access_dissertations/413
First Advisor
MILIND KULKARNI
Committee Member 1
ARUN PRAKASH
Committee Member 2
SAMUEL P. MIDKIFF
Committee Member 3
VIJAY S. PAI