Improving multicore resource efficiency and performance

Syed Ali Raza Jafri, Purdue University

Abstract

With clock speeds stagnating for the last few years and multi-cores having replaced uniprocessors, software development must now turn towards shared memory parallel programming to continue enhancing performance. Shared memory parallel programming, however, is significantly more challenging than its sequential counterpart. Conventional shared memory parallel programs can fall victim to deadlocks, livelocks, and data races, which are hard to detect and debug. Aside from programming complexity, chip multiprocessors need a scalable, low-latency, high-bandwidth interconnect fabric to deliver performance. Conventional interconnects such as crossbars and buses can deliver low latency but do not scale with an increasing number of cores. Researchers have proposed the transactional memory (TM) model to address the issue of multi-core programmability, and multi-hop on-chip networks to provide low-latency, high-bandwidth communication among cores. However, these designs make inefficient use of resources and also fall victim to performance bottlenecks. TM designs require a large amount of memory hierarchy space to store metastate. This design requirement poses a significant barrier to TM adoption by commercial vendors. TM designs also suffer from degraded performance because of current conflict resolution policies. Similarly, on-chip networks consume a significant fraction of total processor energy and suffer from performance bottlenecks such as head-of-line blocking and poor switch arbitration. In my dissertation, I make common-case observations to propose novel techniques that considerably reduce resource usage and significantly improve performance.

Degree

Ph.D.

Advisors

Thottethodi, Purdue University.

Subject Area

Computer Engineering

