Researchers have proposed clustered microarchitectures to combine high performance with high energy efficiency. Typically, clustered microarchitectures offer fast local bypasses (i.e., value forwarding between instructions) within clusters, while global bypasses between clusters take more than one cycle. With communication locality (i.e., most communication occurs within a cluster), clustered designs capture the benefits of both improved instructions per cycle and increased clock frequency. Traditional clustered microarchitectures are implemented by partitioning the register file and associated functional units into clusters. This work demonstrates an alternative technique, incomplete bypassing, that achieves similar clustering. Incomplete-bypass-based clustering resembles traditional clustering in that it creates groups of functional units in which intra-group communication occurs within a single cycle over fast bypass wires and inter-group communication takes more than one cycle. One key difference is that in traditional clustered microarchitectures, inter-cluster communication takes place over global buses, whereas incomplete-bypass designs achieve inter-group communication via the register file. The incomplete-bypass-based clustered microarchitecture is shown to achieve higher performance (10% speedup) and better energy efficiency than traditional clustered microarchitectures.
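
The latency asymmetry described above can be illustrated with a minimal sketch. This is not the thesis's simulator or design. The names `FunctionalUnit`, `forwarding_latency`, and `chain_latency` are hypothetical, and the inter-group latency of 3 cycles is an assumed placeholder, since the abstract states only that inter-group communication takes more than one cycle.

```python
from dataclasses import dataclass

# Stated in the abstract: intra-group forwarding completes within a single cycle.
LOCAL_BYPASS_CYCLES = 1
# ASSUMPTION: illustrative value only. The abstract says "more than one cycle"
# for inter-group communication (which goes through the register file).
INTER_GROUP_CYCLES = 3


@dataclass(frozen=True)
class FunctionalUnit:
    """Hypothetical model of a functional unit assigned to a bypass group."""
    name: str
    group: int


def forwarding_latency(producer: FunctionalUnit, consumer: FunctionalUnit) -> int:
    """Cycles for a value to travel from the producer to the consumer."""
    if producer.group == consumer.group:
        return LOCAL_BYPASS_CYCLES   # fast local bypass wires
    return INTER_GROUP_CYCLES        # slower path between groups


def chain_latency(units: list[FunctionalUnit]) -> int:
    """Total forwarding latency along a chain of dependent instructions."""
    return sum(forwarding_latency(a, b) for a, b in zip(units, units[1:]))


alu0 = FunctionalUnit("alu0", group=0)
alu1 = FunctionalUnit("alu1", group=0)
alu2 = FunctionalUnit("alu2", group=1)

local_chain = chain_latency([alu0, alu1, alu0])    # stays inside group 0
crossing_chain = chain_latency([alu0, alu1, alu2])  # crosses groups once
```

Under these assumed latencies, a dependence chain that stays inside one group pays only the single-cycle bypass per hop, which is why communication locality matters for both clustering styles.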

Date of this Version

December 2007


Electrical and Computer Engineering

Master of Science in Electrical and Computer Engineering

Advisor 1 or Chair of Committee

M. S. Thottethodi

Committee Member 1

T. N. Vijaykumar

Committee Member 2

Y. Lu