Scalable effort hardware

Vinay K Chippa, Purdue University

Abstract

For decades, the integrated circuit (IC) design process, which refines a design across levels of abstraction from algorithms to architectures to gates and circuits, has adhered to the principle that strict equivalence (numerical or Boolean) with the design specification must be maintained throughout. Indeed, verification methodologies and tools are used to ensure such equivalence. In this dissertation, we propose a new design paradigm that departs from this long-held axiom. This research is motivated by "inherent application resilience", a property exhibited by a growing class of applications from existing and emerging domains such as graphics, multimedia, recognition, mining, and search. This property enables these applications to produce acceptable outputs despite incorrectness in their underlying computations, thereby offering entirely new avenues for optimizing hardware power and performance. We demonstrate that, for a wide range of applications, the traditional approach of maintaining perfect equivalence through the design process, from algorithm to architecture to gates to transistors, is overkill.

We propose scalable effort hardware as an approach to tap the reservoir of application resilience and translate it into highly efficient hardware implementations. Scalable effort hardware is characterized by three key principles:
1) The notion of functional correctness is redefined from Boolean equivalence to acceptable "quality" of the outputs.
2) Hardware systems are designed to provide a tradeoff between quality and efficiency (performance, power consumption, etc.). This tradeoff may be explored during system operation, based on the application context and the requirements on output quality.
3) The objective of scalable effort hardware design is to provide the most favorable quality-efficiency tradeoff possible, i.e., for a given output quality, the performance should be as high as possible or the energy consumed should be as low as possible.

We first motivate scalable effort hardware in a broader context by analyzing inherent application resilience in a benchmark suite consisting of widely used applications from the domains of recognition, mining, and search. We propose a resilience characterization framework that partitions an application into resilient and sensitive parts, and characterizes the resilient parts to quantify the impact of effort scaling mechanisms. Based on the application of the framework to the benchmark suite, we present key insights that serve as guidelines for design techniques that exploit application resilience.

We next describe a design approach for scalable effort hardware, which is based on (i) the identification of scaling mechanisms at various levels of design abstraction (algorithm, architecture, and circuit), which are used to modulate the effort expended by the hardware implementation, and (ii) the synergistic co-optimization of these scaling mechanisms to best exploit application resilience and achieve desirable efficiency-quality tradeoffs. We have designed an energy-efficient Recognition and Mining (RM) processor based on the proposed scalable effort design approach and fabricated it in TSMC's 65nm process node. We present measurement results from this chip and demonstrate that scalable effort hardware can achieve energy savings of 2-20X with minimal impact on output quality.

Another key contribution of this dissertation is the concept of Dynamic Effort Scaling (DES), which refers to dynamic management of the scaling mechanisms employed in scalable effort hardware. We argue the need for DES by observing that the degree of application resilience often varies significantly across applications, across datasets, and even within a dataset. We propose a conceptual framework for DES by formulating it as a feedback control problem, wherein the scaling mechanisms are regulated with the goal of maintaining output quality within a specified limit. We demonstrate that further energy savings of 2-3X can be achieved by employing DES on top of scalable effort hardware.

Finally, we present a scalable effort Stochastic Recognition and Mining (StoRM) processor that employs stochastic computing to realize resilient and compute-intensive kernels of RM applications. Stochastic computing uses a stream of bits to represent data, where the probability of ones in the bit-stream is determined by the numerical magnitude of the data. We identify stochastic bit-stream length modulation, which determines the accuracy of the stochastic computing output, as a highly effective effort scaling mechanism to optimize the energy consumption of the StoRM processor for a given output quality. We propose various circuit- and architecture-level design techniques to overcome the intrinsic limitations of stochastic computing and improve the energy efficiency of the StoRM processor. We present results from the evaluation of a suite of RM applications on the StoRM processor and show that the scalable effort StoRM processor is 5X more efficient in terms of energy-latency product than a conventional implementation.
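
The role of bit-stream length as an effort knob in stochastic computing can be illustrated with a small software sketch. The code below is not taken from the dissertation and does not model the StoRM processor; it assumes the common unipolar encoding (a value in [0, 1] is the probability of a 1 bit) and multiplication by bitwise AND of two independently generated streams. All function names are hypothetical and chosen only for this illustration.

```python
import random


def encode(value, length, rng):
    """Unipolar encoding (assumed): each bit is 1 with probability `value`."""
    return [1 if rng.random() < value else 0 for _ in range(length)]


def decode(bits):
    """Recover the value as the fraction of ones in the stream."""
    return sum(bits) / len(bits)


def sc_multiply(a, b, length, rng):
    """Multiply two values in [0, 1] by ANDing two independent streams."""
    stream_a = encode(a, length, rng)
    stream_b = encode(b, length, rng)
    return decode([x & y for x, y in zip(stream_a, stream_b)])


def mean_abs_error(a, b, length, trials=200, seed=0):
    """Average absolute error of the stochastic product over several trials."""
    rng = random.Random(seed)
    exact = a * b
    total = sum(abs(sc_multiply(a, b, length, rng) - exact) for _ in range(trials))
    return total / trials


if __name__ == "__main__":
    # Longer streams cost more bit operations but give a more accurate result.
    for length in (16, 64, 256, 1024):
        print(length, round(mean_abs_error(0.6, 0.5, length), 4))
```

In this sketch, shortening the stream reduces the number of bit operations (a rough stand-in for energy and latency) at the cost of a noisier result, which is the kind of quality-efficiency tradeoff the abstract describes.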

Degree

Ph.D.

Advisors

Raghunathan, Purdue University.

Subject Area

Electrical engineering

