Parallel computers constructed using conventional processors offer the potential to achieve large improvements in execution speed at reasonable cost; however, these machines tend to implement only coarse-grain MIMD parallelism efficiently. To achieve the best possible speedup through parallel execution, a computer must be capable of effectively using all the different types of parallelism that exist in each program. A combination of SIMD, VLIW, and MIMD parallelism, at a variety of granularity levels, exists in most applications; thus, hardware that can support multiple types of parallelism can achieve better performance with a wider range of codes. In this paper, we introduce a new hardware barrier architecture that provides the full DBM functionality we discussed in [OKD90a], but can be implemented with much simpler hardware. This mechanism can be used to efficiently support multi-mode, moderate-width parallelism with instruction-level granularity (i.e., synchronization cost is approximately that of one LOAD instruction), as described in the companion paper [CoD94].
Parallel Architecture, Dynamic Barrier Synchronization, MIMD/VLIW/SIMD, Mixed-Mode Computation, Partitionable Systems, Instruction-Level Parallelism