Automatic Sharing Classification and Timely Push for Cache-coherent Systems
Document Type: Unpublished Paper
This paper proposes and evaluates Push Distance Migration (PDM), a dynamic scheme that preemptively sends data from producers to consumers to minimize critical-path communication latency. PDM uses small hardware buffers to detect sharing patterns and timing requirements at run time. The scheme applies to both intra-node and inter-socket directory-based shared-memory networks.
We integrate PDM into a MOESI cache-coherence protocol, using heuristics to detect different data-sharing patterns, including broadcast, producer/consumer, and migratory sharing. Using the PARSEC and SPLASH-2 benchmark suites, we show that our scheme significantly reduces communication latency in NUMA configurations and achieves up to 46% performance improvement, with at most 0.5% on-chip storage overhead. Evaluated alongside existing prefetch schemes, PDM in most cases either outperforms prefetching alone or complements it, yielding up to 15% additional improvement when the two are combined.