Journal of Instruction-Level Parallelism 2011
vol 13, pp. 1--28, March, 2011
Loads that miss in the L1 or L2 cache and wait for their data at the head of the ROB cause significant slowdown in the form of commit stalls. We observe that most of these commit stalls are caused by a small set of loads, referred to as LIMCOS (Loads Incurring Majority of COmmit Stalls). We propose simple history-based classifiers that track the commit stalls suffered by loads in order to identify this small set. In this paper we study two prefetching enhancements enabled by these classifiers.
In the first enhancement, the classifiers are used to train the prefetcher to focus on the misses suffered by LIMCOS. This technique, referred to as focused prefetching, yields a 9.8% gain in IPC over a naive GHB-based delta-correlation prefetcher, along with a 20.3% reduction in memory traffic, for a set of 17 memory-intensive SPEC2000 benchmarks. Another important effect of focused prefetching is a 61% improvement in prefetch accuracy. We demonstrate that the proposed classification criterion performs better than existing criteria such as criticality and delinquent loads. We also show that the criterion of focusing on commit stalls is robust across cache levels and can be applied to any prefetcher without modifying the prefetcher itself. We further demonstrate the positive impact of focused prefetching in a multi-core scenario. For global-history-based prefetchers, we demonstrate not only the applicability of focused prefetching but also the second enhancement based on classifiers: filtering of prefetches once they are generated.
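As an illustration of the idea only, the following software sketch shows how a history-based classifier might accumulate commit-stall cycles per load and then restrict prefetcher training to the resulting LIMCOS set. It is not the hardware design evaluated in the paper: the class name `CommitStallClassifier`, the `coverage_pct` threshold, the unbounded per-PC table, and the helper `should_train_prefetcher` are all assumptions made for this example.

```python
from collections import defaultdict


class CommitStallClassifier:
    """Hypothetical per-PC history of commit-stall cycles (illustrative only)."""

    def __init__(self, coverage_pct=80):
        # Assumed threshold: share of all stall cycles the selected set must cover.
        self.coverage_pct = coverage_pct
        self.stall_cycles = defaultdict(int)

    def record_stall(self, load_pc, cycles):
        """Charge `cycles` of commit stall to the load at `load_pc`."""
        self.stall_cycles[load_pc] += cycles

    def limcos(self):
        """Return the smallest set of load PCs, taken in descending order of
        stall cycles, that covers at least `coverage_pct` percent of all
        recorded commit-stall cycles."""
        total = sum(self.stall_cycles.values())
        if total == 0:
            return set()
        selected, covered = set(), 0
        ranked = sorted(self.stall_cycles.items(),
                        key=lambda item: item[1], reverse=True)
        for pc, cycles in ranked:
            # Integer comparison avoids floating-point rounding at the boundary.
            if covered * 100 >= self.coverage_pct * total:
                break
            selected.add(pc)
            covered += cycles
        return selected


def should_train_prefetcher(classifier, miss_pc):
    """Focused prefetching, as sketched here: only misses from loads in the
    LIMCOS set are forwarded to the (unmodified) prefetcher for training."""
    return miss_pc in classifier.limcos()


# Example: one load dominates the commit stalls.
clf = CommitStallClassifier(coverage_pct=80)
clf.record_stall(0x400, 900)
clf.record_stall(0x404, 50)
clf.record_stall(0x408, 50)
```

In this example only the load at `0x400` is classified as LIMCOS, so misses from the other two loads would not be used for training. Because the gating sits in front of the prefetcher, the prefetcher itself needs no changes, which mirrors the claim above.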