MSc(Engg) Thesis, Department of Computer Science and Automation,
Indian Institute of Science, Bangalore, India, April 1999.
Distributed Shared Memory (DSM) systems provide a shared memory abstraction over a physically distributed collection of machines. They combine the ease of programming of shared memory systems with the scalability of distributed memory systems, while trying to minimize the drawbacks of both. A class of DSMs called Distributed Virtual Shared Memory (DVSM) systems provides the shared memory abstraction in software on distributed memory systems. These systems rely on a set of library routines, called the DSM layer, to implement the shared memory abstraction, the coherence of shared data, and the synchronization primitives. The application program, also called the application layer, of a DVSM system is written using these primitives. As a result, in a DVSM system there is a great degree of interaction between the DSM layer and the application program.
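The interaction between the two layers can be sketched with a minimal single-process simulation of a page-based DVSM system. All names here are hypothetical and the write-invalidate protocol is heavily simplified (real systems such as TreadMarks drive this through virtual-memory page faults and network messages); the sketch only shows how ordinary reads and writes in the application layer transfer control into the DSM layer:

```python
# Minimal simulation of a page-based DVSM "DSM layer".
# Hypothetical names; a real system uses page-fault handlers and messages.

class Page:
    def __init__(self, data=0):
        self.data = data
        self.valid = True

class Directory:
    """Home copies of all shared pages plus the list of nodes."""
    def __init__(self, n_pages):
        self.home = {a: Page() for a in range(n_pages)}
        self.nodes = []

    def invalidate_others(self, addr, writer_id):
        # Write-invalidate: every other node's cached copy becomes stale.
        for node in self.nodes:
            if node.node_id != writer_id and addr in node.pages:
                node.pages[addr].valid = False

class DSMNode:
    def __init__(self, node_id, directory):
        self.node_id = node_id
        self.directory = directory
        self.pages = {}            # locally cached page copies

    def _fault_in(self, addr):
        # "Page fault": control transfers to the DSM layer, which
        # fetches the up-to-date copy from the home directory.
        self.pages[addr] = Page(self.directory.home[addr].data)

    def read(self, addr):
        if addr not in self.pages or not self.pages[addr].valid:
            self._fault_in(addr)
        return self.pages[addr].data

    def write(self, addr, value):
        if addr not in self.pages or not self.pages[addr].valid:
            self._fault_in(addr)
        self.pages[addr].data = value
        self.directory.home[addr].data = value
        self.directory.invalidate_others(addr, self.node_id)

directory = Directory(n_pages=4)
n0, n1 = DSMNode(0, directory), DSMNode(1, directory)
directory.nodes = [n0, n1]

n0.write(2, 42)      # node 0 writes; any other cached copy is invalidated
print(n1.read(2))    # node 1 faults, fetches the new value -> 42
```

Every such fault executes DSM-layer code and touches DSM-layer data structures on the same processor as the application, which is the source of the cache interaction studied in this thesis.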
Studies in the past have alluded to possible performance degradation of the caches in DVSM systems due to the interaction between the DSM and application layers. This is a result of frequent transfers of control to DSM routines and pollution of the caches with DSM data that is not directly related to the application in execution. However, none of these studies quantified the cache pollution or the possible interference and cooperation between the application and DSM layers. A main contribution of this thesis is a quantitative study of the interference and cooperation between the application and DSM layers. We developed a detailed classification of the memory references made by applications executing in a DVSM system and identified references as improving or degrading cache performance. We conducted a similar study for the Translation Lookaside Buffer (TLB). Our studies showed that the interference and cooperation between the application and DSM layers in caches and TLBs are small. Further, although prefetching improves application performance in terms of execution time, aggressive prefetching degrades the performance of the caches and TLBs.
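The kind of quantitative classification described above can be illustrated with a toy direct-mapped cache that tags each reference with the layer (application or DSM) that issued it, and counts a miss as cross-layer interference when the line was evicted by the other layer. This is an illustrative sketch only; the thesis's actual classification is more detailed:

```python
# Toy direct-mapped cache that attributes conflict misses to the layer
# (application vs. DSM) whose earlier fill evicted the line.
# Illustrative only; the classification in the thesis is finer-grained.

LINE = 16   # bytes per cache line
SETS = 4    # number of direct-mapped sets

class Cache:
    def __init__(self):
        self.line = {}     # set index -> (tag, layer that filled it)
        self.victim = {}   # set index -> (evicted tag, evictor layer)
        self.stats = {"hit": 0, "miss": 0, "interference": 0}

    def access(self, addr, layer):
        s = (addr // LINE) % SETS
        tag = addr // (LINE * SETS)
        cur = self.line.get(s)
        if cur and cur[0] == tag:
            self.stats["hit"] += 1
            return
        self.stats["miss"] += 1
        vic = self.victim.get(s)
        # Interference: the line we need was evicted by the *other* layer.
        if vic and vic[0] == tag and vic[1] != layer:
            self.stats["interference"] += 1
        if cur:
            self.victim[s] = (cur[0], layer)  # record who evicted what
        self.line[s] = (tag, layer)

c = Cache()
c.access(0, "app")     # compulsory miss: application brings the line in
c.access(256, "dsm")   # DSM data maps to the same set and evicts it
c.access(0, "app")     # application miss caused by the DSM eviction
print(c.stats)         # {'hit': 0, 'miss': 3, 'interference': 1}
```

Symmetrically, a hit on a line that the other layer brought in would count as cross-layer cooperation; extending the sketch to track that case is straightforward.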