MSc(Engg) Thesis, Department of Computer Science and Automation,
Indian Institute of Science, Bangalore, India, April 2004.
World Wide Web has been growing at a phenomenal rate since its inception. This exponential growth of WWW has sparked many research activities in improving its performance. In particular, caching of web objects is done at various levels of hierarchy to reduce both the average response time for a web request and the network traffic. Effective cache replacement policies, which help achieve the full benefits of caches, and good prefetching schemes, which can predict future accesses and can bring those pages into the cache, can further improve the performance benefits of caches. This thesis proposes (i) a simple and effective cache replacement policy and (ii) a dynamic prefetching scheme for web caches.
In this thesis we propose a replacement policy, called Simple Equation (SE), for server caches and proxy caches. This replacement policy associates a benefit Value or bValue with each object present in cache. The bValue is based on normalized weight, recency and frequency of access to the object. When one of the existing objects in the cache needs to be replaced, then the object with the lowest bValue is chosen for replacement. We have evaluated the above mentioned replacement policy using server traces and proxy traces. Our results show that the performance of the proposed SE replacement policy is comparable to existing LRU (Least Recently Used) and LUV (Least Unified Value) replacement policies in terms of Hit Ratio, Byte Hit Ratio, and Delay Saving Ratio. The SE policy has an efficient and simple implementation and can be easily incorporated in any existing proxy server or web server. Further, this policy has a low space complexity (O(1) per object) and a time complexity of O(log_2 n) and does not require any parameter tuning or a priori trace analysis.
In this thesis we also propose a prefetching scheme for web servers refered to as DyRiMP a Dynamic Restricted Markov Prefetch scheme. Our prefetch scheme is a restricted form of All 4th order Markov model. It is a dynamic scheme and uses a limited amount of memory. The proposed DyRiMP prediction scheme maintains certain data structures (relating to the underlying Markov model) which are dynamically updated taking into account the changes in user surfing patterns and the web site. Hence this prediction scheme does not require separate training period. We have evaluated the proposed dynamic prediction scheme for server traces from World Cup '98. Our results indicate that the DyRiMP prediction scheme exhibits good prediction accuracy and reasonable coverage. Further, we have analyzed the impact of prefetching on the cache performance. We observe that the prediction schemes improve various performance measures such as Hit Ratio, Byte Hit Ratio, and Delay Saving Ratio.