Next: Greedy Dual-Size Algorithm Up: Cache Replacement Policies Previous: Type of document

Types of Policies

We describe cache replacement algorithms proposed in recent studies, which attempt to minimize various cost metrics, such as miss ratio, byte miss ratio, average latency, and total cost. Below we give a brief description of all of them. In describing the various algorithms, it is convenient to view each request for a document as being satisfied in the following way: the algorithm brings the newly requested document into the cache and then evicts documents until the capacity of the cache is no longer exceeded. Algorithms are then distinguished by how they choose which documents to evict. This view allows for the possibility that the requested document itself may be evicted upon its arrival into the cache, which means it replaces no other document in the cache.

1.

Least-Recently-Used (LRU) : evicts the document which was requested the least recently.

2.

Least-Frequently-Used (LFU) : evicts the document which is accessed least frequently.

3.

Size : evicts the largest document.

4.

LRU-Threshold : is the same as LRU, except documents larger than a certain threshold size are never cached;

5.

Log(Size)+LRU : evicts the document who has the largest log(size) and is the least recently used document among all documents with the same log(size).

6.

Hyper-G : is a refinement of LFU with last access time and size considerations;

7.

Pitkow/Recker : removes the least-recently-used document, except if all documents are accessed today, in which case the largest one is removed;

8.

Lowest-Latency-First : tries to minimize average latency by removing the document with the lowest download latency first;

9.

Hybrid : is aimed at reducing the total latency. A function is computed for each document which is designed to capture the utility of retaining a given document in the cache. The document with the smallest function value is then evicted. The function for a document p located at server s depends on the following parameters: c_s, the time to connect with server s, b_s the bandwidth to server s, n_p the number of times p has been requested since it was brought into the cache, and z_p, the size (in bytes) of document p. The function is defined as:

$\begin{displaymath}\frac{(c_s + \frac{W_b}{b_s})(n_p)^{W_n}}{Z_p} \end{displaymath}$

where W_b and W_n are constants. Estimates for c_s and b_s are based on the the times to fetch documents from servers in the recent past.

10.

Lowest Relative Value (LRV) : includes the cost and size of a document in the calculation of a value that estimates the utility of keeping a document in the cache. The algorithm evicts the document with the lowest value. The calculation of the value is based on extensive empirical analysis of trace data. For a given i, let P_i denote the probability that a document is requested i + 1 times given that it is requested i times. P_i is estimated in an online manner by taking the ratio D_i+1/D_i, where D_i is the total number of documents seen so far which have been requested at least i times in the trace. P_i(s) is the same as P_i except the value is determined by restricting the count only to pages of size s. Furthermore, let 1- D(t) be the probability that a page is requested again as a function of the time (in seconds) since its last request t; D(t) is estimated as

$\begin{displaymath}D(t) = 0.035 \log(t + 1) + 0.45 (1-e^{\frac{-t}{2e\sigma}}) \end{displaymath}$

Then for a particular document d of size s and cost c, if the last request to d is the i'th request to it, and the last request was made t seconds ago, d's value in LRV is calculated as:

$V(i,t,s)= \begin{cases} P_1(s)(1-D(t))c/s \hspace{15pt} if i=1 &\\ P_i(1-D(t))c/s \hspace{15pt} otherwise\\ \end{cases}$

Among all documents, LRV evicts the one with the lowest value. Thus, LRV takes into account locality, cost and size of a document.

11.

Greedy Dual-Size : It assigns a value H with each page, which is initially the cost in bringing the page H values are altered as newer pages are inserted into the cache. Every time a page that is cached is accessed it's H value is refreshed to its original value. The algorithm selects the document with the lowest H value for replacement. The algorithm integrates the locality and cost concerns in a seamless fashion.

We will now study an algorithm which give optimal results.

Next: Greedy Dual-Size Algorithm Up: Cache Replacement Policies Previous: Type of document

Anil Gracias
2001-01-18