Caching may operate anywhere between the application level (the Web browser) and the remote HTTP server. Roughly, cache policies can be classified by the level at which they operate: client level, network level, and server level. Further distinctions can be drawn within each level.
Client-level caches are the easiest to implement, and most current Web browsers have a built-in caching mechanism. There are two basic variants of this approach: persistent caches, which preserve the stored files from one session to the next (implemented in Netscape), and non-persistent caches, which deallocate the cache at the end of each session (Mosaic has only this form of cache). Client-level caches are not very efficient, as the cached documents cannot be shared with other users.
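The difference between the two variants can be illustrated with a minimal Python sketch. It is not how Netscape or Mosaic implement their caches; the class name ClientCache, the JSON file used for persistence, and the fake_fetch stand-in for an HTTP request are all assumptions made for illustration.

```python
import json
import os


class ClientCache:
    """Toy client-level cache keyed by URL (hypothetical, for illustration).

    If `path` is given, the cache is persistent: entries are loaded from that
    file at start-up and written back by `close()`. With `path=None` the cache
    is non-persistent and its contents vanish when the session ends.
    """

    def __init__(self, path=None):
        self.path = path
        self.entries = {}
        if path is not None and os.path.exists(path):
            with open(path, "r", encoding="utf-8") as f:
                self.entries = json.load(f)

    def get(self, url, fetch):
        """Return the cached body for `url`, calling `fetch(url)` on a miss."""
        if url not in self.entries:
            self.entries[url] = fetch(url)
        return self.entries[url]

    def close(self):
        """End the session; only a persistent cache keeps its entries."""
        if self.path is not None:
            with open(self.path, "w", encoding="utf-8") as f:
                json.dump(self.entries, f)
        self.entries = {}


def run_two_sessions(path):
    """Request the same URL in two sessions; return the number of fetches."""
    fetches = []

    def fake_fetch(url):  # stand-in for a real HTTP request
        fetches.append(url)
        return "body of " + url

    for _ in range(2):
        cache = ClientCache(path)
        cache.get("http://example.org/index.html", fake_fetch)
        cache.get("http://example.org/index.html", fake_fetch)
        cache.close()
    return len(fetches)
```

Calling run_two_sessions with a file path models a persistent cache, so the document is fetched only in the first session; calling it with None models a non-persistent cache, so each session fetches the document again.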
At the network level, one can distinguish between LAN caches and proxy caches. A LAN cache consists of storage managed jointly by the clients of a single LAN; essentially, it is a distributed cache. Proxy caching is the most widespread form of caching. Proxy servers were originally developed for use in Internet firewalls, and they turned out to be a very good place for caching. Studies have shown that small groups of users frequently tend to access the same information; therefore caching in the proxy server should be very effective. Although the hit rate has been shown to be not much higher than that of client-level caches, the storage requirements are much smaller, because a shared document is stored once rather than in every client's cache. Proxy caching has other advantages, too: it can give a fast response without requiring high bandwidth, and it can be used to screen inappropriate content (if for no other reason, at least to decrease the load on the network).
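The storage argument can be made concrete with a small Python sketch. It assumes unbounded caches and documents that never expire, so the number of origin fetches equals the number of copies stored; the function names and the sample request trace are hypothetical.

```python
def origin_fetches_private(requests):
    """Each user has a private client-level cache.

    A document is fetched (and stored) once per user who asks for it,
    so the result is the number of distinct (user, url) pairs.
    """
    return len({(user, url) for user, url in requests})


def origin_fetches_shared(requests):
    """All users go through one shared proxy cache.

    A document is fetched (and stored) once, no matter how many users
    ask for it, so the result is the number of distinct URLs.
    """
    return len({url for _, url in requests})


# Hypothetical trace: three users on one network requesting two documents.
SAMPLE_REQUESTS = [
    ("alice", "/a"),
    ("bob", "/a"),
    ("alice", "/b"),
    ("carol", "/a"),
    ("bob", "/b"),
    ("alice", "/a"),
]
```

For SAMPLE_REQUESTS the private caches hold five copies in total, while the shared proxy cache holds two.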
Finally, a cache can also be run on the Web server itself, in which case it is sometimes called an accelerator. In particular, one way of reducing latency at the server level is to keep the most frequently requested files in main memory. Because file system calls are expensive, on a fast network (FDDI, ATM) the time to receive a document from a local Web server is dominated by the server's latency; main-memory caching is therefore likely to provide good performance.
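One possible way to keep the most frequently requested files in main memory is sketched below in Python. The class HotFileCache, its file-count capacity, and the admission rule (a file displaces the coldest resident file only once it has been requested more often) are assumptions for illustration, not a description of any particular server.

```python
from collections import Counter


class HotFileCache:
    """Keep the bodies of the most frequently requested files in memory."""

    def __init__(self, capacity, read_file):
        self.capacity = capacity      # maximum number of files held in memory
        self.read_file = read_file    # slow path: callable(path) -> bytes
        self.hits = Counter()         # request count per path
        self.memory = {}              # path -> file contents

    def get(self, path):
        """Return the file body, from memory if resident, else from disk."""
        self.hits[path] += 1
        if path in self.memory:
            return self.memory[path]  # served from main memory
        body = self.read_file(path)   # served from the file system
        self._maybe_admit(path, body)
        return body

    def _maybe_admit(self, path, body):
        """Store `body` in memory if there is room or it is hot enough."""
        if self.capacity <= 0:
            return
        if len(self.memory) < self.capacity:
            self.memory[path] = body
            return
        # Replace the coldest resident file only if the newcomer is hotter.
        coldest = min(self.memory, key=lambda p: self.hits[p])
        if self.hits[path] > self.hits[coldest]:
            del self.memory[coldest]
            self.memory[path] = body
```

In a real server read_file would open and read the file from disk; a production design would also bound memory by bytes rather than by file count and age out stale request counts.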
These levels are not mutually exclusive. They can be combined so that they complement one another and yield good overall caching performance.