Tuesday, March 18, 2014

Client side cache controlling - Content Based

Developers today have a wide range of techniques and technologies for improving application performance and end-user experience. One of the most frequently overlooked is the HTTP cache. By using the HTTP cache, an application can improve its response times and reduce the load on the server.

The HTTP caching techniques discussed here are applied on the client side. The following diagram gives a high-level view of the client-side caching mechanism.


(1) When the client application requires a service from a remote server, it first searches its cache storage for a matching request. (The client cache is a mapping between requests and the responses that the client has received.) If the cache contains a matching request, the client application needs to know whether the cached response is still the most recent one generated by the server.

(2) The client application therefore sends the request to the server together with a unique identifier for the response that it already holds.

(3) The server looks at the identifier and determines whether the response that the client holds is still valid. If it is, the server sends back a response with the HTTP status code '304', which signifies 'Not Modified', and without any payload.

(4) The client can then use its cached copy as the correct response to its request.

The cache control mechanism depends on the identifier that the client uses to communicate with the server. There are two major cache control mechanisms:
  1. Time based caching
  2. Content based caching
In time based caching the client uses the HTTP header field 'If-Modified-Since' and the server uses the header field 'Last-Modified'. The server sends a time stamp in the Last-Modified field of the HTTP response header. When the client needs to know whether the response it currently holds is out of date, it simply sends the request with that time stamp in the If-Modified-Since field of the request header.
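
The server-side decision in time based caching can be sketched as below. This is only an illustration: the function name `time_based_status` is made up for this post, and the sketch assumes the server keeps the resource's modification time as a timezone-aware UTC `datetime`. Only the header names and the status codes come from HTTP itself.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def time_based_status(if_modified_since, last_modified):
    """Return 304 if the client's copy is still current, otherwise 200.

    if_modified_since: value of the request's If-Modified-Since header, or None.
    last_modified: timezone-aware UTC datetime of the resource's last change
                   (an assumption of this sketch).
    """
    if if_modified_since is None:
        return 200  # unconditional request, send the full response
    try:
        client_time = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        return 200  # unparseable date, fall back to a full response
    if client_time.tzinfo is None:
        client_time = client_time.replace(tzinfo=timezone.utc)
    # HTTP dates only have one-second resolution, so drop the microseconds.
    unchanged = last_modified.replace(microsecond=0) <= client_time
    return 304 if unchanged else 200


# The value the server would place in the Last-Modified header.
last_modified = datetime(2014, 3, 18, 10, 0, 0, tzinfo=timezone.utc)
header_value = format_datetime(last_modified, usegmt=True)
```

A client that echoes `header_value` back in If-Modified-Since gets 304 as long as the resource has not changed after that time stamp.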

The following section will describe the content based caching mechanism in detail.

Scenario 01:


(1) The client searches its cache storage for a response that is mapped to the request it is about to make.

(2) If no entry in the cache maps to that request, it is a cache miss.

(3) The client therefore sends the request to the server without any cache control headers.

(4) The server processes the request and sends back the response together with a unique identifier for that response. In most cases this is a hash value of the response message content (the MD5 message digest algorithm is commonly used to generate it). The unique identifier is set in the HTTP header field 'ETag', so the value is known as the ETag value.

(5) After the client receives the response from the server, it saves the response and the ETag value in its cache storage, mapped to the relevant request.
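
Scenario 01 can be sketched roughly as follows. The helper names (`make_etag`, `first_response`, `store`) and the use of a plain dictionary as the client cache are assumptions made for illustration, not part of any particular framework. The ETag is written as a quoted string because that is the form HTTP uses for the header value.

```python
import hashlib


def make_etag(body: bytes) -> str:
    """Build an ETag value from the MD5 digest of the response body."""
    return '"' + hashlib.md5(body).hexdigest() + '"'


def first_response(body: bytes):
    """Server side: full response for a request with no cache control headers."""
    return 200, {"ETag": make_etag(body)}, body


def store(cache: dict, request_key: str, headers: dict, body: bytes) -> None:
    """Client side: save the response and its ETag, mapped to the request."""
    cache[request_key] = (headers["ETag"], body)


# Cache miss: the client sends a plain request and stores what comes back.
client_cache = {}
status, headers, body = first_response(b"hello")
store(client_cache, "GET /greeting", headers, body)
```

After this exchange the client cache holds the body and its ETag under the key `"GET /greeting"`, ready to be used on the next request.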

Scenario 02:


(1) The client searches its cache storage for a response.

(2) This time it finds a response that matches the request.

(3) The client sends the request to the server with the ETag value set in the HTTP header field 'If-None-Match'.

(4) The server processes the request and generates the response. It then computes the hash value of the response.
    (i) If that value matches the value in the request's If-None-Match field, the response has not been modified since the client cached it. The server sends a response with the HTTP status code 304 and no payload.

    (ii) If that value does not match the value in the request's If-None-Match field, the response has been modified. The server sends the updated response with its new hash value in the ETag field of the response.

(5) If the client receives a message with status code 304, it can use the corresponding response from its cache storage. If it receives a message with status code 200, it has to update its cache storage with the new response and the new ETag value.
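
The two scenarios together can be sketched as a small round trip. Again, this is a minimal illustration under stated assumptions: `serve` and `fetch` are hypothetical names, the "network" is just a function call, and the server compares the If-None-Match value with a single ETag by exact string equality (a real server also has to handle lists of ETags and the `*` value).

```python
import hashlib


def make_etag(body: bytes) -> str:
    """Build an ETag value from the MD5 digest of the response body."""
    return '"' + hashlib.md5(body).hexdigest() + '"'


def serve(request_headers: dict, current_body: bytes):
    """Server side: generate the response, hash it, compare with If-None-Match."""
    etag = make_etag(current_body)
    if request_headers.get("If-None-Match") == etag:
        return 304, {"ETag": etag}, b""      # not modified, no payload
    return 200, {"ETag": etag}, current_body  # new or modified response


def fetch(cache: dict, key: str, current_body: bytes):
    """Client side: send a conditional request if a cached copy exists."""
    headers = {}
    if key in cache:
        headers["If-None-Match"] = cache[key][0]
    status, response_headers, body = serve(headers, current_body)
    if status == 304:
        return status, cache[key][1]          # reuse the cached copy
    cache[key] = (response_headers["ETag"], body)  # store or update the entry
    return status, body


cache = {}
first = fetch(cache, "/report", b"v1")   # cache miss      -> (200, b"v1")
second = fetch(cache, "/report", b"v1")  # unchanged       -> (304, b"v1")
third = fetch(cache, "/report", b"v2")   # content changed -> (200, b"v2")
```

Note that in this sketch the server still generates the full response before it can compute the hash, so the saving is in the payload sent over the network rather than in server-side processing.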