Tuesday, March 18, 2014

Client side cache controlling - Content Based

Modern developers have a wide variety of techniques and technologies available to improve application performance and end-user experience. One of the most frequently overlooked is the HTTP cache. By using it, applications benefit from faster response times and reduced server load.

HTTP caching techniques are usually associated with the client side. A high-level view of the client-side caching mechanism is depicted in the following diagram.

(1) When the client application requires a service from a remote server, it first searches its cache storage for a matching request (the client cache is a mapping between requests and the responses received for them). If the cache contains a matching request, the client needs to know whether the cached response is still the most recent one generated by the server.

(2) So the client sends the request to the server together with a unique identifier mapped to the response it already has.

(3) The server looks at the identifier and determines whether the response the client holds is still valid. If it is, the server sends back a response with the HTTP status code 304 ('Not Modified') and no payload.

(4) The client can then use its cached copy as the correct response to its request.

The cache control mechanism depends on the identifier that the client uses to communicate with the server. There are two major cache control mechanisms:

Time based caching
Content based caching

In time based caching the client uses the HTTP header 'If-Modified-Since' and the server uses the header 'Last-Modified'. The server sends a time stamp in the Last-Modified field of the response header. When the client needs to know whether the response it currently holds is out of date, it sends the request with that time stamp in the If-Modified-Since field of the request header.
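As a rough illustration of the time based check, here is a minimal Python 3 sketch of the server-side decision. The function name respond_time_based and the sample date are assumptions made for this example; a real server would take these values from its own resource metadata.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def respond_time_based(last_modified, if_modified_since=None):
    """Return (status, headers) for a resource last changed at `last_modified`.

    `last_modified` is a UTC datetime; `if_modified_since` is the raw header
    value sent by the client, or None on a first request.
    """
    # HTTP dates have one-second resolution, so drop microseconds first.
    last_modified = last_modified.replace(microsecond=0)
    headers = {"Last-Modified": format_datetime(last_modified, usegmt=True)}
    if if_modified_since is not None:
        client_time = parsedate_to_datetime(if_modified_since)
        if last_modified <= client_time:
            return 304, headers  # cached copy is still current
    return 200, headers  # send the full response


changed_at = datetime(2014, 3, 18, 9, 30, tzinfo=timezone.utc)
status, headers = respond_time_based(changed_at)
# The client echoes the Last-Modified value back on its next request.
status_again, _ = respond_time_based(changed_at, headers["Last-Modified"])
```

The first call has no validator and yields 200; the second call carries the echoed time stamp and yields 304.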

The following section will describe the content based caching mechanism in detail.

Scenario 01:

(1) The client searches its cache storage for a response mapped to the current request.

(2) If no cache entry maps to the request, it is a cache miss.

(3) So the client will send the request to the server without any cache control headers.

(4) The server processes the request and sends back the response with a unique identifier for that response (in most cases a hash value of the response message content; the MD5 message digest algorithm is commonly used to generate it). The identifier is set in the HTTP header field 'ETag', so it is known as the ETag value.

(5) After receiving the response from the server, the client saves it together with the ETag value in its cache storage, mapped to the relevant request.
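Steps 4 and 5 of this scenario could be sketched in Python roughly as follows. The helper names (make_etag, server_first_response, store), the dictionary-based cache and the sample URL are all assumptions for illustration, with MD5 used because the text names it as the common choice.

```python
import hashlib


def make_etag(body):
    """Derive an ETag from the response body using an MD5 digest."""
    return '"' + hashlib.md5(body).hexdigest() + '"'


def server_first_response(body):
    """Step 4: a full response carrying the ETag header."""
    return 200, {"ETag": make_etag(body)}, body


client_cache = {}  # request URL -> (etag, body)


def store(url, response):
    """Step 5: remember the body and its ETag for this request."""
    status, headers, body = response
    client_cache[url] = (headers["ETag"], body)


store("/tickets/42", server_first_response(b"ticket 42: open"))
```

After the call, client_cache holds both the body and the validator the client will need for its next request to the same URL.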

Scenario 02:

(1) The client searches its cache storage for a response.

(2) This time it finds a response matching the request.

(3) The client sends the request to the server with the ETag value set in the HTTP header field 'If-None-Match'.

(4) The server will process the request and generate the response. Then it will compute the hash value of the response.

(i) If that value matches the value in the request's If-None-Match field, the response has not been modified since the client cached it. So the server sends a response with the HTTP status code 304 and no payload.

(ii) If that value does not match the value in the request's If-None-Match field, the response has been modified. So the server sends the updated response with its new hash value in the ETag field.

(5) If the client receives a message with status code 304, it can use the corresponding response from its cache storage. If it receives a message with status code 200, it has to update its cache storage with the new values.
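The revalidation flow in this scenario can be simulated in a single process. The sketch below is only an illustration: handle_request stands in for the server, fetch for the client, and both names, the dictionary cache and the sample bodies are assumptions rather than part of any real library.

```python
import hashlib


def etag_for(body):
    """Hash the response body into a quoted ETag value."""
    return '"' + hashlib.md5(body).hexdigest() + '"'


def handle_request(current_body, if_none_match=None):
    """Server side (step 4): compare the fresh hash with If-None-Match."""
    etag = etag_for(current_body)
    if if_none_match == etag:
        return 304, {"ETag": etag}, b""       # (i) not modified, no payload
    return 200, {"ETag": etag}, current_body  # (ii) modified, send new body


def fetch(cache, url, current_body):
    """Client side (steps 1-3 and 5) against the in-process 'server'."""
    cached = cache.get(url)
    validator = cached[0] if cached else None
    status, headers, body = handle_request(current_body, validator)
    if status == 304:
        return cached[1]                  # reuse the cached copy
    cache[url] = (headers["ETag"], body)  # replace the cache entry
    return body


cache = {}
first = fetch(cache, "/report", b"version 1")   # cache miss, stored
second = fetch(cache, "/report", b"version 1")  # revalidated with 304
third = fetch(cache, "/report", b"version 2")   # changed, cache updated
```

Note that the server still generates the full response before hashing it, as described in step 4, so the saving is in transferred bytes rather than in server-side processing.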

Friday, March 14, 2014

Apache Bloodhound : Batch create tickets from wiki list

Abstract:

Apache Bloodhound is a software development collaboration tool that includes issue tracking, a wiki and repository browsing. A major part of issue tracking is creating tickets for issues. Currently Apache Bloodhound lets users create a ticket by filling in a form with the relevant information. But because it only supports adding one ticket at a time, creating a large number of tickets can be a really tedious process. So the main idea of this project is to implement functionality that lets users batch-create tickets by converting WikiFormatted tables into a set of tickets.

Idea

Built on top of Trac, Apache Bloodhound is a software development collaboration tool that includes issue tracking, a wiki and repository browsing (see: http://bloodhound.apache.org). Bloodhound extends Trac (http://trac.edgewall.org/) with multiple product support, advanced search functionality, ticket relations, a simpler installer and a sleeker user interface. Apache Bloodhound recently graduated from the Apache Incubator as a stand-alone Apache project.

Apache Bloodhound sets itself apart by providing a significantly better user experience. So while improving the core functionality, maintaining and improving the current user experience is really important for the project's development. Completing this project will take Apache Bloodhound's user experience to a new level.

Apache Bloodhound already lets users track issues by creating tickets. It offers a dropdown form containing some of the relevant fields, which are filled in and saved to the backend database. If the user needs to fill in one or more fields that are not displayed in the dropdown, he is redirected to a new page containing the complete form. If the user just wants to add one ticket at a time, this method works fine. But a user who wants to add a large number of tickets has to go through this process over and over again, which is a really exhausting experience. As I mentioned earlier, one of the major strengths of Apache Bloodhound is its great user experience. So it is really necessary to provide a separate method that makes this process easier for users.

So the proposed idea is to let users batch-create tickets by simply converting an existing wiki table into tickets. That is, users will be able to create a 'ticket table' with a simple macro.

Format of the macro will be:

[[TicketTable(numberOfTickets)]]

(Figure 01)

Figure 01

As shown, the macro takes one user argument, the number of tickets that the user is going to add to the ticket table. The macro then renders a table with the required fields as column headers and "numberOfTickets" empty rows to be filled in by the user. After filling in the table, the user clicks the save button positioned under the ticket table to batch-create the tickets. The wiki page is then modified: the table becomes a normal ticket query table in which the ticket id and ticket summary are links to the tickets that have been created (Figure 02 and Figure 03).

Figure 02

Figure 03
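To make the macro argument handling concrete, here is a small stand-alone Python sketch that parses the proposed macro call and prepares the empty rows. It does not use the Trac macro API; the function name empty_ticket_rows and the FIELDS tuple are hypothetical, since the proposal does not fix the exact columns.

```python
import re

MACRO_PATTERN = re.compile(r"\[\[TicketTable\((\d+)\)\]\]")
# Hypothetical column set; the proposal does not fix the exact fields.
FIELDS = ("summary", "description", "priority")


def empty_ticket_rows(wiki_text, fields=FIELDS):
    """Return one dict of blank fields per requested ticket row."""
    match = MACRO_PATTERN.search(wiki_text)
    if match is None:
        return []
    count = int(match.group(1))
    return [dict.fromkeys(fields, "") for _ in range(count)]


rows = empty_ticket_rows("[[TicketTable(3)]]")
```

Each dictionary corresponds to one empty table row that the user would fill in before saving.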

When retrieving the tickets, the ones that were just created need to be identified. The idea is to retrieve them based on their creation time. This requires extending the TicketQuery macro to handle absolute time stamp values (Figure 04).

[[Widget(TicketQuery, query="created=2007-01-01 00:00:00..2008-01-01 23:59:59", max=10, title=Batch Created Tickets)]]

Figure 04
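A query like the one above could be generated from the batch's creation window. The following Python sketch assumes a hypothetical helper named batch_query_widget and simply formats the widget text; it does not call into Trac or Bloodhound.

```python
from datetime import datetime


def batch_query_widget(start, end, max_rows=10, title="Batch Created Tickets"):
    """Format a TicketQuery widget call limited to a creation-time window."""
    fmt = "%Y-%m-%d %H:%M:%S"
    query = "created={}..{}".format(start.strftime(fmt), end.strftime(fmt))
    return '[[Widget(TicketQuery, query="{}", max={}, title={})]]'.format(
        query, max_rows, title
    )


widget = batch_query_widget(datetime(2007, 1, 1), datetime(2008, 1, 1, 23, 59, 59))
```

With these two dates the function reproduces the widget text shown in Figure 04.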

Obviously, an organization that uses Apache Bloodhound as its issue tracking tool will need to control the use of this functionality. It should not be given to each and every user in the organization, because misusing it could lead to a huge mess: it can create a huge number of tickets with a single click. So the idea is to create a new permission level for this functionality. By default only admins of the organization will be granted this permission, 'TICKET_BATCH_CREATE', and they will be able to grant it to other users as they wish.

However, the best thing about this feature is that, because it uses simple macros that work within wiki syntax, users will be able to use it anywhere WikiFormatting is supported. So hopefully this will be a really nice feature that significantly increases the usability of Apache Bloodhound.

Time Line

About me

I’m Dammina Sahabandu, a third-year undergraduate at the University of Moratuwa (Department of Computer Science and Engineering), which has had the most Google Summer of Code students for an impressive 7 years in a row [1]. I’m currently doing my internship at WSO2, a completely open source middleware organization with a really close relationship with the Apache Software Foundation [2]. At WSO2 I work in the ESB (Enterprise Service Bus) team, whose product is built upon Apache Synapse and Apache Axis 2. So I have solid experience working with open source communities and a good idea of their standards, and working with large code bases is my everyday work.

All the technologies related to this project are also very familiar to me. So I feel really confident and am looking forward to completing this project successfully.

[1] http://google-opensource.blogspot.com/2013/07/google-summer-of-code-full-of-stats.html

[2] http://wso2.com/

Saturday, March 8, 2014

Introduction to Apache Bloodhound - Installation

Briefly, Apache Bloodhound is a web-based project management and bug tracking system written in Python. It is built on top of Trac (another open source web-based project management and bug tracking system, maintained by Edgewall Software) and is an Apache top-level project developed and maintained by volunteer programmers at the Apache Software Foundation.

In this post I'm going to describe how to install and set up Apache Bloodhound on Ubuntu using the bloodhound_setup.py Python script. [For Windows users I found this video really helpful]

Installing Prerequisites

Install Python:

Download python from this page. Version should be between 2.6 and 3.0 (I have downloaded Python 2.7.6). Extract it to a preferred location.

tar -xJf Python-2.7.6.tar.xz

(Adjust the file name to match the version you downloaded.)

Run the following commands in a terminal, from inside the extracted directory, to build and install Python. (Some of them may require administrator access.)

./configure

make

make test

make install
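Once the install finishes, it may be worth confirming that the interpreter falls in the supported range mentioned above (2.6 up to, but not including, 3.0). The following small script is one way to check; the function name is_supported is just an illustrative choice.

```python
import sys


def is_supported(version_info):
    """True for Python 2.6 up to (but not including) 3.0."""
    return (2, 6) <= tuple(version_info[:2]) < (3, 0)


if __name__ == "__main__":
    print("supported" if is_supported(sys.version_info) else "unsupported")
```

Saved to a file and run with the newly installed interpreter, it reports whether that version is usable for this setup.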

Install Setuptools:

You just need to run the following command in the terminal.

wget https://bitbucket.org/pypa/setuptools/raw/bootstrap/ez_setup.py -O - | sudo python

Install pip:

Run the following command in a terminal to install pip through the package manager.

sudo apt-get install python-pip

Install virtualenv:

Run,

pip install virtualenv

to install virtualenv. (Administrator access may be required.)

Install Database:

Although Apache Bloodhound supports many database systems, the bloodhound_setup.py installer currently sets up only SQLite or PostgreSQL databases.

If you are planning to use Apache Bloodhound for a large production deployment, it is better to install PostgreSQL; otherwise SQLite is the easier option.

Install and setup SQLite:

Download an appropriate SQLite version from here (Download sqlite-autoconf-3080300.tar.gz from "Source Code" section). Extract it to a preferred location.

tar xvfz sqlite-autoconf-3080300.tar.gz

Run the following commands to build and install SQLite.

cd sqlite-autoconf-3080300

./configure

make

make install

Installing Apache Bloodhound

If you have followed the steps without any issues, congratulations! You have installed all the required prerequisites. The next step is to build the product.

First check out Apache Bloodhound from the Apache Subversion server using the following command.

svn co https://svn.apache.org/repos/asf/bloodhound/trunk bloodhound

Now run the following two commands to create the virtual environment.

cd bloodhound/installer

virtualenv --system-site-packages bloodhound

To enable the virtual environment run the following command.

source ./bloodhound/bin/activate

The shell prompt should now show that a virtual environment is active.

Now to install the required python packages run,

pip install -r requirements-dev.txt

Now you are at the final stage. Run the following command to set up Bloodhound.

python bloodhound_setup.py

Shell will ask the following question,

Answer n, since you installed SQLite. The shell will then ask for an admin user name and password; provide whatever you prefer.

Now you are all set to run Bloodhound. Your final step is to run,

tracd ./bloodhound/environments/main --port=8000

and check at,

http://localhost:8000/main/

Now you should be able to see the welcome page of Apache Bloodhound.

Enjoy!