Friday, December 30, 2016

Setting up a dev environment for Apache Bloodhound with IntelliJ PyCharm

For developing any web application with a Python-based backend I would recommend IntelliJ's PyCharm IDE. It provides facilities such as jumping to field/class definitions, extracting methods, and renaming variables. It also automatically infers types and provides intelligent code completion. And the most amazing thing about PyCharm is its debugger, which is integrated into the IDE.

For new contributors to Apache Bloodhound, setting up the IDE is a pretty straightforward task.

Before setting up the IDE, you need to prepare the basic environment for Apache Bloodhound by following the installation guide.

After checking out the project code and creating the virtual environment, start PyCharm and follow these steps to set up the dev environment.

1. Open the project code in PyCharm. From the `File` menu select `Open` and browse through the IDE's file browser to select the base directory that contains the Bloodhound code.

2. In the IDE preferences, set up a local interpreter and point it to the Python executable in the Bloodhound environment.





The local interpreter should point to the Python executable at,
<bloodhound-base-dir>/installer/<environment-name>/bin/python
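
To verify the interpreter is wired up correctly, here is a quick check you can run from PyCharm's Python console (a simple sketch; the expected path depends on your environment name):

# The printed path should point into
# <bloodhound-base-dir>/installer/<environment-name>/bin/python
import sys
print(sys.executable)
print(sys.version)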

3. Finally, create a new run configuration in PyCharm.

Add a new `Python` run configuration.

Add the following parameters,

Script: <bloodhound-base-dir>/trac/trac/web/standalone.py
Script Parameters: <bloodhound-base-dir>/installer/bloodhound/environments/main --port=8000
Python Interpreter: Select the added local Python interpreter from the list


Save this configuration and you are good to write and debug Apache Bloodhound code with IntelliJ PyCharm.

Saturday, December 12, 2015

Why AdroitLogic AS2 Gateway/Station

Why AS2?


There are two popular specifications published for business-to-business communication over a network.

1. File Transfer Protocol (FTP) is a standard network protocol used to transfer computer files from one host to another over a TCP-based network such as the Internet (RFC 959).

2. Applicability Statement 2 (AS2) is a specification for transporting data securely and reliably over the Internet (RFC 4130).

To compare the above two protocols we have to consider the following key factors to determine which one works best in this domain.

The first factor I'm going to compare is security. It is really important to transfer enterprise data securely from one endpoint to another. The plain FTP protocol does not support security, but enhanced versions of FTP, commonly known as SFTP (Secure File Transfer Protocol), do. By using SFTP you can secure your business data by encrypting the complete communication channel, and through that channel you can send and receive your business data without worrying about securing the data itself. But adding this additional security on top of plain FTP increases the cost. On the other hand, the AS2 protocol supports security by securing the data itself rather than the transport channel. To do that, AS2 supports encryption of the data for confidentiality, signing of the data for validation, and hashing to ensure the data has not been changed in transit. If you want additional security you can also encrypt the transfer channel with SSL on top of the payload encryption.

The second factor I'm going to discuss here is non-repudiation: the ability to verify the authenticity of the sender's signature on the data at the receiver's end. Unfortunately, neither FTP nor any related protocol supports a non-repudiation mechanism. The AS2 protocol uses digital certificates to ensure messages are securely transported to the intended business partner.

Another important feature required in B2B communication is knowing whether messages got where they were intended to go, and whether they were successfully decrypted and verified at the receiver's end. FTP partially supports this by reporting back to the sender the number of bytes received, but there is no guarantee that the data was processed and verified without an issue. AS2, however, uses a Message Disposition Notification, commonly known as the MDN, which contains the details of processing at the receiver's end and is sent back to the original sender.
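
The MDN typically includes a message integrity check (MIC) over the received content, commonly a base64-encoded SHA-1 digest. As a rough sketch of how such a check can be computed (details vary between AS2 implementations):

import base64
import hashlib

def received_content_mic(payload):
    # Base64-encoded SHA-1 digest of the content, as reported in an MDN.
    digest = hashlib.sha1(payload).digest()
    return "%s, sha1" % base64.b64encode(digest).decode("ascii")

sent_mic = received_content_mic(b"EDI payload bytes")
# Compare sent_mic against the MIC value the partner's MDN reports back;
# a mismatch means the content was altered in transit.
print(sent_mic)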

The final factor I'm going to discuss here is the cost of using these two protocols for your B2B communication. FTP is very popular, so an FTP gateway can be set up quickly and its administrative overhead is low; the cost increases only as you add the extra features mentioned in the sections above. AS2 B2B communication, on the other hand, requires specific software and technical expertise, which means the cost of an AS2 gateway is high compared to an FTP gateway.

Why AdroitLogic AS2 Gateway/Station?



From the above section you will understand that the AS2 protocol is a far better fit than FTP for your B2B communication. The only issues with AS2 are the cost of setting it up and the technical expertise required to maintain and manage the service. So we at AdroitLogic provide you with a perfect solution. If you run a small-scale business that can't afford AS2, you can register for our hosted AS2Gateway service at http://free.as2gateway.org/, which is completely free to use. There are some limits on the number of messages you can send and receive per month, but you can try it out free of charge.

The solution we recommend for enterprise-grade customers is hosted at http://as2gateway.org/, where we guarantee 24x7 uptime and many other additional features. You can find the differences on our pricing page at http://as2integration.org/display/AS2/AS2Gateway+Features+and+Pricing+Model. To guide you in using these portals there are many published guides; I recommend the white paper on our company web page and this blog post.

Beyond these two solutions, we now provide an on-site deployable AS2 Station solution for enterprise B2B communication, with extra features like file polling from the file system. For this solution we provide development support and, after deployment, 24x7 production support.

So I'll be back soon with the newly introduced features of our solution. Thank you!

Sunday, September 13, 2015

ACL (Access Control List) is one of the functionalities that is not widely used among Apache ZooKeeper users, but ZooKeeper provides a powerful API that makes it really easy to add security to clustered environments. ZooKeeper ACLs are similar in idea to Linux file system access control lists. After starting the ZooKeeper server, you can run zkCli commands to view and set up ACLs on data nodes (znodes).

To view data in a directory:
        get <path-to-directory>
To view ACL of the directory:
        getAcl <path-to-directory>
To authenticate a user:
        addauth <scheme> <username>:<password>
Following are the built-in schemes of Apache ZooKeeper (quoting from the Apache ZooKeeper docs):

* world: has a single id, anyone, that represents anyone.
* auth: doesn't use any id, represents any authenticated user.
* digest: uses a username:password string to generate MD5 hash which is then used as an ACL ID identity. Authentication is done by sending the username:password in clear text. When used in the ACL the expression will be the username:base64 encoded SHA1 password digest.
* host: uses the client host name as an ACL ID identity. The ACL expression is a hostname suffix. For example, the ACL expression host:corp.com matches the ids host:host1.corp.com and host:host2.corp.com, but not host:host1.store.com.
* ip: uses the client host IP as an ACL ID identity. The ACL expression is of the form addr/bits where the most significant bits of addr are matched against the most significant bits of the client host IP.
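
As a small illustration of the digest scheme described above, here is a Python sketch of the username:base64(sha1(username:password)) expression that ZooKeeper stores as the ACL id:

import base64
import hashlib

def digest_acl_id(username, password):
    # ZooKeeper's digest scheme stores "username:base64(sha1(username:password))".
    raw = ("%s:%s" % (username, password)).encode("utf-8")
    digest = base64.b64encode(hashlib.sha1(raw).digest()).decode("ascii")
    return "%s:%s" % (username, digest)

print(digest_acl_id("user", "pwd"))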

To set an ACL:
        setAcl <path-to-directory> <scheme>:<username>:<password>:<permission>
Following are the declared permissions of Apache ZooKeeper (quoting from the Apache ZooKeeper docs):

* CREATE: you can create a child node
* READ: you can get data from a node and list its children.
* WRITE: you can set data for a node
* DELETE: you can delete a child node
* ADMIN: you can set permissions

Ex: scheme - digest; path to directory - /zookeeper/temp; username - user; password - pwd;
addauth digest user:pwd
setAcl /zookeeper/temp auth:user:pwd:crw

You can also use the Java API provided by Apache ZooKeeper to implement this ACL within your code. I'll write a blog post soon to guide you on how to do that.
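
In the meantime, here is a rough equivalent of the zkCli session above from Python, using the third-party kazoo client (my choice for illustration; the original post discusses the Java API):

from kazoo.client import KazooClient
from kazoo.security import make_digest_acl

zk = KazooClient(hosts="127.0.0.1:2181")
zk.start()

# Equivalent of: addauth digest user:pwd
zk.add_auth("digest", "user:pwd")

# Equivalent of setting a digest ACL with create/read/write ("crw")
acl = make_digest_acl("user", "pwd", create=True, read=True, write=True)
zk.set_acls("/zookeeper/temp", [acl])

print(zk.get_acls("/zookeeper/temp"))
zk.stop()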

Friday, May 8, 2015

IntelliJ IDEA: How to restore default settings [Ubuntu]

Recently my IDE started to act weird. There were issues like not being able to use keyboard shortcuts (i.e., I couldn't use at least two keys at once). So the IDE almost acted like the vi editor ;)

Anyhow, I solved this issue by restoring the IDE's default settings.

You can easily do it by deleting the current configuration directories with the following commands (replace XX with your IDEA version).

rm -rf ~/.IntelliJIdeaXX/config        (lets you reconfigure user-specific settings.)
rm -rf ~/.IntelliJIdeaXX/system        (lets you reconfigure IntelliJ IDEA data caches.)

Tuesday, March 18, 2014

Client side cache controlling - Content Based

The modern-day developer has a wide variety of techniques and technologies available to improve application performance and end-user experience. One of the most frequently overlooked technologies is the HTTP cache. By using the HTTP cache, applications benefit greatly through improved response times and reduced server load.

HTTP caching techniques are always associated with the client side. The high-level view of the client-side caching mechanism is depicted in the following diagram.


(1) When the client application requires a service from a remote server, it will first search through its cache storage for a similar request (the client cache is a mapping between requests and the responses it has received). If the cache contains a similar request, the client application needs to know whether it holds the most recent response generated by the server.

(2) So the client application will send the request to the server along with a unique identifier mapped to the response that it already has.

(3) The server will then look at the identifier and determine whether the response the client has is still valid. If it is, the server will just send back a response with the HTTP status code '304', which signifies 'Not Modified', without any payload.

(4) Then the client will be able to use its cached copy as the correct response to its request.

The cache control mechanism changes with the identifier that the client uses to communicate with the server. There are two major cache controlling mechanisms. Those are,
  1. Time based caching
  2. Content based caching
In time-based caching the client uses the HTTP header 'If-Modified-Since' and the server uses the header 'Last-Modified'. The server always sends a timestamp in the Last-Modified field of the HTTP response header. So when the client needs to know whether the response it currently holds is outdated, it simply sends the request with that timestamp in the If-Modified-Since field of the request's HTTP header.
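
Here is a minimal client-side sketch of this time-based handshake, assuming the third-party requests library and a server that sends Last-Modified (the URL and cache structure are illustrative):

import requests

cache = {}  # maps url -> (last_modified, body)

def fetch(url):
    headers = {}
    if url in cache:
        # Ask the server to reply 304 if nothing changed since our copy.
        headers["If-Modified-Since"] = cache[url][0]
    r = requests.get(url, headers=headers)
    if r.status_code == 304:
        return cache[url][1]  # cached copy is still current
    if "Last-Modified" in r.headers:
        cache[url] = (r.headers["Last-Modified"], r.content)
    return r.content

body = fetch("http://example.com/resource")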

The following section will describe the content based caching mechanism in detail.

Scenario 01:


(1) The client will search its cache storage for a response that is mapped to the request it has at the moment.

(2) If there is no entry in the cache that maps to the relevant request, it is a cache miss.

(3) So the client will send the request to the server without any cache control headers.

(4) The server will process the request and send back the response with a unique identifier for that response (in most cases this will be a hash of the response message content; the MD5 message digest algorithm is commonly used to generate the hash value). The unique identifier is set in the HTTP header field 'ETag', so the value is known as the ETag value.

(5) After the client receives the response from the server, it will save the response and the ETag value in its cache storage, mapped to the relevant request.

Scenario 02:


(1) The client will search for a response in its cache storage.

(2) This time it finds a matching response for the request.

(3) The client will send the request to the server with the ETag value set in the HTTP header field 'If-None-Match'.

(4) The server will process the request and generate the response. Then it will compute the hash value of the response.
    (i) If that value matches the value in the request's If-None-Match field, the response has not been modified since. So the server will send the response with the HTTP status code 304 and without a payload.

    (ii) If that value does not match the value in the request's If-None-Match field, the response has been modified. So the server will send the updated response with its new hash value in the ETag field of the response.

(5) If the client receives a message with status code 304 it can use the appropriate response from its cache storage. And if it receives a message with status code 200 it has to update its cache storage with the new values.
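
To make the server side of these two scenarios concrete, here is a minimal sketch in Python (the function names and dict-based interface are mine, for illustration):

import hashlib

def make_etag(body):
    # Content-based identifier: an MD5 hash of the response body.
    return '"%s"' % hashlib.md5(body).hexdigest()

def respond(request_headers, body):
    """Return (status, headers, payload) for the ETag handshake."""
    etag = make_etag(body)
    if request_headers.get("If-None-Match") == etag:
        # Scenario 02 (i): client's copy is current -> 304, no payload.
        return 304, {"ETag": etag}, b""
    # Scenario 01 / Scenario 02 (ii): send the content with its (new) ETag.
    return 200, {"ETag": etag}, body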


Friday, March 14, 2014

Apache Bloodhound : Batch create tickets from wiki list


Abstract:

Apache Bloodhound is a software development collaboration tool, including issue tracking, a wiki and repository browsing. A major part of issue tracking is creating tickets for issues. Currently Apache Bloodhound provides a method for creating tickets by filling in a form with the relevant information. But since that only allows adding one ticket at a time, creating a large number of tickets can be a really tedious process. So the major idea of this project is to implement functionality that makes it possible for users to batch-create tickets by converting wiki-formatted tables into sets of tickets.

Idea


Built on top of Trac, Apache Bloodhound is a software development collaboration tool, including issue tracking, a wiki and repository browsing (see: http://bloodhound.apache.org). Bloodhound extends Trac (http://trac.edgewall.org/) with multiple product support, advanced search functionality, ticket relations, a simpler installer and a sleeker user interface. Apache Bloodhound recently graduated from the Apache Incubator as a stand-alone Apache project.

Apache Bloodhound distinguishes itself by providing a significantly better user experience. So while improving the core functionalities, maintaining and improving the current user experience is really important for the project's development. Completing this project will lift Apache Bloodhound's user experience to a new level.

Apache Bloodhound already provides a method for users to track issues through its ticket-creation functionality. It is a dropdown form containing some of the relevant fields to be filled in and saved to the backend database. If the user needs to fill in one or more of the other fields that do not appear in the dropdown, he is redirected to a new page containing the complete form of relevant fields. If the user just wants to add one ticket at a time this method works fine. But if the user wants to create a large number of tickets, he has to go through this process over and over again, which is a really exhausting experience. As I mentioned earlier, one of the major strengths of Apache Bloodhound is its great user experience. So it is really necessary to provide a separate method that makes this process easier for users.
So the proposed idea is to provide functionality for users to batch-create tickets by simply converting an existing wiki table into tickets. That is, users will be able to create a 'ticket table' with a simple macro.

The format of the macro will be:
[[TicketTable(numberOfTickets)]]
(Figure 01)

Figure 01
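
To give a feel for the implementation, here is a rough sketch of such a macro on top of Trac's WikiMacroBase (the field set, form action and markup are illustrative assumptions, not the final design):

from genshi.core import Markup
from trac.wiki.macros import WikiMacroBase

class TicketTableMacro(WikiMacroBase):
    """Render [[TicketTable(n)]] as an empty n-row table of ticket fields."""

    def expand_macro(self, formatter, name, content):
        rows = int(content or '1')
        fields = ['summary', 'description', 'priority']  # illustrative subset
        head = ''.join('<th>%s</th>' % f for f in fields)
        row = '<tr>%s</tr>' % ''.join(
            '<td><input type="text" name="%s"/></td>' % f for f in fields)
        # The save button under the table would POST to a batch-create handler.
        return Markup('<form method="post" action="batch_create">'
                      '<table class="wiki"><tr>%s</tr>%s</table>'
                      '<input type="submit" value="Save"/></form>'
                      % (head, row * rows))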



As it conveys, the macro takes a user argument, "numberOfTickets", the number of tickets the user is going to add to the ticket table. The macro will then render a table with the required fields as column headers and "numberOfTickets" empty rows to be filled in by users. After filling in the table, users will be able to click on a save button positioned under the ticket table and batch-create the tickets. After that the wiki page will be modified: the table becomes a normal ticket query table where the ticket id and ticket summary are links redirecting to the appropriate tickets that have been created. (Figure 02 and Figure 03)
Figure 02

Selection_033.png
Figure 03



When retrieving the tickets it is required to identify the tickets that have just been created. The idea is to retrieve them based on creation time, and for that the TicketQuery macro will need to be extended to handle absolute timestamp values. (Figure 04)

[[Widget(TicketQuery, query="created=2007-01-01 00:00:00..2008-01-01 23:59:59", max=10, title=Batch Created Tickets)]]
Figure 04

And obviously, in an organization using Apache Bloodhound as their issue tracking tool, there will be a need to control the use of this functionality; it should not be available to each and every user in the organization, because misusing it could create a huge mess, as it is capable of creating a huge number of tickets with a single click. So the idea is to create a new permission level for this functionality. Admins of the organization will be the only ones granted this permission level, 'TICKET_BATCH_CREATE', by default, and they will be able to grant it to other users as they wish.

However, the best thing about this feature is that, as it uses simple macros within wiki syntax, users will be able to use it anywhere wiki formatting is supported. So hopefully this will be a really nice feature that increases the usability of Apache Bloodhound significantly.


Time Line


[Timeline figures]


About me

I'm Dammina Sahabandu, a third-year undergraduate at the University of Moratuwa (Department of Computer Science and Engineering), which has had the most Google Summer of Code students for an impressive 7 years in a row. Currently I'm doing my internship at WSO2, a completely open-source middleware company that has a really close relationship with the Apache Software Foundation. At WSO2 I'm working in the ESB (Enterprise Service Bus) team, which builds on Apache Synapse and Apache Axis2. So I have good experience working with open source communities, a good idea of their standards, and working with large code bases is my everyday work.
Also, all the technologies related to this project are very familiar to me, so I feel really confident and look forward to completing this project successfully.


Saturday, March 8, 2014

Introduction to Apache Bloodhound - Installation

Briefly, Apache Bloodhound is a web-based project management and bug tracking system written in Python. It is built on top of Trac (another open source web-based project management and bug tracking system, maintained by Edgewall Software) and is an Apache top-level project developed and maintained by volunteer programmers at the Apache Software Foundation.

Here in this post I'm going to describe how to install and set up Apache Bloodhound on Ubuntu using the bloodhound_setup.py Python script. [For Windows users I found this video really helpful]

Installing Prerequisites


Install Python:


Download Python from this page. The version should be between 2.6 and 3.0 (I downloaded Python 2.7.6). Extract it to a preferred location.

tar -xJf Python-2.7.6.tar.xz
(Rename the filename appropriately)


Run the following commands in the terminal to install Python (some of them may require administrator access).

./configure 
make
make test
make install

Install Setuptools:


You just need to run the following command in the terminal.

wget https://bitbucket.org/pypa/setuptools/raw/bootstrap/ez_setup.py -O - | sudo python

Install pip:


Run the following command in the terminal to install pip using the package manager.

sudo apt-get install python-pip


Install virtualenv:

Run,

pip install virtualenv

to install virtualenv. (Administrator access may be required.)

Install Database:


Although Apache Bloodhound supports many database systems, the bloodhound_setup.py installer currently only sets up either SQLite or PostgreSQL databases.

If you are planning to use Apache Bloodhound in a large production deployment it is better to install PostgreSQL; otherwise it is much easier to go with SQLite.

Install and setup SQLite:


Download an appropriate SQLite version from here (download sqlite-autoconf-3080300.tar.gz from the "Source Code" section). Extract it to a preferred location.

tar xvfz sqlite-autoconf-3080300.tar.gz

Run following commands to build and install SQLite.

cd sqlite-autoconf-3080300
./configure
make
make install

Installing Apache Bloodhound


If you have followed the steps without facing any issues, congratulations! You have installed all the required prerequisites successfully. Now the next step is to build the product.

First, check out Apache Bloodhound from the Apache Subversion server using the following command.

svn co https://svn.apache.org/repos/asf/bloodhound/trunk bloodhound

Now run the following two commands to configure and set up the environment.

cd bloodhound/installer
virtualenv --system-site-packages bloodhound

To enable the virtual environment run the following command.

source ./bloodhound/bin/activate

Now you should be able to see that the shell has activated the virtual environment, as follows.


Now, to install the required Python packages, run,

pip install -r requirements-dev.txt

Now you are at the final stage. Run the following command to set up Bloodhound.

python bloodhound_setup.py

The shell will ask the following question,

Answer n, as you installed SQLite. Then the shell will ask for an admin username and password; provide them as you prefer.

Now you are all set to run Bloodhound. Your final step is to run,

tracd ./bloodhound/environments/main --port=8000 

and check at,

http://localhost:8000/main/
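
As an optional sanity check, here is a small Python 2 sketch (matching the interpreter installed above, and assuming the default port and environment name) to confirm the server responds:

# Run while tracd is serving Bloodhound.
import urllib2

resp = urllib2.urlopen("http://localhost:8000/main/")
print(resp.getcode())   # expect 200 once Bloodhound is up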

Now you should be able to see the welcome page of Apache Bloodhound.




Enjoy!