Monday, 23 May 2011

2011 GBIF Registry Refactoring

For the past couple of months, I have been working closely with another GBIF developer (and fellow blog writer), Federico Mendez, on development tasks for the GBIF Registry application. This post provides an overview of that work.

First, I would like to explain the nuts and bolts of the current Registry application (the one online), and then the additions and modifications it has "suffered" during 2011 (these modifications have not been deployed yet). As stated in the blog post The evolution of the GBIF Registry, in 2010 the Registry entered a new stage in its development by moving to a single DB, an enhanced web service API, and a web user interface. On top of this, an admin-only web interface was created so that we could curate the data internally at the Secretariat.

Hibernate was chosen as the persistence framework, and the Data-Access-Object (DAO) classes were coded with the HQL necessary to provide an interface to the Hibernate persistence mechanism. The Business tier consisted of several Manager classes that relied on the DAOs to get the required data. These Managers were also responsible for populating the Data-Transfer-Objects (DTOs) so that they could be passed to the Presentation tier. This last tier made use of plain JavaServer Pages (JSPs), along with jQuery, Ajax and CSS, among others. Then, at the start of 2011, a decision was made to improve several aspects of the application's underlying implementation:

  1. Use of the MyBatis data mapper framework. This involved walking away from Hibernate's Object-Relational Mapping (ORM) approach. Our use of Hibernate relied on HQL, which adds extra latency when HQL is converted to SQL; in MyBatis we use mapped SQL statements directly, which makes DB access quicker. (I will share some benchmarks in my next blog post to justify this remark.)

  2. We found that the DTO pattern was overkill for an application that didn't have much complexity at the model level. We could trim code complexity by passing the model objects straight to the presentation tier. So we did, and all DTOFactories and DTO objects are gone.

  3. Several codebase improvements were introduced, mainly by Federico, cutting down a large number of lines of code and making it easier to add new functionality (e.g. through heavy use of Java generics).

  4. At the web service level, the Struts2 REST plugin was replaced by the Jersey library. I personally found the Struts2 REST plugin's documentation lacking (as of a year ago), so the Registry's use of it was somewhat ad hoc. My next blog post will explain this decision in more detail.

  5. We now use the Guice dependency injection framework. Previously we relied on Spring for this. Injections are also now declared through annotations; with Spring we were using XML-based injection.

  6. The Registry project is now divided into different libraries. In particular: 
    • registry-core: Business & persistence logic
    • registry-web: Everything related to the web application (Struts2)
    • registry-ws: All the web service stuff
    • There are also some libraries Federico has created to manage the interaction between the Registry and the technical installations (DiGIR, Tapir, BioCase, etc.) of the publishers sharing data with GBIF. These libraries are extremely important, as they are the ones that keep the Registry up to date.
(2011 refactoring)
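
Points 2 and 3 above can be pictured with a small sketch. This is not the Registry's actual code: the names Organisation, Dao, InMemoryDao and Manager are hypothetical, and an in-memory map stands in for the MyBatis-backed persistence layer. It only shows the general idea of a generic DAO and manager that hand model objects straight to the caller, with no DTO conversion step in between.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model class; the real Registry model is not shown in this post.
class Organisation {
    private final int id;
    private final String name;

    Organisation(int id, String name) {
        this.id = id;
        this.name = name;
    }

    int getId() { return id; }
    String getName() { return name; }
}

// Generic persistence contract: one interface serves every model type.
interface Dao<T> {
    void save(int id, T entity);
    T get(int id);
    List<T> list();
}

// In-memory stand-in (an assumption) for a MyBatis-backed implementation.
class InMemoryDao<T> implements Dao<T> {
    private final Map<Integer, T> rows = new LinkedHashMap<>();

    public void save(int id, T entity) { rows.put(id, entity); }
    public T get(int id) { return rows.get(id); }
    public List<T> list() { return new ArrayList<>(rows.values()); }
}

// Business tier: returns model objects directly, without building DTOs.
class Manager<T> {
    private final Dao<T> dao;

    Manager(Dao<T> dao) { this.dao = dao; }

    T find(int id) { return dao.get(id); }
    List<T> findAll() { return dao.list(); }
}

class RegistrySketch {
    public static void main(String[] args) {
        Dao<Organisation> dao = new InMemoryDao<>();
        dao.save(1, new Organisation(1, "Example Publisher"));

        Manager<Organisation> manager = new Manager<>(dao);
        // The presentation tier would read the model object as-is.
        System.out.println(manager.find(1).getName());
    }
}
```

Because Dao and Manager are parameterised by type, supporting another model class in this sketch means writing only that class, not a new DAO, manager and DTO for it.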

I must emphasize again that these changes are not yet deployed. This is an ongoing project, but if you are interested in the progress being made, please feel free to visit the project's site. Also, these changes won't affect the current web services API or the DB structure; they only improve the underlying codebase.


  1. Thanks for this post Jose, very nice.

    And when you put it in perspective with the other post about the registry, it's nice to see the evolution from a huge pile of "J2EE solutions" with 327 abstraction layers to a simple development that really fits the specific needs... I'm sure it's the best way to go. Simple is beautiful :)

    REWORK, by the creators of Ruby on Rails, is a very pragmatic, inspiring and refreshing must-read about that kind of approach!

  2. Thanks for the comments. I'm quite happy to know the post was informative. I agree: the simpler, the better. It just makes everybody's job a bit easier.

    I read the REWORK excerpt, quite refreshing indeed. I guess it won't hurt to get the book!