Monday, 2 May 2011

GBIF Data Portal

The current GBIF Data Portal was designed and implemented in 2005/2006, around the time I first joined the GBIF Secretariat in Copenhagen. Although I am not a developer myself, I have been involved with the Data Portal for a long time, so I thought I would take the opportunity to summarise some of the components discussed in other posts here, looking at them from the perspective of the Data Portal.

The GBIF Data Portal has been in operation more or less in its current form since mid-2007. From the time it was designed, the Portal's focus has been on providing discovery of and access to primary species occurrence data (specimens in museums, observations in the field, culture strains and others). Since the launch, bug fixes and some minor changes have been made, but further development stopped due to new priorities. We did receive a lot of input on data content and functionality, though, both from data publishers and data users, and also through a number of reports and analyses.

Towards the end of 2010, a new development phase started, initiating version 2 of the GBIF Data Portal. This was the time to start taking care of all the known shortcomings and improvement requests, e.g. a more robust and reliable backbone taxonomy, improved data quality, better attribution of contributors, and others. However, this is not just a matter of adding some data or changing the user interface: many of those points first require considerable reworking of internal processing and workflows between the Data Portal and related components, which are blogged about in other contributions here:
  • Quicker indexing and more frequent rollovers (publication cycles) from the non-public indexing database to the public web portal can only be achieved through a complete reworking of the rollover processing workflow.
  • A reliable taxonomic backbone required a review and re-implementation of name parsing routines, integrating lookup services, and following that, a complete regeneration of the taxonomic backbone.
  • The demand for better attribution of data owners and service providers can only be met after having moved on to a new registry that better models the GBIF network structure, players and interactions. This is especially the case where datasets are aggregated or hosted, and both the owning and the service-providing institution need to receive proper credit for their contributions.
  • Extended and improved metadata are needed to assess the suitability of a dataset for specific applications (e.g. modelling), and to allow discovery of collections that are not digitised or not published.
In 2011, GBIF Data Portal development focuses on consolidating and integrating these reworked components, and on including both names (checklist) and metadata sources in the search functionality. The resulting changes to the Portal user interface are quite fundamental. Together with other known and future requirements on user interface functionality, this means the time has now come to replace the old Portal code base. At present, we are working with an external team to develop wireframes for key Portal pages, based on functionality requests from GBIFS regarding the integration of the new data areas and following evaluation of a number of sources (task group reports, reviews, participant reports etc.). Those wireframes will aid further discussions on functionality starting from July, and will also form the basis for implementation in 2011 and beyond. Once there is a public version available to look at, we will give an update.


  1. That's very good news, and I'm glad to hear you're working with Vizzuality on this matter; they always do a great job IMHO.

    Hope there will be a few "public prototypes" or wireframes of this, UI stuff is very hard to discuss without getting your hands on it :)

  2. There will be within a few weeks, and we will do all we can to develop them publicly in a dev environment to ensure we get as much real user feedback as possible.