Tuesday, 27 September 2011

Portal v2 - There will be cake

The current GBIF data portal was started in 2007 to provide access to the network's biodiversity data - at the time that meant a federated search across 220 providers and 76 million occurrence records. While that approach has served us well over the years, there are many features that have been requested for the portal that weren't addressable in the current architecture. Combined with the fact that we're now well over 300 million occurrence records, with millions of new taxonomic records to boot, it becomes clear that a new portal is needed. After a long consultation process with the wider community the initial requirements of a new portal have been determined, and I'm pleased to report that work has officially started on its design and development.

For the last 6 months or so the development team has been working on our rollover process, registry improvements, IPT development, and assorted other tasks. The new portal marks an important milestone in our team's development, as we're now all working on the portal with as little distraction from other projects as we can manage. Obviously we're still fixing critical bugs, responding to data requests, etc., but having all of us focus on the same general task has already paid dividends in the conversations coming out of our daily scrums. Everyone being on the same page really does help.

And yes, we've been using daily stand-up meetings that we call "scrums" for several months, but the new portal marks the start of our first proper attempt at agile software development, including the proper use of scrum. Most of our team has had some experience with parts of agile techniques, so we're combining the best practices from everyone's experience to build the process that works best for us. Obviously the ideal of interchangeable people with no single expert in a given domain is rather hard for us when Tim, Markus, Kyle and Jose have worked on these things for so long and people like Lars, Federico and I are still relatively new (even though we're celebrating our one-year anniversaries at GBIF in the next few weeks!), but we're trying hard to have non-experts working with experts to share the knowledge.

In terms of managing the process, I (Oliver) am acting as Scrum Master and project lead. Andrea Hahn has worked hard at gathering our initial requirements, turning them into stories, and leading the wireframing of the new portal. As such she'll be acting as a Stakeholder to the project and helping us set priorities. As the underlying infrastructure gets built and the process continues I'm sure we'll be involving more people in the prioritization process, but for now our plates are certainly full with "plumbing". At Tim's suggestion we're using Basecamp to manage our backlog, active stories, and sprints, following the example from these guys. Our first kickoff revealed some weaknesses in mapping Basecamp to agile, and the lack of a physical storyboard makes it hard to see the big picture, but we'll start with this and re-evaluate in a little while - certainly it's more important to get the process started and determine our actual needs than to play with different tools in some kind of abstract evaluation process. Once we've ironed out the process and settled on our tools we'll also make them more visible to the outside world.

We're only now coming up on the end of our first two-week sprint, so it will take a few more iterations to really get into the flow, but so far so good, and I'll report back on our experience in a future post.

(If you didn't get it, apologies for the cake reference)


  1. I've always used Pivotal Tracker, with success. In fact, on one job where we used Basecamp for other purposes, we still used Tracker for story/iteration management.

  2. Thanks for the suggestion, Mark - I'll add it to the list for our eventual Basecamp-replacement shootout.