Friday 27 July 2018

How popular is your favorite species?

How to use

Use the box to the left to type in the species you are interested in.
Make sure to use a scientific name:
  • Aves instead of birds
  • Plantae instead of plants
  • Anura instead of frogs

Explanation of tool

This tool plots downloads through time for species or other taxonomic groups with more than 25 downloads at GBIF. Downloads at GBIF most often occur through the web interface. In a previous post, we saw that most users download data from GBIF by filtering on scientific name (aka Taxon Key). Since the GBIF index currently sits at over 1 billion records (a 400+ GB CSV), most users will simply filter by their taxonomic group of interest and then generate a download.

How to bookmark a result?

If you would like to bookmark a result or graph to share with others, you can visit the app page directly: app link. On this page the state of the app is saved inside the URL. You can also save a JPG by clicking on the little sandwich in the top right.

What counts as a download?

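For the graphs above, I decided that it would be more meaningful to roll downloads up the taxonomic tree: a download counts for the taxon that was queried and for every higher group that contains it, but not for anything below it.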
  • If a user downloaded 5 different bird species at once, this would count as 1 download for Aves and 1 download for each of the species downloaded.
  • If a user only typed Aves into the occurrence download interface and no other species, this would count as 1 download for Aves and 0 downloads for every bird species.
  • Similarly, if a user only typed the order Passeriformes into the search, this would count as 1 download for Passeriformes and 1 download for Aves (and 1 download for Animalia, etc.) but 0 downloads for all the species, families, and genera within Passeriformes.
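
The counting rule in the bullets above can be sketched in a few lines of Python. This is only an illustration, not the code behind the tool: the `PARENT` table is a tiny hand-written lineage rather than the GBIF backbone, the function names are made up, and I am assuming that intermediate ranks (genus, family) get credited the same way as the higher groups mentioned above.

```python
from collections import Counter

# Hypothetical, hand-written lineages (taxon -> parent); not read from GBIF.
PARENT = {
    "Passer domesticus": "Passer",
    "Passer": "Passeridae",
    "Passeridae": "Passeriformes",
    "Turdus merula": "Turdus",
    "Turdus": "Turdidae",
    "Turdidae": "Passeriformes",
    "Passeriformes": "Aves",
    "Aves": "Chordata",
    "Chordata": "Animalia",
}


def taxa_credited(queried):
    """Return the set of taxa that receive one download for a single request."""
    credited = set()
    for taxon in queried:
        # Walk upward from the queried taxon to the root, never downward.
        # Stop early if this taxon was already credited by this request,
        # because its ancestors were credited in the same earlier walk.
        while taxon is not None and taxon not in credited:
            credited.add(taxon)
            taxon = PARENT.get(taxon)
    return credited


def count_downloads(downloads):
    """Tally downloads per taxon; each request counts at most once per taxon."""
    counts = Counter()
    for queried in downloads:
        counts.update(taxa_credited(queried))
    return counts
```

Calling `count_downloads` with three requests, one for two bird species at once, one for Aves only, and one for Passeriformes only, credits Aves three times, Passeriformes twice, and each of the two species once. Using a set per request is what keeps a multi-species download from counting more than once for a shared ancestor.
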
It is possible, but not as easy, to get data from GBIF without generating a download. Users can stream data through the GBIF occurrence API, fetching chunks of up to 200k occurrence records without ever generating a download. We currently have no way to track data obtained this way. Presumably, the vast majority of users get their data directly through the web interface.
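
To make the API route a little more concrete, here is a rough sketch of how paged requests to the occurrence search endpoint could be built. It only constructs URLs and does not fetch anything. The `page_urls` helper is hypothetical, the taxon key 212 is used as an example key for Aves, and the page size of 300 is my assumption about the per-request limit, so check the API documentation before relying on it.

```python
from urllib.parse import urlencode

# Public GBIF occurrence search endpoint.
SEARCH_URL = "https://api.gbif.org/v1/occurrence/search"


def page_urls(taxon_key, total, page_size=300):
    """Yield search URLs that together cover the first `total` records.

    `page_size` is an assumed per-request limit, not a documented guarantee.
    """
    for offset in range(0, total, page_size):
        params = {
            "taxonKey": taxon_key,
            # The last page may be shorter than a full page.
            "limit": min(page_size, total - offset),
            "offset": offset,
        }
        yield f"{SEARCH_URL}?{urlencode(params)}"
```

Each URL returns one page of JSON results. Since no download is ever generated, none of these requests would show up in the graphs above.
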

For more technical details on this tool, you can visit my personal blog: