Occurrences at GBIF are often downloaded through the web interface or through the API (via rgbif etc.). Users can place various filters on the data in order to limit the number of records returned. As the occurrence index is currently a 447 GB CSV, most users want to use a filter.
Total monthly downloads
Here I plot the total monthly downloads for various popular filters. For the past few years, GBIF has been averaging around 10k downloads per month.
Two peaks in total downloads stand out:
The Sep 2016 peak seems to be explained by high DATASET_KEY downloads. Both the Mar 2014 and Sep 2016 peaks are well explained by the top users. Top users in this graph covers all the downloads generated by the top 3 most active users on GBIF. These users generate downloads in the 1000s, and these are most likely automated downloads generated internally.
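To make the "top users" series concrete, here is a minimal sketch of how such a series could be computed. The download log, the user names, and the function `monthly_top_user_downloads` are all hypothetical and invented for illustration; this is not the code behind the graph.

```python
from collections import Counter

# Hypothetical download log of (user, month) pairs, invented for illustration.
downloads = [
    ("user_a", "2016-09"), ("user_a", "2016-09"), ("user_a", "2016-09"),
    ("user_b", "2016-09"), ("user_b", "2016-08"),
    ("user_c", "2016-09"),
    ("user_d", "2016-08"),
]

def monthly_top_user_downloads(records, n_top=3):
    """Count downloads per month made by the n_top most active users overall."""
    # Rank users by their total number of downloads across all months.
    per_user = Counter(user for user, _ in records)
    top_users = {user for user, _ in per_user.most_common(n_top)}
    # Keep only the downloads made by those users and tally them by month.
    monthly = Counter(month for user, month in records if user in top_users)
    return top_users, dict(monthly)

# n_top=2 here only because the toy log is tiny; the post uses the top 3.
top_users, monthly = monthly_top_user_downloads(downloads, n_top=2)
print(sorted(top_users), monthly)
```

Ranking users over the whole period, rather than per month, keeps the set of top users fixed, so a peak in this series reflects a change in their activity and not a change in who counts as a top user.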
One interesting detail is that while No Filter Used is not used very often, it accounts for more than 500 billion occurrence records downloaded.
Finally, if we look at the number of unique users (un-select everything else to see in isolation), we see that the number of individuals making downloads on GBIF has been increasing steadily, with some perhaps interesting cyclical patterns. The graph below is interactive. You can see different data views by clicking on the names.
Popular filters explained
There are many ways that a user can filter data. The types and combinations of filters are almost limitless. Below I describe some of the most common filters:
1. TAXON_KEY
This is one of the most common filters users place on the GBIF occurrence index. Users can choose either one or many taxon names to filter the data, and they can choose any taxon rank they want (species, genus, family, kingdom etc.).
2. COUNTRY
Here users can return records only from a certain country. This is the country the user searched for, not the country the user is searching from.
3. HAS_GEOSPATIAL_ISSUE
Here users can specify that they want only occurrence records without an interpreted geospatial error.
4. HAS_COORDINATE
Here users can say that they want only occurrence records that have coordinates.
5. No Filter
Finally, a surprising number of users never apply any filter and instead request to download the entire occurrence index. In the overwhelming majority of cases, we have to assume these users have done this by mistake.
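The filters above can be combined in a single download request. The sketch below builds a JSON request body in the predicate style used by the GBIF occurrence download API, as I understand it. The helper names `equals` and `build_download_request` are hypothetical, and the user name, email address, taxon key and country code are placeholders. Check the field names against the current API documentation before relying on them. No request is actually sent.

```python
import json

def equals(key, value):
    """Build a single 'equals' predicate for one filter."""
    return {"type": "equals", "key": key, "value": value}

def build_download_request(creator, email, taxon_keys, country):
    """Combine the common filters into one 'and' predicate (assumed layout)."""
    predicates = [
        # One or many taxon keys, matched with an 'in' predicate.
        {"type": "in", "key": "TAXON_KEY", "values": [str(k) for k in taxon_keys]},
        # Country of the records, not the country of the user.
        equals("COUNTRY", country),
        # Keep only records that have coordinates.
        equals("HAS_COORDINATE", "true"),
        # Drop records flagged with a geospatial issue.
        equals("HAS_GEOSPATIAL_ISSUE", "false"),
    ]
    return {
        "creator": creator,
        "notificationAddresses": [email],
        "format": "SIMPLE_CSV",
        "predicate": {"type": "and", "predicates": predicates},
    }

# Placeholder values, for illustration only.
request_body = build_download_request("example_user", "user@example.org", [212], "DK")
print(json.dumps(request_body, indent=2))
```

Leaving the predicate out altogether is what corresponds to the No Filter case, which is why it is worth building the predicate explicitly and checking it before submitting.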
You can read more about downloads at GBIF here:
http://www.johnwalleranalytics.org/2018/05/30/gbif-download-statistics/