New source datasets
Apart from continuously updated sources such as the Catalogue of Life or WoRMS, here are the new datasets we used as sources to build the backbone:
- New Type specimen checklist listing all distinct names of type specimens found in GBIF occurrences, contributing 252,410 new species and 57,410 infraspecific names.
- ZooBank joined GBIF and was added as a nomenclator with 175,775 names, contributing 3,460 new generic and 39,695 new species names.
- Added phylum Myzozoa with 136 families under kingdom Chromista to GBIF Algae Classification to fill the classification gap for Dinoflagellates
- Tiny new dataset listing species that are named after famous people and often appear in the news
The 43 sources used in this backbone build
Code changes
- Merging of duplicate taxa across kingdoms, especially with taxa from the incertae sedis kingdom. Examples
- Exclude genus & species synonyms for taxa at a higher rank: http://dev.gbif.org/issues/browse/POR-3169
- Restrict name normalisation with double letters to bi/trinomials. Finally the fish Lota lota is a fish again. Examples of other previously wrongly conflated families that have been reported:
- Stable identifier for pro parte taxa in the backbone.
All other fixed issues in the source code that generates the backbone can be found in our Jira epic and GitHub milestone.
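To illustrate the double-letter change listed above, here is a minimal sketch of what restricting the normalisation to bi/trinomials could look like. It is not the backbone build's actual code: the function name `comparison_key`, the rule of collapsing any run of a repeated letter, and the example names are all assumptions made for illustration.

```python
import re

# Hypothetical helper for illustration only; not taken from the backbone source code.
_DOUBLED = re.compile(r"([a-z])\1+")


def comparison_key(name: str) -> str:
    """Return a key used only for matching names, never for display."""
    # Lower-case and squeeze whitespace so the key is stable.
    key = " ".join(name.lower().split())
    # Uninomials (genus, family, ...) are compared exactly as written,
    # so two genera that differ only by a doubled letter stay distinct.
    if len(key.split(" ")) < 2:
        return key
    # Names with two or more parts: collapse runs of a repeated letter.
    return _DOUBLED.sub(r"\1", key)


if __name__ == "__main__":
    for n in ["Lota", "Lotta", "Lota lota", "Abies alba", "Abies allba"]:
        print(n, "->", comparison_key(n))
```

Under these assumptions, two uninomials such as `Lota` and the made-up `Lotta` keep separate keys, while a misspelt binomial like `Abies allba` still matches `Abies alba`.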
Backbone impact
The new backbone has a total of 5,887,500 names, of which it treats 2,818,534 species names as accepted (up from 5,307,978 and 2,525,274 respectively). More backbone metrics are available through our portal and in more detail through our API.
- 105,296 deleted names, many of them previous erroneous duplicates
- 685,853 new names
- Animalia: 164 families; 6,616 genera; 257,196 species; 87,660 infraspecific
- Archaea: 2 families; 6 genera; 48 species
- Bacteria: 27 families; 225 genera; 2,470 species; 615 infraspecific
- Chromista: 2 phyla; 13 classes; 58 orders; 54 families; 767 genera; 12,124 species; 2,953 infraspecific
- Fungi: 2 families; 269 genera; 8,703 species; 2,993 infraspecific
- Plantae: 3 families; 795 genera; 63,617 species; 33,282 infraspecific
- Protozoa: 4 families; 65 genera; 1,412 species; 280 infraspecific
- Viruses: 8 families; 1,227 genera; 8,488 species
- Unknown: 4 families; 2,708 genera; 13,076 species; 2,237 infraspecific
A very large and detailed log of the backbone build is also available.
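The growth implied by the totals quoted in this section can be worked out with a few lines of arithmetic. The sketch below uses only the four figures given above; the helper name `growth` is ours.

```python
from typing import Tuple


def growth(previous: int, current: int) -> Tuple[int, float]:
    """Return the absolute change and the change in percent of the previous value."""
    delta = current - previous
    return delta, 100.0 * delta / previous


# Totals quoted in the text: previous backbone vs. new backbone.
names_delta, names_pct = growth(5_307_978, 5_887_500)
species_delta, species_pct = growth(2_525_274, 2_818_534)

if __name__ == "__main__":
    print(f"all names:        +{names_delta:,} ({names_pct:.1f}%)")
    print(f"accepted species: +{species_delta:,} ({species_pct:.1f}%)")
```

That is a net gain of 579,522 names (about 10.9%) and 293,260 accepted species (about 11.6%).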
The largest taxonomic groups in the backbone, each exceeding 3% of all accepted species, are shown in the following diagram:
All contributors to the backbone, ordered by the number of names for which the source serves as the primary reference:
- 3,330,535 Catalogue of Life
- 685,831 Interim Register of Marine and Nonmarine Genera
- 312,746 World Register of Marine Species
- 309,820 GBIF Type Specimen Names
- 285,859 The Plant List with literature
- 140,937 Fauna Europaea
- 136,981 Index Fungorum
- 126,960 The Paleobiology Database
- 114,089 International Plant Names Index
- 53,848 Integrated Taxonomic Information System ITIS
- 44,732 ZooBank
- 30,482 GRIN Taxonomy
- 29,267 Plazi
- 25,749 Artsnavnebasen
- 24,996 Afromoths
- 15,007 Species Files
- 13,818 Brazilian Flora 2020 project
- 8,923 Dyntaxa
- 6,807 DiversityTaxonNames Lists
- 5,696 Official Lists and Indexes of Names in Zoology
- 5,317 Prokaryotic Nomenclature Up-to-date
- 4,617 International Cichorieae Network ICN
- 4,611 Catalogue of Afrotropical Bees
- 4,416 Database of Vascular Plants of Canada
- 4,312 ICTV Master Species List
- 3,874 The Clements Checklist
- 2,702 Checklist of Beetles Coleoptera of Canada and Alaska
- 1,198 IOC World Bird List, v6.3
- 1,087 GBIF Algae Classification
- 578 ION Taxonomic Hierarchy
- 272 Mammal Species of the World
- 144 GBIF Backbone Patch
- 39 Species named after famous people
- 36 True Fruit Flies Diptera, Tephritidae of the Afrotropical Region
- 7 Backbone Family Classification Patch
- 7 TAXREF
Occurrence impact
With a new backbone we have reprocessed all of our 712 million occurrences. The distribution of the major taxonomic groups exceeding 3%, i.e. those with a minimum of 36,800 species, is shown in this last diagram:
The 1,226,520 accepted species in GBIF occurrences (140 fewer than before) represent 44% of all accepted backbone species.
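Both percentages in this section can be checked against the totals given in the post. The sketch below assumes that the 3% cut-off is taken relative to the accepted species found in occurrences, which is what the quoted minimum suggests; the variable names are ours.

```python
# Figures quoted in the text.
accepted_in_occurrences = 1_226_520
accepted_in_backbone = 2_818_534

# Share of accepted backbone species that have at least one occurrence.
share_pct = 100.0 * accepted_in_occurrences / accepted_in_backbone

# Assumed basis for the diagram cut-off: 3% of the accepted species in occurrences.
threshold = 0.03 * accepted_in_occurrences

if __name__ == "__main__":
    print(f"share of backbone species with occurrences: {share_pct:.1f}%")
    print(f"3% cut-off: {threshold:,.0f} species")
```

The share comes to roughly 43.5%, which rounds to the 44% stated, and the cut-off of about 36,796 species matches the rounded minimum given for the diagram.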