UAM:Mamm:11470 - Eumetopias jubatus - skull
When images are requested for the first time, GBIF transiently caches the originals and processes them into various standard sizes and formats suitable for use in the portal.
Publishing multimedia metadata
GBIF indexes multimedia metadata published in several different ways within the GBIF network: as a simple URL given in an additional Darwin Core field, as multiple items expressed in ABCD XML, or through a dedicated multimedia extension in Darwin Core archives. The difference between them is usually in how expressive the metadata can be.
Simple Darwin Core
Melocactus intortus record in iNaturalist
As you can see on the right, every extracted link is regarded as a separate media item, as there is no standard way to detect that two links refer to the same item. In the example above, every image has one link to the actual image file and another to the respective HTML page where its metadata is presented. There is also no way to specify additional metadata about a link. As a consequence, images based on dwc:associatedMedia have no title, license or any further information. The verbatim data for that record, before we extract image links, can be seen here: http://www.gbif-uat.org/occurrence/891030819/verbatim
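The behaviour described above can be sketched roughly as follows. This is only an illustration, not GBIF's actual parser: the delimiter rules in the regular expression, the function name `extract_media_links` and the example URLs are all assumptions made for the example.

```python
import re

# Assumption: links in dwc:associatedMedia are separated by "|", ",", ";"
# or whitespace. The real extraction rules may differ.
URL_PATTERN = re.compile(r"https?://[^\s|,;]+")

def extract_media_links(associated_media):
    """Return one bare media item per URL found in the field.

    Each link becomes its own item; nothing tells us that two links
    describe the same image, and no title or license can be attached.
    """
    return [
        {"identifier": url, "title": None, "license": None}
        for url in URL_PATTERN.findall(associated_media or "")
    ]

# Hypothetical field value: an image file plus the HTML page describing it.
value = "http://example.org/photos/123.jpg | http://example.org/photos/123"
for item in extract_media_links(value):
    print(item)
```

Both links end up as separate items without metadata, which is the limitation the structured options below are meant to address.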
Darwin Core archive multimedia extension
With a dedicated extension for media items, many media items per core occurrence record can be published in a structured way. This is the GBIF-recommended way to publish multimedia, as it gives you the most control over your metadata. Note that the same extension can also be used to publish multimedia for species in checklist datasets. This extension, based entirely on existing Dublin Core terms, allows you to specify the following information about a media item, all of which will make it into the GBIF portal if provided:
- dc:type, the kind of media item based on the DCMI Type Vocabulary: StillImage, MovingImage or Sound
- dc:format, MIME type of the multimedia object's format
- dc:identifier, the public URL that identifies and locates the media file directly, not the HTML page it might be shown on
- dc:references, the URL of an HTML webpage that shows the media item or its metadata. It is recommended to provide this URL even if a media file exists, as it will be used for linking out
- dc:title, the media item's title
- dc:description, a textual description of the content of the media item
- dc:created, the date and time this media item was taken
- dc:creator, the person that took the image, recorded the video or sound
- dc:contributor, any contributor in addition to the creator that helped in recording the media item
- dc:publisher, the name of an entity responsible for making the image available
- dc:audience, a class or description for whom the image is intended or useful
- dc:source, a reference to the source the media item was derived or taken from. For example a book from which an image was scanned or the original provider of a photo/graphic, such as photography agencies
- dc:license, license for this media object. If possible declare it as CC0 to ensure greatest use
- dc:rightsHolder, the person or organization owning or managing rights over the media item
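As a rough illustration of how such an extension file might be produced, the sketch below writes one tab-delimited row using the terms listed above. The `coreid` column name, the column order, the function name `write_multimedia_rows` and all example values are assumptions for the example; in a real archive the column mapping is declared in the archive's meta.xml.

```python
import csv
import io

# Assumed column layout: a link to the core record followed by the
# Dublin Core terms from the list above (without the "dc:" prefix).
FIELDS = ["coreid", "type", "format", "identifier", "references", "title",
          "description", "created", "creator", "contributor", "publisher",
          "audience", "source", "license", "rightsHolder"]

def write_multimedia_rows(rows, out):
    """Write media items as tab-delimited text; missing terms stay empty."""
    writer = csv.DictWriter(out, fieldnames=FIELDS, delimiter="\t",
                            lineterminator="\n", restval="")
    writer.writeheader()
    for row in rows:
        writer.writerow(row)

# Hypothetical media item attached to a hypothetical occurrence "occ-1".
example = {
    "coreid": "occ-1",
    "type": "StillImage",
    "format": "image/jpeg",
    "identifier": "http://example.org/photos/123.jpg",
    "references": "http://example.org/photos/123",
    "title": "Skull, lateral view",
    "license": "CC0",
}

buffer = io.StringIO()
write_multimedia_rows([example], buffer)
print(buffer.getvalue())
```

Several rows can share the same `coreid`, which is how many media items are attached to a single occurrence record.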
Access to Biological Collections Data
As usual, we also provide a binding for the TDWG ABCD standard (versions 1.2 and 2.06), mostly used with the BioCASE software.
From ABCD 1.2 we extract media information based on the UnitDigitalImage subelements, in particular the file URL (ImageURI), the description (Comment) and the license (TermsOfUse).
In ABCD 2.06 we use the unit MultiMediaObject subelements instead. Here there are distinct file and webpage URLs (FileURI, ProductURI), the description (Comment), the license (License/Text, TermsOfUseStatements) and also an indication of the MIME type (Format). The bird sound example from above comes in as ABCD 2.06 via the Animal Sound Archive dataset. You can see the original details of that ABCD record in its raw XML fragment. There are also fossil images available through ABCD.
Missing from both ABCD versions are media title, creator and created elements.
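To make the ABCD 2.06 mapping more concrete, here is a minimal sketch that pulls the elements named above out of a MultiMediaObject. It is not GBIF's implementation. The XML fragment is a simplified, hypothetical one: real ABCD documents are namespaced and nest these elements more deeply, which is why the sketch matches on local element names at any depth. The target keys and helper names are likewise assumptions.

```python
import xml.etree.ElementTree as ET

# ABCD 2.06 element name -> media field (target names are an assumption).
FIELD_MAP = {
    "FileURI": "identifier",
    "ProductURI": "references",
    "Comment": "description",
    "Format": "format",
}

def local_name(tag):
    """Strip an XML namespace such as '{http://...}FileURI' -> 'FileURI'."""
    return tag.rsplit("}", 1)[-1]

def parse_multimedia_objects(xml_text):
    """Return one dict per MultiMediaObject found anywhere in the document."""
    items = []
    for elem in ET.fromstring(xml_text).iter():
        if local_name(elem.tag) != "MultiMediaObject":
            continue
        item = {}
        for child in elem.iter():
            name = local_name(child.tag)
            text = (child.text or "").strip()
            if name in FIELD_MAP and text:
                item.setdefault(FIELD_MAP[name], text)
            elif name == "License":
                # The license text sits in a Text child (License/Text).
                for sub in child:
                    sub_text = (sub.text or "").strip()
                    if local_name(sub.tag) == "Text" and sub_text:
                        item.setdefault("license", sub_text)
        items.append(item)
    return items

# Simplified, hypothetical fragment; not taken from a real dataset.
SAMPLE = """
<MultiMediaObjects>
  <MultiMediaObject>
    <FileURI>http://example.org/sounds/1.mp3</FileURI>
    <ProductURI>http://example.org/sounds/1</ProductURI>
    <Comment>Song recorded at dawn</Comment>
    <Format>audio/mpeg</Format>
    <License><Text>CC0</Text></License>
  </MultiMediaObject>
</MultiMediaObjects>
"""

for media in parse_multimedia_objects(SAMPLE):
    print(media)
```

Note that the resulting items have no title, creator or created value, since ABCD offers no elements for them.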
Interesting discussion on implementing this for the Canadensys network: https://groups.google.com/forum/#!topic/canadensys/__oECtImVqs
Note that Darwin Core metadata can be directly embedded in multimedia files as XMP. Exiftool implemented this in early 2013 and it will be implemented in exiv2 starting with version 0.25.