Darwin Core
Introduction
Darwin Core – or DwC for short – is a group of standards designed for sharing biodiversity data. Developed by Biodiversity Information Standards (TDWG), it allows data owners to publish biodiversity information in a language (Darwin Core) and format (e.g. Darwin Core archives) that can be understood and used by everyone. From the Darwin Core website:
The Darwin Core is a body of standards. It includes a glossary of terms (in other contexts these might be called properties, elements, fields, columns, attributes, or concepts) intended to facilitate the sharing of information about biological diversity by providing reference definitions, examples, and commentaries. The Darwin Core is primarily based on taxa, their occurrence in nature as documented by observations, specimens, and samples, and related information. Included are documents describing how these terms are managed, how the set of terms can be extended for new purposes, and how the terms can be used. The Simple Darwin Core is a specification for one particular way to use the terms – to share data about taxa and their occurrences in a simply structured way – and is probably what is meant if someone suggests to “format your data according to the Darwin Core”.
Documents
- Darwin Core website
- List of Darwin Core terms (also available as a Google Document)
- Canadensys – Darwin Core mapping, a list of Darwin Core terms used for all datasets published through the Canadensys network (useful for comparison).
- 7-step guide to data publication, guidelines on publishing biodiversity data in the Canadensys network.
- Blog posts about Darwin Core
- Apple Core, Darwin Core guidelines for herbaria.
- Darwin Core archives in the GBIF network.
- Darwin Core archive, technical details regarding Darwin Core as a zipped text file + metadata.
- Simple Darwin Core, technical details regarding Darwin Core as a flat file.
- Wieczorek, J. et al., 2012. Darwin Core: An evolving community-developed biodiversity data standard. PLoS ONE 7(1): e29715.
- The Darwin Core Hour, a series of webinars about the Darwin Core standards, organized by iDigBio.
Tools
- GBIF Integrated Publishing Toolkit (IPT), the most complete tool to generate, publish and register Darwin Core archives, used by GBIF and Canadensys.
- Darwin Core Archive Validator, a GBIF tool to validate your Darwin Core archive.
- Darwin Core Archive Assistant, a GBIF tool to generate the meta.xml file for your Darwin Core archive.
- Darwin Core to SQL, a Canadensys Java tool to translate the structure and content of your Darwin Core archive into SQL.
- More GBIF tools
Mapping to Darwin Core
One of the first steps in publishing your data is translating or “mapping” it from its current format to Darwin Core. For example, you might have the field Collector in your database or spreadsheet, which corresponds to recordedBy in Darwin Core. Other fields might not map so directly, which is why we offer support in mapping your data to Darwin Core. For more details, see step 4 of our publication guide or have a look at the list of Darwin Core terms used for other datasets published through the Canadensys network.
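As a rough illustration, a simple mapping amounts to renaming columns. The sketch below is only an assumption about what such a step could look like in Python: apart from Collector and recordedBy, the local column names (Species, Date, ShelfNo), the sample row and the helper map_to_darwin_core are hypothetical, and real datasets usually need more than a one-to-one rename.

```python
import csv
import io

# Hypothetical mapping from local column names to Darwin Core terms.
# Only Collector -> recordedBy comes from the text; the others are illustrative.
FIELD_MAP = {
    "Collector": "recordedBy",
    "Species": "scientificName",
    "Date": "eventDate",
}


def map_to_darwin_core(rows, field_map):
    """Rename mapped columns and drop columns that have no mapping."""
    return [
        {field_map[name]: value for name, value in row.items() if name in field_map}
        for row in rows
    ]


# Made-up sample data standing in for a spreadsheet export.
source = io.StringIO(
    "Collector,Species,Date,ShelfNo\n"
    "M. Dupont,Acer saccharum,2011-05-14,B12\n"
)
records = map_to_darwin_core(csv.DictReader(source), FIELD_MAP)
print(records[0])
```

Columns without a Darwin Core equivalent in the mapping (here ShelfNo) are simply left out of the result; whether to drop such fields or map them to another term is a decision for each dataset.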
Should you use Darwin Core to design a database?
Darwin Core is designed to exchange biodiversity information, not to manage data. You should design your database or spreadsheet first and foremost to fit the needs of your collection. That said, the list of all Darwin Core terms, or the terms used for other datasets in the Canadensys network, might give you an idea of which fields you could include and how you could share them as Darwin Core later.
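One way to picture this separation is a database that keeps its own column names and only adopts Darwin Core terms at export time. The following is a minimal sketch under that assumption, using an in-memory SQLite database; the specimen table, its columns and the sample record are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical internal schema, designed around the collection's own needs.
conn.execute(
    "CREATE TABLE specimen ("
    "accession_no TEXT, collector TEXT, taxon TEXT, cabinet TEXT)"
)
conn.execute(
    "INSERT INTO specimen VALUES ('MT-0001', 'M. Dupont', 'Acer saccharum', 'B12')"
)

# Export query: alias internal columns to Darwin Core terms and leave out
# management-only fields such as cabinet.
EXPORT_SQL = """
    SELECT accession_no AS catalogNumber,
           collector    AS recordedBy,
           taxon        AS scientificName
    FROM specimen
"""
cursor = conn.execute(EXPORT_SQL)
header = [column[0] for column in cursor.description]
rows = cursor.fetchall()
print(header)
print(rows)
```

The internal table can change freely as long as the export query keeps producing the same Darwin Core columns.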