Last summer, 36 volunteers inventoried all of the 22,000+ vascular plant specimen folders of our Marie-Victorin Herbarium (MT), in preparation for its move to the Université de Montréal Biodiversity Centre. Two weeks ago, I gave a talk about the process, results and potential at the TDWG 2011 Annual Conference in New Orleans.

You can read the abstract for the contributed talk here.

Useful for the herbarium

The take-home message is that this experiment paid off really well. Thanks to a great group of volunteers, we were able to collect very useful quantitative metadata for the whole vascular plant collection, with a limited budget (5740$, all staff salary) and in a short time (158 work days, including all the post-counting processing). At about 110 specimens per dollar, this was roughly a hundred times cheaper than full digitization (1$/specimen is a figure that has been floating around for years).
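For readers who want to check the arithmetic, here is a small sketch. The specimen total (628,664) comes from the inventory results listed further down; the variable names are only illustrative.

```python
# Figures taken from this post; variable names are illustrative.
total_specimens = 628_664              # specimens counted in the inventory
budget_dollars = 5_740                 # total cost (all staff salary)
digitization_cost_per_specimen = 1.0   # the often-quoted 1$/specimen estimate

specimens_per_dollar = total_specimens / budget_dollars
inventory_cost_per_specimen = budget_dollars / total_specimens
cost_ratio = digitization_cost_per_specimen / inventory_cost_per_specimen

print(f"{specimens_per_dollar:.1f} specimens per dollar")
print(f"{inventory_cost_per_specimen * 100:.2f} cents per specimen")
print(f"full digitization costs roughly {cost_ratio:.0f} times more")
```

The ratio only holds if the 1$/specimen estimate is taken at face value, and of course the two approaches do not produce the same level of detail.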

With the data, curator Luc Brouillet was able to reorganize the herbarium taxonomically, following Christenhusz et al. (2011a) for lycophytes, Smith et al. (2006) for ferns, Christenhusz et al. (2011b) for gymnosperms and APG III (2009) for flowering plants (the same classification as in VASCAN). All folders have now been assigned a new case/tray number, which will help us tremendously in organizing the move.

With the data, we now also have a much more detailed overview of the collection:

  • 628,664 specimens, which is lower than previously estimated. 21.5% are fully digitized.
  • 380 families: 82% of all known families
  • 5,298 genera
  • 6 continents. North America is further divided into Canada, US and Central America. We also have a category for cultivated specimens.

Since we counted the specimens per folder, which is a combination of a case, tray, region, genus and family, we can also calculate their distribution per variable:

Or answer questions like: “How many Rubus specimens do we have from Canada?” Answer: 2921, located in trays A236-07 and A238-04.
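As a rough sketch of how such questions can be answered from per-folder counts, the snippet below sums specimens per variable and looks up a genus within a region. The rows, field names and counts are made up for illustration and are not the published dataset.

```python
from collections import defaultdict

# Made-up sample rows: one row per folder, with that folder's specimen count.
# Field names, counts and the Pinus tray are illustrative assumptions.
folders = [
    {"case_tray": "A236-07", "family": "Rosaceae", "genus": "Rubus", "region": "Canada", "count": 40},
    {"case_tray": "A238-04", "family": "Rosaceae", "genus": "Rubus", "region": "Canada", "count": 25},
    {"case_tray": "A238-04", "family": "Rosaceae", "genus": "Rubus", "region": "US", "count": 10},
    {"case_tray": "A101-02", "family": "Pinaceae", "genus": "Pinus", "region": "Canada", "count": 30},
]


def count_by(rows, variable):
    """Sum specimen counts per value of one variable (e.g. 'region' or 'family')."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[variable]] += row["count"]
    return dict(totals)


def locate(rows, genus, region):
    """Return the specimen total and the sorted trays for a genus in a region."""
    matches = [r for r in rows if r["genus"] == genus and r["region"] == region]
    total = sum(r["count"] for r in matches)
    trays = sorted({r["case_tray"] for r in matches})
    return total, trays


print(count_by(folders, "region"))
print(locate(folders, "Rubus", "Canada"))
```

The same two functions work for any of the recorded variables (case, tray, region, genus, family), since every folder row carries all of them.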

Useful for others

We think our metadata is useful for others as well, which is why we published it online:

  • As a Google Fusion Table, allowing you to filter, aggregate and visualize the data very quickly, exactly like the embedded pie-chart above.
  • As a Darwin Core Archive on the Canadensys repository. Using Darwin Core to express this kind of metadata is a bit experimental, which is why the dataset is not registered with GBIF, but in my opinion it works pretty well. The only term I was missing is one for the folder’s location at the herbarium: caseTrayNumber, which is now shared in dynamicProperties. The big advantage of using a Darwin Core Archive is that I can share not only the dataset, but also the purpose and process behind it, in a standardized format (EML) that you can read and download on the IPT page. That is not possible with Google Fusion Tables.
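To make the dynamicProperties workaround concrete, here is a minimal sketch of what one folder record could look like. The values are placeholders, the choice of terms besides dynamicProperties is a guess, and encoding dynamicProperties as JSON is an assumption for this example; the published archive may format it differently.

```python
import json

# Hypothetical folder; the values are placeholders, not taken from the archive.
folder = {
    "family": "Rosaceae",
    "genus": "Rubus",
    "country": "Canada",
    "specimens": 40,
    "caseTrayNumber": "A236-07",
}

# Map the folder to Darwin Core terms. There is no term for the storage
# location, so caseTrayNumber is carried inside dynamicProperties
# (JSON encoding assumed here).
record = {
    "family": folder["family"],
    "genus": folder["genus"],
    "country": folder["country"],
    "individualCount": folder["specimens"],
    "dynamicProperties": json.dumps({"caseTrayNumber": folder["caseTrayNumber"]}),
}

# A consumer can recover the location by parsing dynamicProperties again.
location = json.loads(record["dynamicProperties"])["caseTrayNumber"]
print(record)
print(location)
```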

Having the full inventory of the herbarium online will definitely help taxonomists who are interested in loans, but it might also attract the attention of other biodiversity researchers. It could even spark demand-driven digitization or set priorities for digitization according to the needs of users outside the taxonomic community (see Berendsohn & Seltmann, 2011), although the granularity of the data (genus, continent) might be too coarse. But at least it’s a first step towards some real numbers, and if we extrapolate our experiment to all 350,000,000 herbarium specimens worldwide (Index Herbariorum) it would “only” cost 3,200,000$ to get a geotaxonomic index for all herbarium specimens! We have updated our page on the Biodiversity Collections Index and Index Herbariorum. So can you!
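For the curious, the extrapolation above is a simple division. The sketch below assumes that every herbarium could count at our rounded rate of 110 specimens per dollar, which is a big simplification.

```python
worldwide_specimens = 350_000_000  # estimate cited above
specimens_per_dollar = 110         # rounded rate from our inventory

# Assumes the cost scales linearly and is the same everywhere.
estimated_cost = worldwide_specimens / specimens_per_dollar
print(f"about {estimated_cost:,.0f} dollars")
```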