Getting students involved in biodiversity informatics

This is a guest post by Jana Vamosi, the Director of the University of Calgary Herbarium.

The University of Calgary got involved in Canadensys this year (see also the IPT resource webpage), through a project designed for students to learn about species distribution modeling.

We wanted students to understand all components of species distribution modeling, including how occurrence data is compiled. Doing so has resulted in the University of Calgary’s first data publication of plant specimens in the Canadensys repository.

The University of Calgary Herbarium has over 90,000 holdings (curated by Bonnie Smith), yet only a fraction has had the label data transcribed into a database. That fraction covers the Lycophytes and Monilophytes, most of which come from Alberta. Monilophytes and Lycophytes include the ferns and club mosses and represent important components of terrestrial ecosystems.

Despite these substantial holdings, the University of Calgary database had sadly never seen the light of day in an online repository, and we sought to change that. Students therefore set about determining the latitude and longitude co-ordinates for the collection sites of over 1,600 fern and club moss specimens that are stored in the Department of Biological Sciences’ herbarium.

Steps involved in turning biodiversity informatics into an experiential learning exercise

This mini-project was a module, designed to be completed within three weeks. We found the following set-up successful:

  1. We provided students with a database template. Beth Dickson, a local botanist with geospatial experience, was able to provide the students with a ready-made colour-coded database template that highlighted the cells students were to populate.
  2. We provided several examples of what sort of georeferencing standards they were likely to encounter and guidelines on how to deal with each one. This protocol of good strategies and practices for getting as precise a locality as possible gave students the confidence to proceed independently in most cases.
  3. We arranged the students into pairs for peer support. We then divided the dataset into equal shares of the work (~1300 records divided amongst 9 groups, or ~140 records per pair). Each pair was given a separate database containing only their 140 records to georeference, to avoid errors carrying through the entire dataset and to circumvent any potential for sabotage between groups (rare and unfortunate, but sadly not unheard of).
  4. Once done with georeferencing, students worked through the steps for publishing the dataset to Canadensys (described in this tutorial), learning the basics of the concepts behind terms such as “metadata” and “Darwin Core Mapping”. The learning curve was steep for some but we were ultimately successful, thanks to the help of great people at Canadensys (thanks David Shorthouse!).
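For readers who want to try a similar set-up, steps 3 and 4 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the workflow the class actually used: the template column headers ("Accession No.", "Species", and so on) and the function names are hypothetical, while the Darwin Core terms they map to (catalogNumber, scientificName, locality, decimalLatitude, decimalLongitude) are real terms from the standard.

```python
# Hypothetical template headers (left) mapped to real Darwin Core terms (right).
# The headers are assumptions; the actual template used in class may differ.
DWC_MAPPING = {
    "Accession No.": "catalogNumber",
    "Species": "scientificName",
    "Locality": "locality",
    "Latitude": "decimalLatitude",
    "Longitude": "decimalLongitude",
}


def split_into_groups(records, n_groups):
    """Split records into n_groups contiguous chunks whose sizes differ by at most one."""
    base, extra = divmod(len(records), n_groups)
    groups, start = [], 0
    for i in range(n_groups):
        # The first `extra` groups each take one additional record.
        size = base + (1 if i < extra else 0)
        groups.append(records[start:start + size])
        start += size
    return groups


def to_darwin_core(row, mapping=DWC_MAPPING):
    """Rename the template columns of one record to Darwin Core terms.

    Columns that are not in the mapping are dropped.
    """
    return {dwc: row[col] for col, dwc in mapping.items() if col in row}


if __name__ == "__main__":
    # Placeholder records standing in for the transcribed specimen labels.
    records = [{"Accession No.": str(i), "Species": "Example sp."} for i in range(1300)]
    groups = split_into_groups(records, 9)
    print([len(g) for g in groups])      # group sizes, each close to 1300 / 9
    print(to_darwin_core(groups[0][0]))  # one record with Darwin Core field names
```

Giving each pair its own chunk, rather than a shared file, mirrors the reasoning in step 3: a mistake made by one pair stays confined to that pair's records.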

Feedback on the project was very positive. One student commented that the project marked a milestone in his degree where he became a producer of scientific knowledge. One fellow faculty member was impressed with our progress and said, “It is very good to see our undergraduates being involved in front-line data compilation. Our undergraduates are a great potential partner in the scientific enterprise, but their abilities generally go unnoticed and untested. A splendid team effort. Well done.”

The 1,600 specimens these students worked with represent only one or two per cent of the entire collection. We still have a lot of valuable information to contribute to these databases and we hope to continue these efforts.