Imaging

Introduction

Imaging is the process of creating digital images of biological specimens.

Imaging labels

Photographing specimen labels is often a first step towards text digitization. Transcription is still necessary, because text in an image cannot be searched or analyzed. Text can be extracted automatically from the image with Optical Character Recognition (OCR), but the output still requires human proofreading and structuring, so the process can be more time-consuming than keystroking. The main use of a label image is as a verbatim backup: publishing it alongside the text information allows users to verify the information and report errors without access to the physical specimen.

Imaging specimens

Photographing specimens results in one or more images of the whole specimen. This process is generally more complex than imaging labels, as the image should be useful for scientific research. Method (camera vs scanner), resolution, colour, lighting conditions, format and storage are all factors to consider. As a result, this type of imaging can become very time-consuming and, unless it is automated, should only be considered for the most scientifically valuable specimens, such as types.

Tools

  • SilverArchive, a tool from SilverBiology that combines OCR with human verification and streamlines the whole process as much as possible.

Documents

Imaging and Canadensys

Even though we are mainly focusing on text digitization, some collections are also imaging their specimens.

Please use the specimen number (in combination with an extra identifier for multiple images) for image filenames, such as “46912.jpg”, “45609_1.jpg” or “QMOR3090_dorsal.jpg”. Do not use species names or other human-readable information in the filename, as this information can change over time and is better stored in the collection database.
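As an illustration only, the naming convention above could be sketched in Python as follows. The function names, the regular expression and the default "jpg" extension are assumptions made for this example, not part of any Canadensys tool. The pattern assumes a specimen number consists of optional letters followed by digits, which matches the three examples given but may not fit every collection.

```python
import re

# Hypothetical pattern: optional letter prefix + digits (the specimen number),
# an optional "_extra" identifier, then a file extension.
FILENAME_RE = re.compile(
    r"^(?P<specimen>[A-Za-z]*\d+)(?:_(?P<extra>[A-Za-z0-9]+))?\.(?P<ext>[A-Za-z0-9]+)$"
)


def image_filename(specimen_number, extra=None, ext="jpg"):
    """Build a filename from a specimen number and an optional extra identifier."""
    name = str(specimen_number)
    if extra is not None:
        name += f"_{extra}"
    return f"{name}.{ext}"


def parse_image_filename(filename):
    """Return (specimen_number, extra) or None if the name does not fit the pattern."""
    match = FILENAME_RE.match(filename)
    if match is None:
        return None
    return match.group("specimen"), match.group("extra")
```

For example, image_filename("QMOR3090", "dorsal") gives "QMOR3090_dorsal.jpg", and parse_image_filename("Acer_rubrum.jpg") returns None because the name does not start with a specimen number. The check only tests the shape of the name; it cannot detect every piece of human-readable information.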

Imaging practices depend on the type of collection. For herbaria, for example, the difference between imaging labels and imaging specimens is less of an issue. See the documents for more specific information.