Pollen analysis is an important tool in many fields, including pollination ecology, paleoclimatology, paleoecology, honey quality control, and even medicine and forensics. However, manual pollen analysis is labour‐intensive, which often constrains the number of samples processed or the number of pollen grains analysed per sample. Reliable, high‐throughput, automated systems are therefore needed.
We present an automated method for pollen analysis based on deep learning with convolutional neural networks (CNNs). We scanned microscope slides of fresh, fuchsine‐stained pollen and automatically extracted images of all individual pollen grains. CNN models were trained on reference samples (122,000 pollen grains from 347 flowers of 83 species in 17 families) and then used to classify images of different pollen grains in a series of experiments. We also propose an adjustment that reduces overestimation of sample diversity in cases where samples are likely to contain few species.
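The abstract does not specify how the diversity adjustment works. Purely as an illustration of the general idea, the sketch below shows one possible form: pollen types that receive only a small share of the first-pass predictions in a sample are dropped, and each grain is reassigned to its most probable remaining type. The function name `restrict_to_common_types`, the `min_share` threshold and the toy probabilities are all hypothetical and are not taken from the study.

```python
def restrict_to_common_types(probs, min_share=0.1):
    """Hypothetical diversity adjustment for a single sample.

    probs: one list of class probabilities per pollen grain.
    Returns one adjusted class index per grain.
    """
    n_classes = len(probs[0])
    # First pass: plain argmax for every grain.
    first_pass = [max(range(n_classes), key=p.__getitem__) for p in probs]
    counts = [0] * n_classes
    for c in first_pass:
        counts[c] += 1
    # Keep only types that account for at least min_share of the grains.
    keep = [c for c in range(n_classes) if counts[c] / len(probs) >= min_share]
    if not keep:
        # Fallback: keep the single most frequent type.
        keep = [max(range(n_classes), key=counts.__getitem__)]
    # Second pass: argmax restricted to the retained types.
    return [max(keep, key=p.__getitem__) for p in probs]


# Toy sample (made-up numbers): 12 grains of type 0, 7 of type 1,
# and one grain weakly assigned to type 2.
sample_probs = (
    [[0.80, 0.15, 0.05]] * 12
    + [[0.10, 0.85, 0.05]] * 7
    + [[0.30, 0.20, 0.50]]
)
adjusted = restrict_to_common_types(sample_probs, min_share=0.1)
n_types_after = len(set(adjusted))
```

In this toy sample the single grain first assigned to type 2 is reassigned to type 0, so the sample is reported as containing two types rather than three. A rule of this kind trades a lower risk of spurious types against the risk of missing types that are genuinely rare.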
Accuracy of a model for 83 species was 0.98 when all samples of each species were first pooled and then split into a training and a validation set (splitting experiment). However, accuracy was much lower (0.41) when individual reference samples from different flowers were kept separate and one such sample was used to validate models trained on the remaining samples of that species (leave‐one‐out experiment). We therefore combined species into 28 pollen types, for which a new leave‐one‐out experiment gave an overall accuracy of 0.68 and recall >0.90 in most pollen types. When validating against 63,650 manually identified pollen grains from 370 bumblebee samples, we obtained an accuracy of 0.79, which our adjustment procedure increased to 0.85.
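The difference between the two validation designs can be made concrete with a minimal sketch. It assumes each grain image is tagged with a species and with the reference sample (flower) it came from. The field names, function names and toy data are hypothetical, and the 20% validation fraction is an arbitrary choice rather than the proportion used in the study.

```python
import random
from collections import defaultdict


def pooled_split(grains, val_fraction=0.2, seed=0):
    """Splitting experiment: pool grains per species, then split at random.

    Grains from the same flower can end up on both sides of the split.
    """
    rng = random.Random(seed)
    by_species = defaultdict(list)
    for g in grains:
        by_species[g["species"]].append(g)
    train, val = [], []
    for species in sorted(by_species):
        items = by_species[species][:]
        rng.shuffle(items)
        n_val = max(1, round(len(items) * val_fraction))
        val.extend(items[:n_val])
        train.extend(items[n_val:])
    return train, val


def leave_one_sample_out(grains):
    """Leave-one-out experiment: hold out one reference sample at a time.

    No grain from the held-out flower is available for training.
    """
    for held_out in sorted({g["sample"] for g in grains}):
        train = [g for g in grains if g["sample"] != held_out]
        val = [g for g in grains if g["sample"] == held_out]
        yield held_out, train, val


# Toy reference library: 2 species x 3 flowers x 10 grains.
grains = [
    {"species": sp, "sample": f"{sp}_flower{f}", "grain_id": i}
    for sp in ("A", "B")
    for f in range(3)
    for i in range(10)
]

train, val = pooled_split(grains)
# Flowers that contribute grains to both the training and validation sets.
shared = {g["sample"] for g in train} & {g["sample"] for g in val}

folds = list(leave_one_sample_out(grains))
```

In the pooled split, `shared` is non-empty, so the model is validated on grains from flowers it has already seen. In every leave-one-out fold the held-out flower is absent from training, which is closer to the situation of classifying a new sample.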
Validation through splitting experiments may overestimate the robustness of CNN pollen analysis in new contexts (samples). Nevertheless, our method has the potential to allow large quantities of real pollen data to be analysed with reasonable accuracy. Although compiling pollen reference libraries is time‐consuming, our method simplifies the task and can lead to widely accessible and shareable resources for pollen analysis.