Automated taxonomic classification of phytoplankton sampled with imaging-in-flow cytometry
Limnol. Oceanogr. Methods 5:204-216 (2007) | DOI: 10.4319/lom.2007.5.204
ABSTRACT: High-resolution photomicrographs of phytoplankton cells and chains can now be acquired with imaging-in-flow systems at rates that make manual identification impractical for many applications. To address the challenge of automated taxonomic identification of images generated by our custom-built submersible Imaging FlowCytobot, we developed an approach that relies on extraction of image features, which are then presented to a machine learning algorithm for classification. Our approach uses a combination of image feature types including size, shape, symmetry, and texture characteristics, plus orientation-invariant moments, diffraction pattern sampling, and co-occurrence matrix statistics. Some of these features required preprocessing with image analysis techniques including edge detection after phase congruency calculations, morphological operations, boundary representation and simplification, and rotation. For the machine learning strategy, we developed an approach that combines a feature selection algorithm with a support vector machine specified through a rigorous parameter selection and training procedure. After training, a 22-category classifier provides 88% overall accuracy for an independent test set, with individual category accuracies ranging from 68% to 99%. We demonstrate application of this classifier to a nearly uninterrupted 2-month time series of images acquired in Woods Hole Harbor, including use of statistical error correction to derive quantitative concentration estimates, which are shown to be unbiased with respect to manual estimates for random subsamples. Our approach, which provides taxonomically resolved estimates of phytoplankton abundance with fine temporal resolution (hours for many species), permits access to scales of variability from tidal to seasonal and longer.
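The abstract does not say which statistical error correction is used, so the following is only a minimal sketch of one common form of such a correction, under stated assumptions. It assumes a misclassification matrix, estimated from an independent labeled test set, whose entry [i][j] is the probability that an image of true class i is assigned to class j. Observed per-class counts are then the transposed matrix applied to the true counts, and the correction inverts that relation. The function names, the three-class matrix, and the counts are all hypothetical illustrations, not values from the paper.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the caller's data is not modified.
    m = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        # Pick the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular; counts cannot be corrected")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def correct_counts(confusion, observed):
    """Estimate true class counts from classifier output counts.

    confusion[i][j] = P(assigned to class j | true class i); rows sum to 1.
    observed[j]     = number of images the classifier assigned to class j.
    Solves confusion^T @ true = observed for the true counts.
    """
    n = len(observed)
    transposed = [[confusion[i][j] for i in range(n)] for j in range(n)]
    return solve_linear(transposed, observed)


# Hypothetical three-class example (assumed values, not from the paper).
CONFUSION = [
    [0.90, 0.08, 0.02],
    [0.05, 0.88, 0.07],
    [0.01, 0.31, 0.68],
]
TRUE_COUNTS = [100.0, 50.0, 10.0]

# Simulate what an imperfect classifier would report for these true counts.
OBSERVED = [
    sum(CONFUSION[i][j] * TRUE_COUNTS[i] for i in range(3)) for j in range(3)
]

ESTIMATED = correct_counts(CONFUSION, OBSERVED)
print("observed :", [round(v, 1) for v in OBSERVED])
print("corrected:", [round(v, 1) for v in ESTIMATED])
```

In this noise-free illustration the corrected counts recover the simulated true counts. With real samples the estimate carries sampling error, and for rare classes it can even go negative, so a practical version would need uncertainty estimates and some handling of such cases.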