MorphoCluster: Efficient Annotation of Plankton Images by Clustering

In this work, we present MorphoCluster, a software tool for data-driven, fast, and accurate annotation of large image data sets. Marine data is already acquired faster than human experts can annotate it, and its volume and complexity will continue to increase in the coming years. Still, this data r...

Bibliographic Details
Main Authors: Simon-Martin Schröder, Rainer Kiko, Reinhard Koch
Format: Article
Language: English
Published: MDPI AG 2020-05-01
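Series: Sensors
Subjects: machine learning; deep learning; clustering; plankton image classification; marine image recognition; marine image annotation
Online Access: https://www.mdpi.com/1424-8220/20/11/3060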
id doaj-a50ff817ecdd41898201aaa9731d8938
record_format Article
spelling doaj-a50ff817ecdd41898201aaa9731d8938
2020-11-25T02:59:30Z
eng
MDPI AG
Sensors
1424-8220
2020-05-01
20
3060
3060
10.3390/s20113060
MorphoCluster: Efficient Annotation of Plankton Images by Clustering
Simon-Martin Schröder (0)
Rainer Kiko (1)
Reinhard Koch (2)
(0) Department of Computer Science, Kiel University, 24118 Kiel, Germany
(1) Laboratoire d’Océanographie de Villefranche-sur-mer, 06230 Villefranche-sur-Mer, France
(2) Department of Computer Science, Kiel University, 24118 Kiel, Germany
In this work, we present MorphoCluster, a software tool for data-driven, fast, and accurate annotation of large image data sets. Marine data is already acquired faster than human experts can annotate it, and its volume and complexity will continue to increase in the coming years. Still, this data requires interpretation. MorphoCluster augments the human ability to discover patterns and perform object classification in large amounts of data by embedding unsupervised clustering in an interactive process. By aggregating similar images into clusters, our novel approach to image annotation increases consistency, multiplies the throughput of an annotator, and allows experts to adapt the granularity of their sorting scheme to the structure in the data. We sorted a set of 1.2 M objects into 280 data-driven classes in 71 h (16 k objects per hour), with 90 of these classes having a precision of 0.889 or higher. This shows that MorphoCluster is at the same time fast, accurate, and consistent; provides a fine-grained and data-driven classification; and enables novelty detection.
https://www.mdpi.com/1424-8220/20/11/3060
machine learning
deep learning
clustering
plankton image classification
marine image recognition
marine image annotation
collection DOAJ
language English
format Article
sources DOAJ
author Simon-Martin Schröder
Rainer Kiko
Reinhard Koch
spellingShingle Simon-Martin Schröder
Rainer Kiko
Reinhard Koch
MorphoCluster: Efficient Annotation of Plankton Images by Clustering
Sensors
machine learning
deep learning
clustering
plankton image classification
marine image recognition
marine image annotation
author_facet Simon-Martin Schröder
Rainer Kiko
Reinhard Koch
author_sort Simon-Martin Schröder
title MorphoCluster: Efficient Annotation of Plankton Images by Clustering
title_short MorphoCluster: Efficient Annotation of Plankton Images by Clustering
title_full MorphoCluster: Efficient Annotation of Plankton Images by Clustering
title_fullStr MorphoCluster: Efficient Annotation of Plankton Images by Clustering
title_full_unstemmed MorphoCluster: Efficient Annotation of Plankton Images by Clustering
title_sort morphocluster: efficient annotation of plankton images by clustering
publisher MDPI AG
series Sensors
issn 1424-8220
publishDate 2020-05-01
description In this work, we present MorphoCluster, a software tool for data-driven, fast, and accurate annotation of large image data sets. Marine data is already acquired faster than human experts can annotate it, and its volume and complexity will continue to increase in the coming years. Still, this data requires interpretation. MorphoCluster augments the human ability to discover patterns and perform object classification in large amounts of data by embedding unsupervised clustering in an interactive process. By aggregating similar images into clusters, our novel approach to image annotation increases consistency, multiplies the throughput of an annotator, and allows experts to adapt the granularity of their sorting scheme to the structure in the data. We sorted a set of 1.2 M objects into 280 data-driven classes in 71 h (16 k objects per hour), with 90 of these classes having a precision of 0.889 or higher. This shows that MorphoCluster is at the same time fast, accurate, and consistent; provides a fine-grained and data-driven classification; and enables novelty detection.
topic machine learning
deep learning
clustering
plankton image classification
marine image recognition
marine image annotation
url https://www.mdpi.com/1424-8220/20/11/3060
work_keys_str_mv AT simonmartinschroder morphoclusterefficientannotationofplanktonimagesbyclustering
AT rainerkiko morphoclusterefficientannotationofplanktonimagesbyclustering
AT reinhardkoch morphoclusterefficientannotationofplanktonimagesbyclustering
_version_ 1724701972981874688