Comparing Class-Aware and Pairwise Loss Functions for Deep Metric Learning in Wildlife Re-Identification

Bibliographic Details
Main Authors: Nkosikhona Dlamini, Terence L. van Zyl
Format: Article
Language: English
Published: MDPI AG 2021-09-01
Series: Sensors
Online Access: https://www.mdpi.com/1424-8220/21/18/6109
Description
Summary: Similarity learning using deep convolutional neural networks has been applied extensively in solving computer vision problems. This attraction is supported by its success in one-shot and zero-shot classification applications. The advances in similarity learning are essential for smaller datasets or datasets in which few class labels exist per class such as wildlife re-identification. Improving the performance of similarity learning models comes with developing new sampling techniques and designing loss functions better suited to training similarity in neural networks. However, the impact of these advances is tested on larger datasets, with limited attention given to smaller imbalanced datasets such as those found in unique wildlife re-identification. To this end, we test the advances in loss functions for similarity learning on several animal re-identification tasks. We add two new public datasets, Nyala and Lions, to the challenge of animal re-identification. Our results are state of the art on all public datasets tested except Pandas. The achieved Top-1 Recall is 94.8% on the Zebra dataset, 72.3% on the Nyala dataset, 79.7% on the Chimps dataset and, on the Tiger dataset, it is 88.9%. For the Lion dataset, we set a new benchmark at 94.8%. We find that the best performing loss function across all datasets is generally the triplet loss; however, there is only a marginal improvement compared to the performance achieved by Proxy-NCA models. We demonstrate that no single neural network architecture combined with a loss function is best suited for all datasets, although VGG-11 may be the most robust first choice. Our results highlight the need for broader experimentation and exploration of loss functions and neural network architecture for the more challenging task, over classical benchmarks, of wildlife re-identification.
ISSN: 1424-8220