Foreground Objects Detection by U-Net with Multiple Difference Images

In video surveillance, robust detection of foreground objects is usually done by subtracting a background model from the current image. Most traditional approaches use a statistical method to model the background image. Recently, deep learning has also been widely used to detect foreground objects i...

Bibliographic Details
Main Authors: Jae-Yeul Kim, Jong-Eun Ha
Format: Article
Language: English
Published: MDPI AG 2021-02-01
Series: Applied Sciences
Subjects: visual surveillance; deep learning; object detection
Online Access: https://www.mdpi.com/2076-3417/11/4/1807
id doaj-9f5bfc7da5864194902580bb9aaf0329
record_format Article
spelling doaj-9f5bfc7da5864194902580bb9aaf0329
2021-02-19T00:04:17Z
eng
MDPI AG
Applied Sciences
2076-3417
2021-02-01
11
1807
1807
10.3390/app11041807
Foreground Objects Detection by U-Net with Multiple Difference Images
Jae-Yeul Kim (Graduate School of Automotive Engineering, Seoul National University of Science and Technology, Seoul 01811, Korea)
Jong-Eun Ha (Department of Mechanical and Automotive Engineering, Seoul National University of Science and Technology, Seoul 01811, Korea)
https://www.mdpi.com/2076-3417/11/4/1807
visual surveillance
deep learning
object detection
collection DOAJ
language English
format Article
sources DOAJ
author Jae-Yeul Kim
Jong-Eun Ha
spellingShingle Jae-Yeul Kim
Jong-Eun Ha
Foreground Objects Detection by U-Net with Multiple Difference Images
Applied Sciences
visual surveillance
deep learning
object detection
author_facet Jae-Yeul Kim
Jong-Eun Ha
author_sort Jae-Yeul Kim
title Foreground Objects Detection by U-Net with Multiple Difference Images
title_short Foreground Objects Detection by U-Net with Multiple Difference Images
title_full Foreground Objects Detection by U-Net with Multiple Difference Images
title_fullStr Foreground Objects Detection by U-Net with Multiple Difference Images
title_full_unstemmed Foreground Objects Detection by U-Net with Multiple Difference Images
title_sort foreground objects detection by u-net with multiple difference images
publisher MDPI AG
series Applied Sciences
issn 2076-3417
publishDate 2021-02-01
description In video surveillance, robust detection of foreground objects is usually done by subtracting a background model from the current image. Most traditional approaches use a statistical method to model the background image. Recently, deep learning has also been widely used to detect foreground objects in video surveillance, and it shows dramatic improvement over the traditional approaches. Deep learning-based detectors are trained through supervised learning, which requires training samples with pixel-level labels. Preparing such samples takes a huge amount of time and is costly, whereas traditional algorithms operate unsupervised and do not require training samples. Additionally, deep learning-based algorithms lack generalization power: they operate well on scenes that are similar to the training conditions, but not on scenes that deviate from them. In this paper, we present a new method to detect foreground objects in video surveillance using multiple difference images as the input of convolutional neural networks, which guarantees improved generalization power compared to current deep learning-based methods. First, we adjust U-Net to use multiple difference images as input. Second, we show that training on all scenes in the CDnet 2014 dataset can improve the generalization power. Hyper-parameters such as the number of difference images and the interval between images in the difference image computation are chosen by analyzing experimental results. We demonstrate that the proposed algorithm achieves improved performance on scenes that are not used in training compared to state-of-the-art deep learning and traditional unsupervised algorithms. Diverse experiments using various open datasets and real images show the feasibility of the proposed method.
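As a rough illustration of the input described in the abstract, the sketch below stacks several difference images into one multi-channel array of the kind a network such as U-Net could take as input. The helper name `stack_difference_images`, the use of absolute differences against earlier frames, and the default values for the number of difference images and the frame interval are all assumptions made for illustration; this record does not give the paper's actual formulation or its chosen hyper-parameters.

```python
import numpy as np


def stack_difference_images(frames, t, num_diffs=3, interval=2):
    """Hypothetical helper (not from the paper): return a (num_diffs, H, W)
    array holding |frame[t] - frame[t - k * interval]| for k = 1..num_diffs."""
    frames = np.asarray(frames, dtype=np.float32)
    if t - num_diffs * interval < 0:
        raise ValueError("not enough earlier frames for the requested differences")
    current = frames[t]
    # One difference image per earlier frame, spaced `interval` frames apart.
    diffs = [np.abs(current - frames[t - k * interval])
             for k in range(1, num_diffs + 1)]
    # Stack along a new leading axis so each difference image is one channel.
    return np.stack(diffs, axis=0)


# Toy grayscale "video": 8 frames of 4x4 pixels, with one pixel changing
# in the last frame only.
video = np.zeros((8, 4, 4), dtype=np.float32)
video[7, 1, 1] = 1.0
x = stack_difference_images(video, t=7, num_diffs=3, interval=2)
print(x.shape)
```

Here `num_diffs` and `interval` stand in for the two hyper-parameters the abstract mentions (the number of difference images and the interval between images); the values used are placeholders, not the ones selected in the paper.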
topic visual surveillance
deep learning
object detection
url https://www.mdpi.com/2076-3417/11/4/1807
work_keys_str_mv AT jaeyeulkim foregroundobjectsdetectionbyunetwithmultipledifferenceimages
AT jongeunha foregroundobjectsdetectionbyunetwithmultipledifferenceimages
_version_ 1724261983505612800