Disentangled Representation Learning in Real-World Image Datasets via Image Segmentation Prior

We propose a novel method that can learn easy-to-interpret latent representations in real-world image datasets using a VAE-based model by splitting an image into several disjoint regions. Our method performs object-wise disentanglement by exploiting image segmentation and alpha compositing. With rem...


Bibliographic Details
Main Authors:	Nao Nakagawa, Ren Togo, Takahiro Ogawa, Miki Haseyama
Format:	Article
Language:	English
Published:	IEEE 2021-01-01
Series:	IEEE Access
Subjects:	Alpha blend; disentanglement; image segmentation; real-world image; representation learning
Online Access:	https://ieeexplore.ieee.org/document/9502079/

id	doaj-2087e42ef295476c88fc687fd9c69887
record_format	Article
spelling	doaj-2087e42ef295476c88fc687fd9c69887; 2021-08-12T23:00:11Z; eng; IEEE; IEEE Access; ISSN 2169-3536; 2021-01-01; vol. 9, pp. 110880-110888; DOI 10.1109/ACCESS.2021.3101229; document 9502079
	Title: Disentangled Representation Learning in Real-World Image Datasets via Image Segmentation Prior
	Nao Nakagawa (https://orcid.org/0000-0001-9260-0828), Graduate School of Information Science and Technology, Hokkaido University, Sapporo, Japan
	Ren Togo (https://orcid.org/0000-0002-4474-3995), Education and Research Center for Mathematical and Data Science, Hokkaido University, Sapporo, Japan
	Takahiro Ogawa (https://orcid.org/0000-0001-5332-8112), Faculty of Information Science and Technology, Hokkaido University, Sapporo, Japan
	Miki Haseyama (https://orcid.org/0000-0003-1496-1761), Faculty of Information Science and Technology, Hokkaido University, Sapporo, Japan
	URL: https://ieeexplore.ieee.org/document/9502079/
	Keywords: Alpha blend; disentanglement; image segmentation; real-world image; representation learning
collection	DOAJ
language	English
format	Article
sources	DOAJ
author	Nao Nakagawa, Ren Togo, Takahiro Ogawa, Miki Haseyama
author_sort	Nao Nakagawa
title	Disentangled Representation Learning in Real-World Image Datasets via Image Segmentation Prior
publisher	IEEE
series	IEEE Access
issn	2169-3536
publishDate	2021-01-01
description	We propose a novel method that can learn easy-to-interpret latent representations in real-world image datasets using a VAE-based model by splitting an image into several disjoint regions. Our method performs object-wise disentanglement by exploiting image segmentation and alpha compositing. With remarkable results obtained by unsupervised disentanglement methods for toy datasets, recent studies have tackled challenging disentanglement for real-world image datasets. However, these methods involve deviations from the standard VAE architecture, which has favorable disentanglement properties. Thus, for disentanglement in images of real-world image datasets with preservation of the VAE backbone, we designed an encoder and a decoder that embed an image into disjoint sets of latent variables corresponding to objects. The encoder includes a pre-trained image segmentation network, which allows our model to focus only on representation learning while adopting image segmentation as an inductive bias. Evaluations using real-world image datasets, CelebA and Stanford Cars, showed that our method achieves improved disentanglement and transferability.
topic	Alpha blend; disentanglement; image segmentation; real-world image; representation learning
url	https://ieeexplore.ieee.org/document/9502079/


Similar Items