Multi-Style Unsupervised Image Synthesis Using Generative Adversarial Nets

Unsupervised cross-domain image-to-image translation is a very active topic in computer vision and graphics. This task has two challenges: 1) lack of paired training data and 2) numerous possible outputs from a single image. The existing methods rely on either paired data or perform one-to-one trans...

Full description

Bibliographic Details
Main Authors: Guoyun Lv, Syed Muhammad Israr, Shengyong Qi
Format: Article
Language:English
Published: IEEE 2021-01-01
Series:IEEE Access
Subjects:
Online Access:https://ieeexplore.ieee.org/document/9448164/
Description
Summary:Unsupervised cross-domain image-to-image translation is a very active topic in computer vision and graphics. This task has two challenges: 1) lack of paired training data and 2) numerous possible outputs from a single image. The existing methods rely on either paired data or perform one-to-one translation. A novel Multi-Style Unsupervised image synthesis model using Generative Adversarial Nets (MSU-GAN) is proposed in this paper to overcome these disadvantages. Firstly, the encoder-decoder structure is used to map the image to domain-shared content features space and domain-specific style features space. Secondly, to translate an image into another domain, the content code and the style code are combined to synthesize the resulting image. Finally, the bidirectional cycle-consistency loss is used for the unpaired training data; the inter-domain adversarial loss and the reconstruction loss are used to ensure the output image’s realism. Simultaneously, MSU-GAN is able to synthesize multi-style images due to disentangled representation. A Multi-Style Unsupervised Feature-Wise image synthesis model using Generative Adversarial Nets (MSU-FW-GAN) based on the MSU-GAN is proposed for the shape variation tasks. There are two different testing strategies, which include random style transfer and style guide transfer. For objective comparison, the proposed model performs well on all evaluation metrics. The random style transfer experiment results show that compared with CycleGAN on the photo2portraits dataset, MSU-FW-GAN FID, IS scores dropped by 12.77% and 8.06%. For the summer2winter dataset, MSU-GAN FID and IS scores increased by 24.51% and 3.64%. Qualitative results show that without paired training data, MSU-GAN and MSU-FW-GAN can synthesize multi-style and better realistic images on various tasks.
ISSN:2169-3536