Learning Disentangled Representation of Image and Text Data for E-commerce Products

Master's === National Taiwan University === Graduate Institute of Networking and Multimedia === 107 === Deep learning is good at generating distributed representations, but these representations are hard to interpret. Disentangled representation, by contrast, is a recently discussed concept that features modularity, compactness, and explicitness, and is explainable via the genera...

Bibliographic Details
Main Authors: Zhong-Yu Huang, 黃中余
Other Authors: 林守德
Format: Others
Language: en_US
Published: 2019
Online Access: http://ndltd.ncl.edu.tw/handle/pkywcb
id ndltd-TW-107NTU05641030
record_format oai_dc
spelling ndltd-TW-107NTU05641030 2019-11-16T05:28:00Z http://ndltd.ncl.edu.tw/handle/pkywcb Learning Disentangled Representation of Image and Text Data for E-commerce Products 電商商品圖像與文本的解耦表征之學習 Zhong-Yu Huang 黃中余 Master's National Taiwan University Graduate Institute of Networking and Multimedia 107 林守德 2019 thesis 26 en_US
collection NDLTD
sources NDLTD
description Master's === National Taiwan University === Graduate Institute of Networking and Multimedia === 107 === Deep learning is good at generating distributed representations, but these representations are hard to interpret. Disentangled representation, by contrast, is a recently discussed concept that features modularity, compactness, and explicitness, and can be explained via the generating factors. This thesis uses aligned text and image data of e-commerce products to learn a model that transforms a product title representation into a disentangled one. The disentangled representation is divided into two modules: one encodes the information conveyed by both the title and the image, while the other encodes the remaining information, which cannot be inferred from the image and is known only from the title. We achieve this by injecting variational dropout, which also provides meaningful dropout rates learned from the data. The experiment and evaluation results show that the transformed disentangled representations perform well at calculating the similarity between different product titles. Moreover, different sections of the representation show different patterns on the evaluation tasks, which may be useful for further applications. We also show that the properties of disentanglement are largely satisfied by our learning method.