Multi-Modal Learning over User-Contributed Content from Cross-Domain Social Media

PhD === National Taiwan University (國立臺灣大學) === Graduate Institute of Networking and Multimedia (資訊網路與多媒體研究所) === 104 === Social media have changed the world and our lives. Every day, millions of media items are uploaded to social-sharing websites. The goal of the research is to discover and summarize large amounts of media data from the emerging social media into information...

Bibliographic Details
Main Authors: Wen-Yu Lee, 李文瑜
Other Authors: 徐宏民
Format: Others
Language: en_US
Published: 2016
Online Access: http://ndltd.ncl.edu.tw/handle/98114821102121233466
id ndltd-TW-104NTU05641016
record_format oai_dc
spelling ndltd-TW-104NTU05641016 2017-04-16T04:35:13Z http://ndltd.ncl.edu.tw/handle/98114821102121233466 Multi-Modal Learning over User-Contributed Content from Cross-Domain Social Media 以複合式模型學習法探究多社群網路媒體之使用者資訊 Wen-Yu Lee 李文瑜 博士 國立臺灣大學 資訊網路與多媒體研究所 104 徐宏民 2016 學位論文 ; thesis 90 en_US
collection NDLTD
language en_US
format Others
sources NDLTD
description PhD === National Taiwan University (國立臺灣大學) === Graduate Institute of Networking and Multimedia (資訊網路與多媒體研究所) === 104 === Social media have changed the world and our lives. Every day, millions of media items are uploaded to social-sharing websites. The goal of this research is to discover and summarize the large amounts of media data on emerging social media into information of interest. Our basic idea is to perform multi-modal learning on given data, leveraging user-contributed data from cross-domain social media. Specifically, given a photo, we aim to discover the geographical information, people's descriptions or comments, and events of interest closely related to that photo. This information can then be used for various purposes, such as serving as a real-time guide that improves the quality of tourism for travelers. Accordingly, this dissertation studies the challenges of image location identification, image annotation, and event discovery, and presents promising ways to address them.
For image location identification, most previous work directly integrated the visual features and geo-tags of the given photos. The performance of such approaches, however, can be limited if the photos were taken indoors and/or if they show several buildings in close proximity. As a solution, this dissertation unifies visual features, geo-tags, and check-in data, and further presents an image cluster refinement approach for image location identification.
For image annotation, label propagation is widely used to annotate photos based on similarity graphs of photos, and most previous work focused on single-label propagation. Although multi-label propagation is expected to be more efficient for annotation than running single-label propagation several times, it may increase the computational complexity. Furthermore, image datasets continue to grow, which further increases the problem complexity. As a solution, this dissertation presents a scalable multi-label propagation approach that leverages the power of distributed computing.
For event discovery, most previous work investigated a single media stream. Mining multiple media streams can potentially achieve better performance than mining one stream alone, but can be more challenging. As a solution, this dissertation presents a two-stage framework that combines a flow-based media dataset and a check-in-based media dataset for events-of-interest discovery. Experimental results on real media datasets show the effectiveness of all of the proposed approaches. Finally, this dissertation provides some possible directions for future studies.
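The label propagation over a photo similarity graph mentioned in the description can be illustrated with a small sketch. It assumes the textbook normalized-graph update F <- alpha * S * F + (1 - alpha) * Y, which may differ from the dissertation's own distributed formulation. The similarity weights, the tag names, and the helpers `normalize`, `propagate`, and `top_label` are all hypothetical. The point it shows is that every label column is updated in the same pass over the graph, rather than running one single-label propagation per tag.

```python
import math


def normalize(W):
    """Symmetric normalization S = D^-1/2 W D^-1/2 of a similarity matrix."""
    n = len(W)
    deg = [sum(row) for row in W]
    return [
        [
            W[i][j] / math.sqrt(deg[i] * deg[j]) if deg[i] > 0 and deg[j] > 0 else 0.0
            for j in range(n)
        ]
        for i in range(n)
    ]


def propagate(W, Y, alpha=0.8, iters=50):
    """Iterate F <- alpha * S F + (1 - alpha) * Y for all label columns at once.

    W: n x n symmetric photo-similarity matrix (hypothetical values below).
    Y: n x k initial label matrix, 1.0 where a photo is known to carry a tag.
    """
    S = normalize(W)
    n, k = len(Y), len(Y[0])
    F = [row[:] for row in Y]
    for _ in range(iters):
        F = [
            [
                alpha * sum(S[i][j] * F[j][c] for j in range(n)) + (1 - alpha) * Y[i][c]
                for c in range(k)
            ]
            for i in range(n)
        ]
    return F


def top_label(score_row, labels):
    """Return the tag with the highest propagated score for one photo."""
    best = max(range(len(labels)), key=lambda c: score_row[c])
    return labels[best]


# Toy data (assumed for illustration): photos 0-1 and 2-3 are visually similar,
# photos 1-2 only weakly. Photo 0 is tagged "beach", photo 3 is tagged "temple".
labels = ["beach", "temple"]
W = [
    [0.0, 0.9, 0.0, 0.0],
    [0.9, 0.0, 0.1, 0.0],
    [0.0, 0.1, 0.0, 0.9],
    [0.0, 0.0, 0.9, 0.0],
]
Y = [
    [1.0, 0.0],
    [0.0, 0.0],
    [0.0, 0.0],
    [0.0, 1.0],
]

scores = propagate(W, Y)
suggested = [top_label(row, labels) for row in scores]
```

In this toy graph the untagged photos 1 and 2 inherit the tag of their strongly connected neighbor. A distributed version would presumably partition the rows of S and F across workers, but that part is not sketched here.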
author2 徐宏民
author_facet 徐宏民
Wen-Yu Lee
李文瑜
author Wen-Yu Lee
李文瑜
spellingShingle Wen-Yu Lee
李文瑜
Multi-Modal Learning over User-Contributed Content from Cross-Domain Social Media
author_sort Wen-Yu Lee
title Multi-Modal Learning over User-Contributed Content from Cross-Domain Social Media
title_sort multi-modal learning over user-contributed content from cross-domain social media
publishDate 2016
url http://ndltd.ncl.edu.tw/handle/98114821102121233466