Multimedia Data Mining Techniques for Semantic Annotation, Retrieval and Recommendation


Bibliographic Details
Main Authors: Ja-Hwung Su, 蘇家輝
Other Authors: Vincent S. Tseng
Format: Others
Language: en_US
Published: 2010
Online Access:http://ndltd.ncl.edu.tw/handle/05323447331505634288
id ndltd-TW-098NCKU5392008
record_format oai_dc
spelling ndltd-TW-098NCKU53920082015-10-13T18:25:53Z http://ndltd.ncl.edu.tw/handle/05323447331505634288 Multimedia Data Mining Techniques for Semantic Annotation, Retrieval and Recommendation 多媒體語意註解、擷取與推薦之資料探勘技術 Ja-Hwung Su 蘇家輝 PhD === National Cheng Kung University === Department of Computer Science and Information Engineering === 98 === In recent years, advances in digital capture technologies have led to rapid growth of multimedia data in various formats, such as images, music and video. Moreover, modern telecommunication systems have made multimedia data both widespread and extremely voluminous. How to conceptualize, retrieve and recommend items from such massive multimedia resources has therefore become an attractive and challenging issue. To address it, the primary aim of this dissertation is to develop effective multimedia data mining techniques for discovering valuable knowledge from multimedia data, so as to achieve high-quality multimedia annotation, retrieval and recommendation. A considerable number of studies on multimedia annotation are hampered by the diverse relationships between human concepts and visual content, namely diverse visual-concept associations: a set of different concepts can share very similar visual features. To alleviate this problem, this dissertation presents integrated mining of visual, speech and text features for semantic image/video annotation. For image annotation, we propose a visual-based annotation method that disambiguates the image sense when several senses are shared by a number of images. Additionally, a textual-based annotation method, which discovers affinities between image captions and web-page keywords, is proposed to compensate for the limitations of visual-based annotation.
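The abstract does not specify the annotation algorithm, but the idea of assigning concepts to an unlabeled image from visually similar labeled images can be illustrated with a minimal neighbor-voting sketch. The feature vectors, corpus layout and `k` parameter below are illustrative assumptions, not the dissertation's actual model.

```python
from collections import Counter

def annotate_by_visual_vote(query_vec, labeled_images, k=3):
    """Rank candidate concepts for a query image by letting its k
    visually nearest labeled images vote for their concepts.
    Illustrative sketch only, not the dissertation's method."""
    def dist(a, b):
        # Euclidean distance between fixed-length feature vectors.
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    neighbors = sorted(labeled_images,
                       key=lambda img: dist(query_vec, img["features"]))[:k]
    votes = Counter()
    for img in neighbors:
        for concept in img["concepts"]:
            votes[concept] += 1
    # Concepts ordered by vote count, most supported first.
    return [c for c, _ in votes.most_common()]

# Hypothetical toy corpus: 2-D visual features with concept labels.
corpus = [
    {"features": [0.9, 0.1], "concepts": ["beach", "sea"]},
    {"features": [0.8, 0.2], "concepts": ["sea", "sky"]},
    {"features": [0.1, 0.9], "concepts": ["forest"]},
]
print(annotate_by_visual_vote([0.88, 0.12], corpus, k=2))
# → ['sea', 'beach', 'sky']
```

In a real system the feature vectors would come from extracted visual descriptors (color, texture, etc.) rather than hand-written pairs, and the vote could be distance-weighted.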
For video annotation, by taking temporal continuity into account, frequent visual, textual and visual-textual patterns are mined to support semantic video annotation through the proposed annotation models. Building on image annotation, the user's interest and visual images can be bridged semantically for textual-based image retrieval. However, little recent work has addressed conceptual retrieval from textual annotations to visual images. To this end, the second aim of this dissertation is to retrieve images through the proposed image annotation, concept matching and fuzzy ranking techniques. Beyond image retrieval, textual-based video retrieval also falls short of user satisfaction because of diverse query concepts. To remedy this weakness, we propose an innovative method that mines temporal patterns from video content to support content-based video retrieval. On the basis of the discovered temporal visual patterns, an efficient indexing technique and an effective sequence matching technique are integrated to reduce computation cost and raise retrieval accuracy, respectively. In contrast to passive image/video retrieval, music recommendation is the final focus of this dissertation: actively providing users with preferred music pieces. In this work, we design a novel music recommender that integrates music content mining with collaborative filtering to help users find what they prefer within a huge collection of music. By discovering preferable perceptual patterns in music pieces, a user's listening interest and the music itself can be associated effectively, and the traditional rating diversity problem can be alleviated.
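The sequence-matching idea behind content-based video retrieval can be sketched as follows: each video is reduced to a sequence of quantized shot symbols, and a query clip is matched against the database by longest-common-subsequence similarity. The symbol alphabet, the normalization and the toy database are illustrative assumptions; the dissertation's actual indexing and matching techniques are not specified here.

```python
def lcs_len(a, b):
    """Length of the longest common subsequence of two symbol sequences
    (standard dynamic-programming formulation)."""
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a, 1):
        for j, y in enumerate(b, 1):
            dp[i][j] = dp[i - 1][j - 1] + 1 if x == y else max(dp[i - 1][j], dp[i][j - 1])
    return dp[-1][-1]

def rank_videos(query_seq, database):
    """Rank database videos by normalized LCS similarity to the query.
    Each video is a sequence of quantized shot symbols."""
    scored = [(lcs_len(query_seq, seq) / max(len(query_seq), len(seq)), vid)
              for vid, seq in database.items()]
    return [vid for _, vid in sorted(scored, reverse=True)]

# Hypothetical database: one symbol per shot, letters standing for
# clusters of visual features.
db = {
    "news_clip":   "ABACAD",
    "sports_clip": "XYXZXY",
    "mixed_clip":  "XBAYD",
}
print(rank_videos("ABAD", db))
# → ['news_clip', 'mixed_clip', 'sports_clip']
```

A production system would add an index over the symbol sequences so that only promising candidates reach the quadratic-time matcher, mirroring the cost/accuracy split the abstract describes.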
For each proposed approach above, the experimental results in this dissertation reveal that the proposed multimedia data mining methods yield better multimedia annotation, retrieval and recommendation, and are applicable to real multimedia applications such as mobile multimedia retrieval and recommendation. Vincent S. Tseng 曾新穆 2010 學位論文 ; thesis 139 en_US
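The hybrid recommendation idea mentioned in the abstract, combining collaborative filtering with content-derived evidence, can be sketched as a weighted blend of two score sources. The score dictionaries, scale and `alpha` weight below are illustrative assumptions, not the recommender actually built in the dissertation.

```python
def blend(cf_scores, content_scores, alpha=0.6):
    """Rank tracks by a weighted mix of a collaborative-filtering score
    and a content-based score; alpha weights the collaborative side.
    Illustrative sketch of a hybrid recommender, not the actual system."""
    tracks = set(cf_scores) | set(content_scores)
    mixed = {t: alpha * cf_scores.get(t, 0.0) + (1 - alpha) * content_scores.get(t, 0.0)
             for t in tracks}
    # Best-scoring tracks first.
    return sorted(mixed, key=mixed.get, reverse=True)

# Hypothetical per-track scores in [0, 1]: cf from other users' ratings,
# cb from audio-content similarity to tracks the user already likes.
cf = {"track_a": 0.9, "track_b": 0.2}
cb = {"track_b": 0.8, "track_c": 0.6}
print(blend(cf, cb))
# → ['track_a', 'track_b', 'track_c']
```

Blending the two sources is one common way to soften the rating-sparsity/diversity problem the abstract mentions: a track with few ratings can still surface through its content score.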
collection NDLTD
language en_US
format Others
sources NDLTD
author2 Vincent S. Tseng
author_facet Vincent S. Tseng
Ja-HwungSu
蘇家輝
author Ja-HwungSu
蘇家輝
spellingShingle Ja-HwungSu
蘇家輝
Multimedia Data Mining Techniques for Semantic Annotation, Retrieval and Recommendation
author_sort Ja-HwungSu
title Multimedia Data Mining Techniques for Semantic Annotation, Retrieval and Recommendation
title_short Multimedia Data Mining Techniques for Semantic Annotation, Retrieval and Recommendation
title_full Multimedia Data Mining Techniques for Semantic Annotation, Retrieval and Recommendation
title_fullStr Multimedia Data Mining Techniques for Semantic Annotation, Retrieval and Recommendation
title_full_unstemmed Multimedia Data Mining Techniques for Semantic Annotation, Retrieval and Recommendation
title_sort multimedia data mining techniques for semantic annotation, retrieval and recommendation
publishDate 2010
url http://ndltd.ncl.edu.tw/handle/05323447331505634288
work_keys_str_mv AT jahwungsu multimediadataminingtechniquesforsemanticannotationretrievalandrecommendation
AT sūjiāhuī multimediadataminingtechniquesforsemanticannotationretrievalandrecommendation
AT jahwungsu duōméitǐyǔyìzhùjiěxiéqǔyǔtuījiànzhīzīliàotànkānjìshù
AT sūjiāhuī duōméitǐyǔyìzhùjiěxiéqǔyǔtuījiànzhīzīliàotànkānjìshù
_version_ 1718033537305673728