A Deep Document Semantic Segmentation Network with Edge Supervision and Multi-Task Learning Mechanism

Master's === National Taiwan University of Science and Technology === Department of Electrical Engineering === 107 === Semantic segmentation is a widely studied problem in computer vision and deep learning. Unlike image classification and object detection, semantic segmentation assigns a category label to every pixel, thereby getting a better...

Bibliographic Details
Main Authors: Hao-Hsuan Lee, 李皓軒
Other Authors: Jing-Ming Guo
Format: Others
Language: zh-TW
Published: 2019
Online Access: http://ndltd.ncl.edu.tw/handle/f5uh34
id ndltd-TW-107NTUS5442128
record_format oai_dc
spelling ndltd-TW-107NTUS5442128 2019-10-24T05:20:29Z http://ndltd.ncl.edu.tw/handle/f5uh34 A Deep Document Semantic Segmentation Network with Edge Supervision and Multi-Task Learning Mechanism 基於多任務學習與邊緣監督機制於文件頁面語義分割網路之應用 Hao-Hsuan Lee 李皓軒 Master's National Taiwan University of Science and Technology Department of Electrical Engineering 107 Jing-Ming Guo 郭景明 2019 thesis 106 zh-TW
collection NDLTD
language zh-TW
format Others
sources NDLTD
description Master's === National Taiwan University of Science and Technology === Department of Electrical Engineering === 107 === Semantic segmentation is a widely studied problem in computer vision and deep learning. Unlike image classification and object detection, semantic segmentation assigns a category label to every pixel, which yields a finer-grained understanding of the scene. It therefore plays an essential role in high-level applications such as medical image retrieval, satellite imagery analysis, autonomous driving, and document page segmentation, to name but a few. In recent years, semantic segmentation networks have achieved stable and excellent performance on natural image segmentation. However, because document images differ greatly from natural images in feature structure and object categories, existing semantic segmentation networks still leave room for improvement in document page segmentation. For example, since document images contain fewer object types than natural images and their object regions are mostly large, edges can be expected to provide useful information that assists network learning. Building on an existing, stable semantic segmentation network, this thesis uses edge feature information to assist feature learning, with the aim of improving performance near object edges. We also propose the Densely Joint Pyramid Module, which enhances the feature extraction stage to obtain multi-scale, dense features. As a result, the overall performance on document page segmentation improves.
author2 Jing-Ming Guo
author_facet Jing-Ming Guo
Hao-Hsuan Lee
李皓軒
author Hao-Hsuan Lee
李皓軒
author_sort Hao-Hsuan Lee
title A Deep Document Semantic Segmentation Network with Edge Supervision and Multi-Task Learning Mechanism
publishDate 2019
url http://ndltd.ncl.edu.tw/handle/f5uh34
_version_ 1719277535958663168
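
The abstract describes using edge information as an auxiliary training signal alongside the segmentation task. The record does not give the thesis's actual formulation, so the following is only a minimal sketch of one common way such a multi-task objective could be set up: edge targets are derived from the segmentation labels, and a binary edge loss is added to the pixel-wise segmentation loss. All function names and the `edge_weight` parameter are hypothetical and are not taken from the thesis.

```python
import math

def edge_map_from_labels(labels):
    """Mark pixels on either side of a class boundary (4-neighbourhood).

    Assumption: edge ground truth is derived from the label map itself.
    """
    h, w = len(labels), len(labels[0])
    edges = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if x + 1 < w and labels[y][x] != labels[y][x + 1]:
                edges[y][x] = edges[y][x + 1] = 1
            if y + 1 < h and labels[y][x] != labels[y + 1][x]:
                edges[y][x] = edges[y + 1][x] = 1
    return edges

def segmentation_loss(seg_probs, labels, eps=1e-7):
    """Mean pixel-wise cross-entropy; seg_probs[y][x] is a list of class probabilities."""
    total, count = 0.0, 0
    for prob_row, label_row in zip(seg_probs, labels):
        for probs, label in zip(prob_row, label_row):
            total -= math.log(probs[label] + eps)
            count += 1
    return total / count

def edge_loss(edge_probs, edge_target, eps=1e-7):
    """Mean binary cross-entropy between predicted edge probabilities and the edge map."""
    total, count = 0.0, 0
    for prob_row, target_row in zip(edge_probs, edge_target):
        for p, t in zip(prob_row, target_row):
            p = min(max(p, eps), 1.0 - eps)  # clip to keep log() finite
            total -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
            count += 1
    return total / count

def multi_task_loss(seg_probs, edge_probs, labels, edge_weight=0.5):
    """Segmentation loss plus a weighted edge loss (hypothetical weighting)."""
    edge_target = edge_map_from_labels(labels)
    return segmentation_loss(seg_probs, labels) + edge_weight * edge_loss(edge_probs, edge_target)

# Toy 2x4 page with two regions (class 0 on the left, class 1 on the right).
labels = [[0, 0, 1, 1],
          [0, 0, 1, 1]]
uniform_seg = [[[0.5, 0.5] for _ in row] for row in labels]
uniform_edge = [[0.5 for _ in row] for row in labels]
loss = multi_task_loss(uniform_seg, uniform_edge, labels)
```

In this sketch the edge term only adds gradient signal near region boundaries, which is where the abstract expects the improvement; how the thesis weights or structures the two tasks is not stated in this record.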