A Deep Document Semantic Segmentation Network with Edge Supervision and Multi-Task Learning Mechanism
Master's === National Taiwan University of Science and Technology === Department of Electrical Engineering === 107 === Semantic segmentation is a common problem in computer vision and deep learning. Unlike image classification and object detection, semantic segmentation assigns a category label to every pixel, thereby giving a better...
Main Authors: | Hao-Hsuan Lee, 李皓軒 |
---|---|
Other Authors: | Jing-Ming Guo |
Format: | Others |
Language: | zh-TW |
Published: | 2019 |
Online Access: | http://ndltd.ncl.edu.tw/handle/f5uh34 |
id |
ndltd-TW-107NTUS5442128 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-TW-107NTUS5442128 2019-10-24T05:20:29Z http://ndltd.ncl.edu.tw/handle/f5uh34 A Deep Document Semantic Segmentation Network with Edge Supervision and Multi-Task Learning Mechanism 基於多任務學習與邊緣監督機制於文件頁面語義分割網路之應用 Hao-Hsuan Lee 李皓軒 Master's National Taiwan University of Science and Technology Department of Electrical Engineering 107 Semantic segmentation is a common problem in computer vision and deep learning. Unlike image classification and object detection, semantic segmentation assigns a category label to every pixel, thereby giving a better understanding of the scene. Consequently, it plays an essential role in high-level applications such as medical image retrieval, satellite imagery analytics, autonomous driving, and document page segmentation, to name but a few. Semantic segmentation networks have achieved stable and excellent performance on natural image segmentation in recent years. However, because document images differ greatly from natural images in feature structure and object categories, existing semantic segmentation networks still have room for improvement in document page segmentation. For example, since document images contain fewer object types than natural images and their object regions are mostly large, edges can be expected to provide useful information that assists the network's learning process. Building on an existing, stable semantic segmentation network, this thesis uses edge feature information to assist feature learning, with the aim of improving performance near object edges. We also propose the Densely Joint Pyramid Module, which enhances the feature extraction stage to obtain multi-scale, dense features. As a result, overall performance in document page segmentation improves. Jing-Ming Guo 郭景明 2019 學位論文 ; thesis 106 zh-TW |
collection |
NDLTD |
language |
zh-TW |
format |
Others |
sources |
NDLTD |
description |
Master's === National Taiwan University of Science and Technology === Department of Electrical Engineering === 107 === Semantic segmentation is a common problem in computer vision and deep learning. Unlike image classification and object detection, semantic segmentation assigns a category label to every pixel, thereby giving a better understanding of the scene. Consequently, it plays an essential role in high-level applications such as medical image retrieval, satellite imagery analytics, autonomous driving, and document page segmentation, to name but a few.
Semantic segmentation networks have achieved stable and excellent performance on natural image segmentation in recent years. However, because document images differ greatly from natural images in feature structure and object categories, existing semantic segmentation networks still have room for improvement in document page segmentation. For example, since document images contain fewer object types than natural images and their object regions are mostly large, edges can be expected to provide useful information that assists the network's learning process.
Building on an existing, stable semantic segmentation network, this thesis uses edge feature information to assist feature learning, with the aim of improving performance near object edges. We also propose the Densely Joint Pyramid Module, which enhances the feature extraction stage to obtain multi-scale, dense features. As a result, overall performance in document page segmentation improves.
|
author2 |
Jing-Ming Guo |
author_facet |
Jing-Ming Guo Hao-Hsuan Lee 李皓軒 |
author |
Hao-Hsuan Lee 李皓軒 |
spellingShingle |
Hao-Hsuan Lee 李皓軒 A Deep Document Semantic Segmentation Network with Edge Supervision and Multi-Task Learning Mechanism |
author_sort |
Hao-Hsuan Lee |
title |
A Deep Document Semantic Segmentation Network with Edge Supervision and Multi-Task Learning Mechanism |
publishDate |
2019 |
url |
http://ndltd.ncl.edu.tw/handle/f5uh34 |