Fast Algorithms for Block Matching, Nearest Neighbor Search, and DNA Sequence Search

Doctoral dissertation === National Taiwan University === Graduate Institute of Computer Science and Information Engineering === 89 === Template matching has been widely used in image and video compression, visual tracking, stereo vision, pattern classification, object recognition, and information retrieval in database systems. Among the major difficulties of template matching is its...

Bibliographic Details
Main Authors: Yong-Sheng Chen, 陳永昇
Other Authors: Yi-Ping Hung
Format: Others
Language: en_US
Published: 2001
Online Access: http://ndltd.ncl.edu.tw/handle/14512640917239947049
id ndltd-TW-089NTU00392011
record_format oai_dc
collection NDLTD
language en_US
format Others
sources NDLTD
description Doctoral dissertation === National Taiwan University === Graduate Institute of Computer Science and Information Engineering === 89 === Template matching is widely used in image and video compression, visual tracking, stereo vision, pattern classification, object recognition, and information retrieval in database systems. One of its major difficulties is the high computational cost of processing large amounts of data. In this thesis we propose techniques that greatly improve the computational efficiency of template matching while still guaranteeing the optimal search result. These techniques are applied to speed up block matching, nearest neighbor search, and DNA sequence database search. The key idea behind the speedup is the use of distance lower bounds. Our goal is to find, within a search range, the object with the minimum distance to the query object. The calculation of a distance can therefore be skipped if any of its lower bounds is larger than the global minimum distance. Because the lower bounds used in this work cost less to compute than the distance itself, the overall process is accelerated. Moreover, the winner-update search strategy is used to reduce the number of distance lower bounds actually calculated, and several data transformation techniques are adopted to tighten the lower bounds, which yields further speedup. For the block matching application in video compression and visual tracking, we propose a new fast algorithm based on the winner-update search strategy, which uses an ascending list of lower bounds of the matching error to determine the winner. At each search position, the costly computation of the matching error can be avoided when there exists a lower bound larger than the global minimum matching error.
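The winner-update idea described above can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the function name `winner_update_search`, the `chunk` parameter, and the choice of partial sums of absolute differences as the ascending lower bound list are all assumptions made for illustration (the abstract does not say how its lower bounds are constructed). The candidate with the smallest current bound (the "winner") is the only one whose bound is refined; once the winner's bound equals its full distance, no other candidate can beat it.

```python
import heapq


def winner_update_search(query, candidates, chunk=4):
    """Find the candidate with the minimum L1 distance to `query`.

    Illustrative assumption: the ascending lower bound list for a candidate
    is the running sum of absolute differences over its first 0, chunk,
    2*chunk, ... elements. Each partial sum never exceeds the full distance.

    Returns (index, distance, number_of_absolute_differences_computed).
    """
    n = len(query)
    # Heap entries: (current lower bound, candidate index, elements consumed).
    heap = [(0, i, 0) for i in range(len(candidates))]
    heapq.heapify(heap)
    evaluated = 0
    while heap:
        bound, i, used = heapq.heappop(heap)  # the current winner
        if used == n:
            # The winner's bound is its exact distance, and every other
            # candidate has a lower bound that is at least as large.
            return i, bound, evaluated
        end = min(used + chunk, n)
        cand = candidates[i]
        # Refine only the winner's bound to the next level of its list.
        bound += sum(abs(query[k] - cand[k]) for k in range(used, end))
        evaluated += end - used
        heapq.heappush(heap, (bound, i, end))
    raise ValueError("no candidates given")
```

With query `[1, 2, 3, 4, 5, 6, 7, 8]` and three candidates of which one differs from the query in a single element, the sketch returns that candidate after computing 16 absolute differences instead of the 24 an exhaustive search needs; poor candidates are abandoned after their first partial sum.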
The proposed algorithm significantly speeds up block matching because (1) the lower bound we use costs less to compute than the matching error itself; (2) an element in the ascending lower bound list is calculated only when its preceding element is smaller than the minimum matching error computed so far; and (3) for many search positions, only the first few lower bounds in the list need to be calculated. Our experiments show that, when the algorithm is applied to motion vector estimation for several widely used test videos, 92% to 98% of the operations can be saved. Moreover, we apply the proposed block matching algorithm to a video-based face/eye tracking system; in our experiments, the face and eye positions of the user are obtained at the video frame rate. We also propose in this thesis a fast and versatile algorithm that performs a variety of nearest neighbor searches very efficiently. At the preprocessing stage, the algorithm constructs a lower bound tree (LB-tree) by agglomeratively clustering all the sample points. Given a query point, a lower bound of its distance to each sample point can be calculated using the internal nodes of the LB-tree. Distance calculations from the query point to many sample points can be avoided if their less expensive lower bounds are larger than the minimum distance. To reduce the number of lower bounds actually calculated, the winner-update search strategy is used for tree traversal. For further efficiency, a data transformation can be applied to the sample and query points. In addition to finding the nearest neighbor, the proposed algorithm can also (i) provide the k nearest neighbors progressively; (ii) find the nearest neighbors within a specified distance threshold; and (iii) identify neighbors close to the nearest neighbor.
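To make the progressive k-nearest-neighbor behavior concrete, here is a rough Python sketch under stated assumptions. It does not build the thesis's LB-tree: it uses a single level of clusters formed from consecutive chunks of the sample list, and the lower bound is the standard triangle-inequality bound (distance to the cluster center minus the cluster radius), swapped in because the abstract does not define the LB-tree's bound. The names `build_clusters` and `neighbors_in_order` are hypothetical.

```python
import heapq
import math


def build_clusters(points, size):
    """Group points into consecutive chunks (an illustrative stand-in for
    agglomerative clustering) and record each chunk's center and radius."""
    clusters = []
    dim = len(points[0])
    for start in range(0, len(points), size):
        members = list(range(start, min(start + size, len(points))))
        center = tuple(
            sum(points[i][d] for i in members) / len(members) for d in range(dim)
        )
        radius = max(math.dist(center, points[i]) for i in members)
        clusters.append((center, radius, members))
    return clusters


def neighbors_in_order(query, points, clusters):
    """Yield (point index, distance) in order of increasing distance.

    Clusters enter the heap keyed by a lower bound on the distance to any
    member; points enter keyed by their exact distance. A cluster's members
    are examined only when its bound becomes the smallest key in the heap.
    """
    heap = []
    for cid, (center, radius, _) in enumerate(clusters):
        # Triangle inequality: d(q, p) >= d(q, center) - radius for members p.
        bound = max(0.0, math.dist(query, center) - radius)
        heap.append((bound, 0, cid))  # kind 0 marks a cluster
    heapq.heapify(heap)
    while heap:
        key, kind, ident = heapq.heappop(heap)
        if kind == 1:
            yield ident, key  # exact distance; nothing left can be closer
        else:
            for i in clusters[ident][2]:
                heapq.heappush(heap, (math.dist(query, points[i]), 1, i))
```

Because the function is a generator, a caller can take the first k results with `itertools.islice` and stop, or keep consuming until the distance exceeds a threshold, which mirrors capabilities (i) and (ii) listed above.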
Our experiments show that the proposed algorithm saves substantial computation, particularly when the distance from the query point to its nearest neighbor is very small compared with its distance to most other samples (which is the case for many object recognition problems). When applied to the real database used in Nayar's 100-object recognition system, the proposed algorithm is about one thousand times faster than exhaustive search. This is roughly eighteen times faster than the result attained by Nene and Nayar, whose method is by far the best we know of. In the application to DNA sequence database search, our goal is to find all the sequence segments in the database that are sufficiently similar to the query sequence, as judged against a threshold value. We propose in this thesis a string-to-signal transform that converts a DNA sequence into multi-channel signals. When gaps are not considered, the similarity score between two DNA sequences can be calculated as the sum of absolute differences between their corresponding signals. The fast template matching techniques presented in this thesis can then be applied to greatly speed up the search. Moreover, these techniques guarantee the optimal search: all the sequence segments that are sufficiently similar to the query sequence are found, with none missed.
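One plausible reading of the string-to-signal transform can be sketched in Python as follows. The abstract does not specify the transform, so the four-channel one-hot encoding (one channel per base A, C, G, T) is an assumption, as are the names `to_signal`, `sad_within`, and `find_similar`. Under this encoding, each mismatched position contributes exactly 2 to the sum of absolute differences, so a mismatch threshold maps directly to a limit on that sum, and a running partial sum is a lower bound that allows a segment to be abandoned early without missing any hit.

```python
CHANNELS = "ACGT"  # assumed channel order; inputs are assumed to contain only these bases


def to_signal(seq):
    """Encode a DNA string as four 0/1 channels (assumed one-hot transform)."""
    return [[1 if base == ch else 0 for base in seq] for ch in CHANNELS]


def sad_within(sig_q, sig_db, start, length, limit):
    """Sum of absolute differences between the query signal and the database
    segment beginning at `start`. Returns None as soon as the partial sum,
    which is a lower bound on the full sum, exceeds `limit`."""
    total = 0
    for pos in range(length):
        for c in range(len(CHANNELS)):
            total += abs(sig_q[c][pos] - sig_db[c][start + pos])
        if total > limit:
            return None
    return total


def find_similar(query, database, max_mismatches):
    """Return (start, mismatches) for every ungapped database segment that
    differs from `query` in at most `max_mismatches` positions."""
    sig_q, sig_db = to_signal(query), to_signal(database)
    limit = 2 * max_mismatches  # one mismatch changes two channels by 1 each
    hits = []
    for start in range(len(database) - len(query) + 1):
        total = sad_within(sig_q, sig_db, start, len(query), limit)
        if total is not None:
            hits.append((start, total // 2))
    return hits
```

For example, `find_similar("ACGT", "ACGTTCGTACGA", 1)` returns `[(0, 0), (4, 1), (8, 1)]`: an exact match at offset 0 and two segments with one mismatch each. Every offset is examined, so no qualifying segment is skipped; the early exit only shortens work on segments that cannot qualify.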
author2 Yi-Ping Hung
author_facet Yi-Ping Hung
Yong-Sheng Chen
陳永昇
author Yong-Sheng Chen
陳永昇
title Fast Algorithms for Block Matching, Nearest Neighbor Search, and DNA Sequence Search
publishDate 2001
url http://ndltd.ncl.edu.tw/handle/14512640917239947049
spelling ndltd-TW-089NTU00392011 2016-07-04T04:17:05Z http://ndltd.ncl.edu.tw/handle/14512640917239947049 Fast Algorithms for Block Matching, Nearest Neighbor Search, and DNA Sequence Search 區塊對應、最近鄰居搜尋、與DNA序列搜尋之快速演算法 Yong-Sheng Chen 陳永昇 Doctoral dissertation, National Taiwan University, Graduate Institute of Computer Science and Information Engineering, 89 Yi-Ping Hung Chiou-Shann Fuh 洪一平 傅楸善 2001 學位論文 ; thesis 156 en_US