Simplex Geometry Based Non-negative Blind Source Separation

PhD === National Tsing Hua University === Institute of Communications Engineering === 104 === Non-negative blind source separation (nBSS), the focus of this dissertation, has found many successful applications in science and engineering, such as biomedical imaging, gene expression data analysis, and hyperspectral imaging in remote sensing. In contrast to...

Full description

Bibliographic Details
Main Authors: Lin, Chia-Hsiang, 林家祥
Other Authors: Chi, Chong-Yung
Format: Others
Language:en_US
Published: 2016
Online Access:http://ndltd.ncl.edu.tw/handle/19118106298842310726
id ndltd-TW-104NTHU5650101
record_format oai_dc
collection NDLTD
language en_US
format Others
sources NDLTD
description PhD === National Tsing Hua University === Institute of Communications Engineering === 104 === Non-negative blind source separation (nBSS), the focus of this dissertation, has found many successful applications in science and engineering, such as biomedical imaging, gene expression data analysis, and hyperspectral imaging in remote sensing. In contrast to conventional nBSS methods, including non-negative independent component analysis (nICA) and non-negative matrix factorization (NMF), we consider the nBSS problem from the perspective of simplex geometry, without requiring statistical independence of the sources or the existence of pure pixels (pixels fully contributed by a single source). The columns of the mixing matrix, which describe how the non-negative sources are mixed, can be estimated as the vertices (also referred to as endmembers) of the minimum-volume simplex enclosing all pixel vectors, i.e., the well-known Craig nBSS criterion. Empirical experience has suggested that Craig's criterion is capable of unmixing heavily mixed sources, but it was not clear from a theoretical viewpoint why this is true. Before adopting this powerful criterion to devise a highly efficient and effective nBSS algorithm, we develop an analysis framework wherein the source mixing level (or data purity level) is quantitatively defined, and prove that Craig's criterion can indeed yield perfect endmember identifiability (in the noiseless scenario) as long as this quantity is greater than a certain small threshold. Our theoretical results are substantiated by numerical simulations. Considering that existing Craig-simplex-identification (CSI) algorithms suffer from high computational complexity due to heavy simplex volume computations, our identifiability analysis motivated us to devise a very fast CSI algorithm for nBSS that involves no simplex volume computations. 
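The linear mixing model and Craig's enclosing-simplex picture described above can be illustrated with a small numerical sketch. This is not the dissertation's algorithm, only a hypothetical 2-D example: three endmembers (simplex vertices) mix Dirichlet-distributed abundances, and every resulting pixel has non-negative barycentric coordinates with respect to the endmember simplex, i.e., lies inside the simplex that Craig's criterion seeks as the minimum-volume enclosure.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 2-D illustration: N = 3 endmembers (simplex vertices) in R^2.
A = np.array([[0.0, 0.0], [1.0, 0.0], [0.3, 1.0]])   # rows = endmembers

# Abundances on the unit simplex (Dirichlet samples), linear mixing X = S @ A.
S = rng.dirichlet(alpha=[1.0, 1.0, 1.0], size=200)   # 200 pixels, rows sum to 1
X = S @ A

# Barycentric coordinates of each pixel w.r.t. the endmember simplex:
# solve [A^T; 1^T] c = [x; 1]. All coordinates non-negative <=> pixel inside.
M = np.vstack([A.T, np.ones(3)])                      # 3 x 3 system
C = np.linalg.solve(M, np.vstack([X.T, np.ones(len(X))]))
assert np.all(C >= -1e-9)                             # every pixel lies in the simplex
```

Since the mixing is exactly linear and noiseless here, the recovered barycentric coordinates coincide with the generating abundances, which is the identifiability intuition behind the criterion.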
Specifically, by exploiting the convex-geometry fact that a simplest simplex of N vertices can be defined by N associated hyperplanes, we reconstruct Craig's simplex from N hyperplane estimates, where each hyperplane is estimated from N-1 affinely independent data pixels. Without resorting to numerical optimization, the proposed algorithm searches for these N(N-1) data pixels via simple linear-algebraic computations, which accounts for its computational efficiency. Besides an endmember identifiability analysis supporting its performance, experiments on synthetic and real hyperspectral remote sensing (HRS) imaging data are provided to demonstrate its superior efficacy over state-of-the-art CSI algorithms in both computational efficiency and estimation accuracy. Finally, model-order selection (MOS), i.e., determining the number of sources N, is performed using an information-theoretic minimum description length (MDL) criterion that avoids data-dependent parameter tuning (e.g., an eigenvalue threshold). Instead of describing nBSS data via Gaussian competing models (which may be too simplistic to adequately describe nBSS data), as in existing MDL-based frameworks, we consider more comprehensive modeling based on the fact that (standardized) nBSS data can often be configured as a simplex. Specifically, we employ a (linearly transformed) Dirichlet distribution to capture the simplex structure embedded in the noiseless counterpart of the data, which, together with Gaussian noise modeling, gives rise to Gaussian-Dirichlet convolution competing models. Then, maximum-likelihood (ML) estimates of the Gaussian-Dirichlet density are derived by building a link between the stochastic ML estimator and simplex geometry. Consequently, the corresponding description lengths are efficiently calculated by Monte Carlo integration. 
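The hyperplane-based reconstruction step above can be sketched numerically. In the (N-1)-dimensional reduced space, each simplex vertex is the intersection of the N-1 facet hyperplanes meeting at it. The sketch below is a hypothetical N = 3 instance (a triangle in R^2 with hand-picked hyperplanes b_i^T x = h_i), not the dissertation's estimation procedure: it only shows the final vertex-recovery step, solving one small linear system per vertex and no simplex volumes.

```python
import numpy as np

# Hypothetical sketch for N = 3 sources in the (N-1)-D reduced space R^2:
# Craig's simplex (a triangle) is cut out by N = 3 hyperplanes (lines),
# each written as b_i^T x = h_i. Vertex v_j is the intersection of the
# N-1 = 2 hyperplanes whose index differs from j.
B = np.array([[0.0, -1.0],    # line 1: y = 0
              [-1.0, 0.0],    # line 2: x = 0
              [1.0, 1.0]])    # line 3: x + y = 1
h = np.array([0.0, 0.0, 1.0])

N = 3
vertices = np.empty((N, 2))
for j in range(N):
    idx = [i for i in range(N) if i != j]          # the two lines meeting at v_j
    vertices[j] = np.linalg.solve(B[idx], h[idx])  # one small linear system
# vertices are (0, 1), (1, 0) and (0, 0): the triangle's corners
```

In the actual algorithm each of the N hyperplanes would first be fitted from N-1 affinely independent data pixels; once the (b_i, h_i) pairs are in hand, vertex recovery reduces to the linear solves shown here.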
We validate the nBSS-MDL criterion through extensive simulations and experiments on real-world biomedical and HRS imaging datasets; it consistently detects the true number of sources in all four of our case studies, demonstrating its performance and applicability.
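The Monte Carlo step behind the description-length computation can be sketched as follows. This is an illustrative guess at the setup, not the dissertation's estimator: the marginal density of a datum under a Gaussian-Dirichlet convolution model, p(x) = E_{s~Dir(alpha)}[N(x; Ts, sigma^2 I)], has no closed form, so it is approximated by averaging the Gaussian likelihood over Dirichlet draws; the transform T, alpha, and sigma below are all hypothetical placeholders.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical Gaussian-Dirichlet convolution model: a noiseless datum is a
# linearly transformed Dirichlet point T s (on a simplex), observed in
# Gaussian noise. Its marginal density is a Dirichlet expectation, which we
# estimate by Monte Carlo integration.
T = np.array([[0.0, 1.0, 0.3],
              [0.0, 0.0, 1.0]])          # maps the unit simplex into R^2
alpha = np.array([1.0, 1.0, 1.0])        # Dirichlet concentration parameters
sigma = 0.05                             # Gaussian noise standard deviation

def gauss_dirichlet_pdf(x, n_draws=10_000):
    s = rng.dirichlet(alpha, size=n_draws)   # s_m ~ Dir(alpha)
    mu = s @ T.T                             # noiseless points T s_m
    d2 = np.sum((x - mu) ** 2, axis=1)
    gauss = np.exp(-d2 / (2 * sigma**2)) / (2 * np.pi * sigma**2)
    return gauss.mean()                      # (1/M) * sum_m N(x; T s_m, sigma^2 I)

# A description length is driven by the negative log-likelihood of the data
# under each candidate model order; one datum's contribution:
x = np.array([0.4, 0.3])
nll = -np.log(gauss_dirichlet_pdf(x))
```

Summing such negative log-likelihoods over all pixels, plus a model-complexity penalty, would give the description length compared across candidate source numbers N.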
author2 Chi, Chong-Yung
author_facet Chi, Chong-Yung
Lin, Chia-Hsiang
林家祥
author Lin, Chia-Hsiang
林家祥
spellingShingle Lin, Chia-Hsiang
林家祥
Simplex Geometry Based Non-negative Blind Source Separation
author_sort Lin, Chia-Hsiang
title Simplex Geometry Based Non-negative Blind Source Separation
title_short Simplex Geometry Based Non-negative Blind Source Separation
title_full Simplex Geometry Based Non-negative Blind Source Separation
title_fullStr Simplex Geometry Based Non-negative Blind Source Separation
title_full_unstemmed Simplex Geometry Based Non-negative Blind Source Separation
title_sort simplex geometry based non-negative blind source separation
publishDate 2016
url http://ndltd.ncl.edu.tw/handle/19118106298842310726
work_keys_str_mv AT linchiahsiang simplexgeometrybasednonnegativeblindsourceseparation
AT línjiāxiáng simplexgeometrybasednonnegativeblindsourceseparation
AT linchiahsiang jīyúdānxíngjǐhézhīfēifùmángbìxùnhàoyuánfēnlí
AT línjiāxiáng jīyúdānxíngjǐhézhīfēifùmángbìxùnhàoyuánfēnlí
_version_ 1718496840362491904
spelling ndltd-TW-104NTHU56501012017-07-16T04:29:26Z http://ndltd.ncl.edu.tw/handle/19118106298842310726 Simplex Geometry Based Non-negative Blind Source Separation 基於單形幾何之非負盲蔽訊號源分離 Lin, Chia-Hsiang 林家祥 PhD National Tsing Hua University Institute of Communications Engineering 104 Chi, Chong-Yung 祁忠勇 2016 thesis 154 en_US