On Eigenfunction Based Spatial Analysis for Outlier Detection in High-Dimensional Datasets

Bibliographic Details
Main Author: Atulya NAGAR
Format: Article
Language: English
Published: International Institute of Informatics and Cybernetics 2005-04-01
Series: Journal of Systemics, Cybernetics and Informatics
Online Access: http://www.iiisci.org/Journal/CV$/sci/pdfs/P602262.pdf
Description
Summary: This paper presents two methods for detecting outliers in high-dimensional data sets: one based on eigenvalue analysis, and the other a modified version of singular value decomposition (SVD) called pseudo-SVD. The eigenvalue analysis approach examines the spatial relationship among the column vectors of the object-attribute matrix to gain insight into the degree of inconsistency in a cluster of data. The pseudo-SVD method, in which the singular values are allowed to carry a sign, examines the directions of the vectors in the object-attribute matrix and detects outliers based on their degree of orthogonality. The pseudo-SVD algorithm is formulated as an optimisation problem that clusters the data according to their angular inclination. The methods are applied to two case studies: one on a dermatological dataset and the other on an engineering problem of state estimation. Further research directions are also discussed.
ISSN:1690-4524