Validation in Principal Components Analysis Applied to EEG Data

The well-known multivariate technique Principal Components Analysis (PCA) is usually applied to a sample, and so component scores are subjected to sampling variability. However, few studies address their stability, an important topic when the sample size is small. This work presents three validation...


Bibliographic Details
Main Authors: João Carlos G. D. Costa, Paulo José G. Da-Silva, Renan Moritz V. R. Almeida, Antonio Fernando C. Infantosi
Format: Article
Language: English
Published: Hindawi Limited 2014-01-01
Series: Computational and Mathematical Methods in Medicine
Online Access: http://dx.doi.org/10.1155/2014/413801
id doaj-630d0d34b7e54dada4714d24bb2c9163
record_format Article
spelling doaj-630d0d34b7e54dada4714d24bb2c9163; 2020-11-25T00:24:43Z; eng; Hindawi Limited; Computational and Mathematical Methods in Medicine; 1748-670X; 1748-6718; 2014-01-01; 2014; 10.1155/2014/413801; 413801
Validation in Principal Components Analysis Applied to EEG Data
João Carlos G. D. Costa (0); Paulo José G. Da-Silva (1); Renan Moritz V. R. Almeida (2); Antonio Fernando C. Infantosi (3)
Affiliation (listed once per author, identical for all four): Biomedical Engineering Program, COPPE, Federal University of Rio de Janeiro, P.O. Box 68510, 21941-972 Rio de Janeiro, RJ, Brazil
http://dx.doi.org/10.1155/2014/413801
collection DOAJ
language English
format Article
sources DOAJ
author João Carlos G. D. Costa
Paulo José G. Da-Silva
Renan Moritz V. R. Almeida
Antonio Fernando C. Infantosi
publisher Hindawi Limited
series Computational and Mathematical Methods in Medicine
issn 1748-670X
1748-6718
publishDate 2014-01-01
description The well-known multivariate technique Principal Components Analysis (PCA) is usually applied to a sample, and so component scores are subjected to sampling variability. However, few studies address their stability, an important topic when the sample size is small. This work presents three validation procedures applied to PCA, based on confidence regions generated by a variant of a nonparametric bootstrap called the partial bootstrap: (i) the assessment of PC scores variability by the spread and overlapping of “confidence regions” plotted around these scores; (ii) the use of the confidence regions centroids as a validation set; and (iii) the definition of the number of nontrivial axes to be retained for analysis. The methods were applied to EEG data collected during a postural control protocol with twenty-four volunteers. Two axes were retained for analysis, with 91.6% of explained variance. Results showed that the area of the confidence regions provided useful insights on the variability of scores and suggested that some subjects were not distinguishable from others, which was not evident from the principal planes. In addition, potential outliers, initially suggested by an analysis of the first principal plane, could not be confirmed by the confidence regions.
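The partial bootstrap named in the description can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes a hypothetical data layout in which each subject contributes several EEG epochs, so that epochs can be resampled within a subject; the record does not say what the study actually resampled. The function names, the array shape, and the use of a Gaussian ellipse to summarise each confidence region are likewise assumptions made for illustration. The defining feature of the partial bootstrap is kept: the principal axes are computed once from the original sample, and each bootstrap replicate is only projected onto those fixed axes.

```python
import numpy as np

def partial_bootstrap_scores(epochs, n_boot=500, n_axes=2, seed=0):
    """epochs: array of shape (n_subjects, n_epochs, n_vars); hypothetical layout."""
    rng = np.random.default_rng(seed)
    n_subj, n_ep, _ = epochs.shape

    # Reference PCA on the subject means (rows = subjects, columns = variables).
    X = epochs.mean(axis=1)
    mu, sd = X.mean(axis=0), X.std(axis=0, ddof=1)
    Z = (X - mu) / sd
    # SVD of the standardized matrix: rows of Vt are the principal axes.
    _, s, Vt = np.linalg.svd(Z, full_matrices=False)
    axes = Vt[:n_axes].T                         # (n_vars, n_axes)
    explained = (s**2 / np.sum(s**2))[:n_axes]   # variance share per retained axis
    scores = Z @ axes                            # (n_subjects, n_axes)

    # Partial bootstrap: the axes stay fixed; replicates are only re-projected.
    rows = np.arange(n_subj)[:, None]
    boot = np.empty((n_boot, n_subj, n_axes))
    for b in range(n_boot):
        # Resample epochs with replacement, independently within each subject.
        idx = rng.integers(0, n_ep, size=(n_subj, n_ep))
        rep = epochs[rows, idx].mean(axis=1)     # replicated subject means
        boot[b] = ((rep - mu) / sd) @ axes
    return scores, boot, explained

def ellipse_areas(boot, level=0.95):
    """Area of an assumed Gaussian confidence ellipse per subject, first two axes."""
    k = -2.0 * np.log(1.0 - level)   # chi-square quantile with 2 degrees of freedom
    areas = []
    for j in range(boot.shape[1]):
        cov = np.cov(boot[:, j, :2], rowvar=False)
        areas.append(np.pi * k * np.sqrt(np.linalg.det(cov)))
    return np.array(areas)
```

Under these assumptions, `boot.mean(axis=0)` gives the centroids of the confidence regions, which can be compared with `scores`, and `ellipse_areas(boot)` gives one spread measure per subject; large or overlapping regions would indicate subjects whose scores are not well separated.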
url http://dx.doi.org/10.1155/2014/413801