Memory efficient PCA methods for large group ICA

Principal component analysis (PCA) is widely used for data reduction in group independent component analysis (ICA) of fMRI data. Commonly, group-level PCA of temporally concatenated datasets is computed prior to ICA of the group principal components. This work focuses on reducing very high dimension...

Bibliographic Details
Main Authors:	Srinivas Rachakonda, Rogers F Silva, Jingyu Liu, Vince D Calhoun
Format:	Article
Language:	English
Published:	Frontiers Media S.A. 2016-02-01
Series:	Frontiers in Neuroscience
Subjects:	Memory; big data; group ICA; PCA; SVD; EVD
Online Access:	http://journal.frontiersin.org/Journal/10.3389/fnins.2016.00017/full

id	doaj-ce4ae936958a4657ac0597fe021b381a
record_format	Article
spelling	doaj-ce4ae936958a4657ac0597fe021b381a; 2020-11-24T22:58:09Z; eng; Frontiers Media S.A.; Frontiers in Neuroscience; 1662-453X; 2016-02-01; 10; 10.3389/fnins.2016.00017; 171785; Memory efficient PCA methods for large group ICA; Srinivas Rachakonda (The MIND Research Network & LBERI); Rogers F Silva (The MIND Research Network & LBERI; University of New Mexico); Jingyu Liu (The MIND Research Network & LBERI); Vince D Calhoun (The MIND Research Network & LBERI; University of New Mexico)
collection	DOAJ
language	English
format	Article
sources	DOAJ
author	Srinivas Rachakonda; Rogers F Silva; Jingyu Liu; Vince D Calhoun
author_sort	Srinivas Rachakonda
title	Memory efficient PCA methods for large group ICA
publisher	Frontiers Media S.A.
series	Frontiers in Neuroscience
issn	1662-453X
publishDate	2016-02-01
description	Principal component analysis (PCA) is widely used for data reduction in group independent component analysis (ICA) of fMRI data. Commonly, group-level PCA of temporally concatenated datasets is computed prior to ICA of the group principal components. This work focuses on reducing very high dimensional temporally concatenated datasets into their group PCA space. Existing randomized PCA methods can determine the PCA subspace with minimal memory requirements and, thus, are ideal for solving large PCA problems. Since the number of dataloads is not typically optimized, we extend one of these methods to compute PCA of very large datasets with a minimal number of dataloads. We call this method multi power iteration (MPOWIT). The key idea behind MPOWIT is to estimate a subspace larger than the desired one, while checking for convergence of only the smaller subset of interest. The number of iterations is reduced considerably (as well as the number of dataloads), accelerating convergence without loss of accuracy. More importantly, in the proposed implementation of MPOWIT, the memory required for successful recovery of the group principal components becomes independent of the number of subjects analyzed. Highly efficient subsampled eigenvalue decomposition techniques are also introduced, furnishing excellent PCA subspace approximations that can be used for intelligent initialization of randomized methods such as MPOWIT. Together, these developments enable efficient estimation of accurate principal components, as we illustrate by solving a 1600-subject group-level PCA of fMRI with standard acquisition parameters, on a regular desktop computer with only 4 GB RAM, in just a few hours. MPOWIT is also highly scalable and could realistically solve group-level PCA of fMRI on thousands of subjects, or more, using standard hardware, limited only by time, not memory. Also, the MPOWIT algorithm is highly parallelizable, which would enable fast, distributed implementations ideal for big data analysis. Implications for other methods such as expectation maximization PCA (EM PCA) are also presented. Based on our results, general recommendations for efficient application of PCA methods are given according to problem size and available computational resources. MPOWIT and all other methods discussed here are implemented and readily available in the open-source GIFT software.
topic	Memory; big data; group ICA; PCA; SVD; EVD
url	http://journal.frontiersin.org/Journal/10.3389/fnins.2016.00017/full
_version_	1725648260562944000
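
The abstract's key idea (iterate on a subspace larger than the one wanted, but test convergence only on the leading components) can be illustrated with a generic block power iteration. The following is a minimal sketch under stated assumptions, not the GIFT implementation of MPOWIT: the function name `block_power_pca`, the parameters `oversample` and `tol`, the layout of `chunks` (one features-by-columns array per subject), and the default multiplier of 5 are all hypothetical choices made for illustration.

```python
import numpy as np


def block_power_pca(chunks, k, oversample=5, tol=1e-6, max_iter=200, seed=0):
    """Top-k eigenpairs of sum_i Z_i Z_i^T via block power iteration.

    chunks: list of (n_features, n_cols_i) arrays (hypothetical layout).
    Iterates on a block of oversample * k columns, but checks convergence
    only on the k leading eigenvalues.
    """
    rng = np.random.default_rng(seed)
    n_features = chunks[0].shape[0]
    width = min(oversample * k, n_features)
    # Random orthonormal starting basis for the enlarged subspace.
    Q, _ = np.linalg.qr(rng.standard_normal((n_features, width)))
    prev = np.zeros(k)
    for n_iter in range(1, max_iter + 1):
        # One pass over the data: accumulate (Z Z^T) Q chunk by chunk.
        # In a real pipeline each chunk would be loaded from disk here,
        # so the working set does not grow with the number of chunks.
        W = np.zeros_like(Q)
        for Z in chunks:
            W += Z @ (Z.T @ Q)
        # Rayleigh-Ritz step on the small (width x width) projected matrix.
        S = Q.T @ W
        S = (S + S.T) / 2.0  # remove round-off asymmetry before eigh
        evals, evecs = np.linalg.eigh(S)
        order = np.argsort(evals)[::-1][:k]
        top = evals[order]
        components = Q @ evecs[:, order]
        # Convergence is tested on the k leading eigenvalues only.
        if np.max(np.abs(top - prev)) <= tol * top[0]:
            break
        prev = top
        # Re-orthonormalize the full enlarged block for the next pass.
        Q, _ = np.linalg.qr(W)
    return components, top, n_iter
```

Each outer iteration touches every chunk exactly once, so the iteration count equals the number of full passes over the data. Carrying extra columns improves the convergence rate of the leading `k` components, because that rate depends on the ratio between the first eigenvalue outside the enlarged block and the `k`-th eigenvalue, rather than on the gap between the `k`-th and the next one.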

Similar Items