Impact of sample size on principal component analysis ordination of an environmental data set: effects on eigenstructure
In this study, we used bootstrap simulation of a real data set to investigate the impact of sample size (N = 20, 30, 40 and 50) on the eigenvalues and eigenvectors resulting from principal component analysis (PCA). For each sample size, 100 bootstrap samples were drawn from an environmental data matrix...
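The resampling scheme summarized above can be illustrated with a minimal sketch. It is not the authors' code: the function names `pca_eigen` and `bootstrap_eigenvalues` are hypothetical, the data are a synthetic stand-in for the water quality matrix, and the use of the correlation matrix (rather than the covariance matrix) and of row resampling with replacement are assumptions, since the record does not state either.

```python
import numpy as np

def pca_eigen(X):
    """Return eigenvalues (descending) and eigenvectors of the correlation matrix of X.

    Assumption: PCA on the correlation matrix, a common choice when
    variables are measured in different units.
    """
    R = np.corrcoef(X, rowvar=False)      # p x p correlation matrix
    vals, vecs = np.linalg.eigh(R)        # eigh returns ascending eigenvalues
    order = np.argsort(vals)[::-1]        # reorder to descending
    return vals[order], vecs[:, order]

def bootstrap_eigenvalues(X, n, n_boot=100, n_keep=10, seed=0):
    """Draw n_boot bootstrap samples of n rows each and collect the leading eigenvalues."""
    rng = np.random.default_rng(seed)
    out = np.empty((n_boot, n_keep))
    for b in range(n_boot):
        idx = rng.integers(0, X.shape[0], size=n)   # rows sampled with replacement
        vals, _ = pca_eigen(X[idx])
        out[b] = vals[:n_keep]
    return out

# Synthetic stand-in data (55 stations x 22 variables); NOT the study's data.
rng = np.random.default_rng(42)
X = rng.normal(size=(55, 22)) @ rng.normal(size=(22, 22))

for n in (20, 30, 40, 50):
    ev = bootstrap_eigenvalues(X, n)
    # Mean and spread of the first three eigenvalues across bootstrap samples
    print(n, ev.mean(axis=0)[:3].round(2), ev.std(axis=0)[:3].round(2))
```

Comparing the spread of the leading eigenvalues across the four sample sizes gives a rough sense of how stable the eigenstructure is; comparing eigenvectors would additionally need sign alignment, which this sketch leaves out.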
Main Authors: | Shaukat S. Shahid, Rao Toqeer Ahmed, Khan Moazzam A. |
---|---|
Format: | Article |
Language: | English |
Published: | Sciendo, 2016-06-01 |
Series: | Ekológia (Bratislava) |
Subjects: | eigenstructure, environmental data, ordination, PCA |
Online Access: | https://doi.org/10.1515/eko-2016-0014 |
id |
doaj-e1e8e13554fc47e7a9f812a8322392cb |
---|---|
record_format |
Article |
spelling |
doaj-e1e8e13554fc47e7a9f812a8322392cb; 2021-09-05T20:44:47Z; eng; Sciendo; Ekológia (Bratislava); 1337-947X; 2016-06-01; 35(2), 173-190; 10.1515/eko-2016-0014; eko-2016-0014; Impact of sample size on principal component analysis ordination of an environmental data set: effects on eigenstructure; Shaukat S. Shahid (Institute of Environmental Studies, University of Karachi, Karachi-75270, Pakistan); Rao Toqeer Ahmed (Department of Botany, Federal Urdu University of Arts, Sciences & Technology, Karachi-75300, Pakistan); Khan Moazzam A. (Institute of Environmental Studies, University of Karachi, Karachi-75270, Pakistan); In this study, we used bootstrap simulation of a real data set to investigate the impact of sample size (N = 20, 30, 40 and 50) on the eigenvalues and eigenvectors resulting from principal component analysis (PCA). For each sample size, 100 bootstrap samples were drawn from an environmental data matrix pertaining to water quality variables (p = 22) of a small data set comprising 55 samples (stations from which water samples were collected). Because data sets in ecology and environmental sciences are invariably small owing to the high cost of collecting and analysing samples, we restricted our study to relatively small sample sizes. We focused on comparing the first 6 eigenvectors and the first 10 eigenvalues. Data sets were compared by agglomerative cluster analysis with Ward’s method, which does not require any stringent distributional assumptions.; https://doi.org/10.1515/eko-2016-0014; eigenstructure; environmental data; ordination; pca |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Shaukat S. Shahid; Rao Toqeer Ahmed; Khan Moazzam A. |
author_sort |
Shaukat S. Shahid |
title |
Impact of sample size on principal component analysis ordination of an environmental data set: effects on eigenstructure |
publisher |
Sciendo |
series |
Ekológia (Bratislava) |
issn |
1337-947X |
publishDate |
2016-06-01 |
description |
In this study, we used bootstrap simulation of a real data set to investigate the impact of sample size (N = 20, 30, 40 and 50) on the eigenvalues and eigenvectors resulting from principal component analysis (PCA). For each sample size, 100 bootstrap samples were drawn from an environmental data matrix pertaining to water quality variables (p = 22) of a small data set comprising 55 samples (stations from which water samples were collected). Because data sets in ecology and environmental sciences are invariably small owing to the high cost of collecting and analysing samples, we restricted our study to relatively small sample sizes. We focused on comparing the first 6 eigenvectors and the first 10 eigenvalues. Data sets were compared by agglomerative cluster analysis with Ward’s method, which does not require any stringent distributional assumptions. |
topic |
eigenstructure; environmental data; ordination; pca |
url |
https://doi.org/10.1515/eko-2016-0014 |