Uniform Manifold Approximation and Projection (UMAP) reveals composite patterns and resolves visualization artifacts in microbiome data

Microbiome data are sparse and high dimensional, so effective visualization of these data requires dimensionality reduction. To date, the most commonly used method for dimensionality reduction in the microbiome is calculation of between-sample microbial differences (beta diversity), followed by prin...

Bibliographic Details
Main Authors: Armstrong, G. (Author), Gonzalez, A. (Author), Knight, R. (Author), Martino, C. (Author), Mishne, G. (Author), Rahman, G. (Author), Vázquez-Baeza, Y. (Author)
Format: Article
Language: English
Published: American Society for Microbiology 2021
Online Access: View Fulltext in Publisher
LEADER 02935nam a2200373Ia 4500
001 10.1128-mSystems.00691-21
008 220427s2021 CNT 000 0 und d
020 |a 23795077 (ISSN) 
245 1 0 |a Uniform Manifold Approximation and Projection (UMAP) reveals composite patterns and resolves visualization artifacts in microbiome data 
260 0 |b American Society for Microbiology  |c 2021 
856 |z View Fulltext in Publisher  |u https://doi.org/10.1128/mSystems.00691-21 
520 3 |a Microbiome data are sparse and high dimensional, so effective visualization of these data requires dimensionality reduction. To date, the most commonly used method for dimensionality reduction in the microbiome is calculation of between-sample microbial differences (beta diversity), followed by principal-coordinate analysis (PCoA). Uniform Manifold Approximation and Projection (UMAP) is an alternative method that can reduce the dimensionality of beta diversity distance matrices. Here, we demonstrate the benefits and limitations of using UMAP for dimensionality reduction on microbiome data. Using real data, we demonstrate that UMAP can improve the representation of clusters, especially when the clusters are composed of multiple subgroups. Additionally, we show that UMAP provides improved correlation of biological variation along a gradient with a reduced number of coordinates of the resulting embedding. Finally, we provide parameter recommendations that emphasize the preservation of global geometry. We therefore conclude that UMAP should be routinely used as a complementary visualization method for microbiome beta diversity studies. IMPORTANCE UMAP provides an additional method to visualize microbiome data. The method is extensible to any beta diversity metric used with PCoA, and our results demonstrate that UMAP can indeed improve visualization quality and correspondence with biological and technical variables of interest. The software to perform this analysis is available under an open-source license and can be obtained at https://github.com/knightlab-analyses/umap-microbiome-benchmarking; additionally, we have provided a QIIME 2 plugin for UMAP at https://github.com/biocore/q2-umap. Copyright © 2021 Armstrong et al. 
650 0 4 |a article 
650 0 4 |a benchmarking 
650 0 4 |a Beta diversity, dimensionality reduction 
650 0 4 |a biological variation 
650 0 4 |a dimensionality reduction 
650 0 4 |a embedding 
650 0 4 |a geometry 
650 0 4 |a image artifact 
650 0 4 |a licence 
650 0 4 |a microbiome 
650 0 4 |a nonhuman 
650 0 4 |a principal coordinate analysis 
650 0 4 |a software 
650 0 4 |a writing 
700 1 |a Armstrong, G.  |e author 
700 1 |a Gonzalez, A.  |e author 
700 1 |a Knight, R.  |e author 
700 1 |a Martino, C.  |e author 
700 1 |a Mishne, G.  |e author 
700 1 |a Rahman, G.  |e author 
700 1 |a Vázquez-Baeza, Y.  |e author 
773 |t mSystems