A Comparison for Dimensionality Reduction Methods of Single-Cell RNA-seq Data

Single-cell RNA sequencing (scRNA-seq) is a high-throughput sequencing technology performed at the level of individual cells, with the potential to reveal cellular heterogeneity. However, scRNA-seq data are high-dimensional, noisy, and sparse. Dimension reduction is an important s...

Bibliographic Details
Main Authors: Ruizhi Xiang, Wencan Wang, Lei Yang, Shiyuan Wang, Chaohan Xu, Xiaowen Chen
Format: Article
Language: English
Published: Frontiers Media S.A. 2021-03-01
Series: Frontiers in Genetics
Subjects: single-cell RNA-seq; dimension reduction; benchmark; sequences analysis; deep learning
Online Access: https://www.frontiersin.org/articles/10.3389/fgene.2021.646936/full
id doaj-4c0fab8de2eb4274a91e8aca0a3dcfd2
record_format Article
spelling doaj-4c0fab8de2eb4274a91e8aca0a3dcfd2
2021-03-23T06:10:11Z
eng
Frontiers Media S.A.
Frontiers in Genetics
1664-8021
2021-03-01
12
10.3389/fgene.2021.646936
646936
A Comparison for Dimensionality Reduction Methods of Single-Cell RNA-seq Data
Ruizhi Xiang (College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China)
Wencan Wang (School of Optometry and Ophthalmology and Eye Hospital, Wenzhou Medical University, Wenzhou, China)
Lei Yang (College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China)
Shiyuan Wang (College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China)
Chaohan Xu (College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China)
Xiaowen Chen (College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China)
collection DOAJ
language English
format Article
sources DOAJ
author Ruizhi Xiang
Wencan Wang
Lei Yang
Shiyuan Wang
Chaohan Xu
Xiaowen Chen
title A Comparison for Dimensionality Reduction Methods of Single-Cell RNA-seq Data
publisher Frontiers Media S.A.
series Frontiers in Genetics
issn 1664-8021
publishDate 2021-03-01
description Single-cell RNA sequencing (scRNA-seq) is a high-throughput sequencing technology performed at the level of individual cells, with the potential to reveal cellular heterogeneity. However, scRNA-seq data are high-dimensional, noisy, and sparse. Dimension reduction is therefore an important step in the downstream analysis of scRNA-seq data, and several dimension reduction methods have been developed. We developed a strategy to evaluate the stability, accuracy, and computing cost of 10 dimensionality reduction methods using 30 simulation datasets and five real datasets. Additionally, we investigated the sensitivity of all the methods to hyperparameter tuning and gave users appropriate suggestions. We found that t-distributed stochastic neighbor embedding (t-SNE) yielded the best overall performance, with the highest accuracy and the highest computing cost. Meanwhile, uniform manifold approximation and projection (UMAP) exhibited the highest stability, as well as moderate accuracy and the second highest computing cost. UMAP preserves the original cohesion and separation of cell populations well. In addition, it is worth noting that users need to set the hyperparameters according to the specific situation before using dimensionality reduction methods based on non-linear models and neural networks.
topic single-cell RNA-seq
dimension reduction
benchmark
sequences analysis
deep learning
url https://www.frontiersin.org/articles/10.3389/fgene.2021.646936/full
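
The description above refers to the cohesion and separation of cell populations after dimension reduction. One common way to put a number on those two properties is the silhouette coefficient; whether the article uses exactly this metric is an assumption here, not something this record states. The sketch below is a minimal pure-Python version. The function names (`euclidean`, `silhouette_score`) and the toy 2-D coordinates are hypothetical and made up for illustration; they are not data or code from the study.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def silhouette_score(points, labels):
    """Mean silhouette coefficient of an embedding.

    For each point, a is the mean distance to the other members of its
    own cluster (cohesion) and b is the smallest mean distance to the
    members of any other cluster (separation). The point's score is
    (b - a) / max(a, b), which lies in [-1, 1].
    """
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("need at least two clusters")

    scores = []
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        if not own:
            # Usual convention: a singleton cluster scores 0.
            scores.append(0.0)
            continue
        a = sum(euclidean(points[i], points[j]) for j in own) / len(own)
        b = min(
            sum(euclidean(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        denom = max(a, b)
        scores.append(0.0 if denom == 0 else (b - a) / denom)
    return sum(scores) / len(scores)


# Toy 2-D "embedding" (illustrative values only): two tight groups.
embedding = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]

# Labels that match the two groups versus labels that cut across them.
matching_labels = [0, 0, 0, 1, 1, 1]
mixed_labels = [0, 1, 0, 1, 0, 1]

score_matching = silhouette_score(embedding, matching_labels)
score_mixed = silhouette_score(embedding, mixed_labels)
print("matching labels:", round(score_matching, 3))
print("mixed labels:   ", round(score_mixed, 3))
```

In a comparison of this kind, the same score would be computed on the low-dimensional output of each method using the known cell-population labels; a higher value indicates that populations stay compact and well separated in the embedding.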