MultiK: an automated tool to determine optimal cluster numbers in single-cell RNA sequencing data

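Abstract Single-cell RNA sequencing (scRNA-seq) provides new opportunities to characterize cell populations, typically accomplished through some type of clustering analysis. Estimation of the optimal cluster number (K) is a crucial step but often ignored. Our approach improves most current scRNA-seq cluster methods by providing an objective estimation of the number of groups using a multi-resolution perspective. MultiK is a tool for objective selection of insightful Ks and achieves high robustness through a consensus clustering approach. We demonstrate that MultiK identifies reproducible groups in scRNA-seq data, thus providing an objective means of estimating the number of possible groups or cell-type populations present.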

Bibliographic Details
Main Authors: Siyao Liu, Aatish Thennavan, Joseph P. Garay, J. S. Marron, Charles M. Perou
Format: Article
Language: English
Published: BMC 2021-08-01
Series: Genome Biology
Subjects: Single-cell RNA-seq; Clustering; Multi-scale; Multi-resolution; Genomics; Reproducibility
Online Access: https://doi.org/10.1186/s13059-021-02445-5
id doaj-d13afc93c4184844a132493427d47b62
record_format Article
spelling doaj-d13afc93c4184844a132493427d47b62
2021-08-22T11:46:54Z
eng
BMC
Genome Biology
1474-760X
2021-08-01
22
1
1
21
10.1186/s13059-021-02445-5
MultiK: an automated tool to determine optimal cluster numbers in single-cell RNA sequencing data
Siyao Liu (0)
Aatish Thennavan (1)
Joseph P. Garay (2)
J. S. Marron (3)
Charles M. Perou (4)
Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill
Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill
Department of Surgery, Oregon Health & Science University
Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill
Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill
Abstract Single-cell RNA sequencing (scRNA-seq) provides new opportunities to characterize cell populations, typically accomplished through some type of clustering analysis. Estimation of the optimal cluster number (K) is a crucial step but often ignored. Our approach improves most current scRNA-seq cluster methods by providing an objective estimation of the number of groups using a multi-resolution perspective. MultiK is a tool for objective selection of insightful Ks and achieves high robustness through a consensus clustering approach. We demonstrate that MultiK identifies reproducible groups in scRNA-seq data, thus providing an objective means to estimating the number of possible groups or cell-type populations present.
https://doi.org/10.1186/s13059-021-02445-5
Single-cell RNA-seq
Clustering
Multi-scale
Multi-resolution
Genomics
Reproducibility
collection DOAJ
language English
format Article
sources DOAJ
author Siyao Liu
Aatish Thennavan
Joseph P. Garay
J. S. Marron
Charles M. Perou
spellingShingle Siyao Liu
Aatish Thennavan
Joseph P. Garay
J. S. Marron
Charles M. Perou
MultiK: an automated tool to determine optimal cluster numbers in single-cell RNA sequencing data
Genome Biology
Single-cell RNA-seq
Clustering
Multi-scale
Multi-resolution
Genomics
Reproducibility
author_facet Siyao Liu
Aatish Thennavan
Joseph P. Garay
J. S. Marron
Charles M. Perou
author_sort Siyao Liu
title MultiK: an automated tool to determine optimal cluster numbers in single-cell RNA sequencing data
title_short MultiK: an automated tool to determine optimal cluster numbers in single-cell RNA sequencing data
title_full MultiK: an automated tool to determine optimal cluster numbers in single-cell RNA sequencing data
title_fullStr MultiK: an automated tool to determine optimal cluster numbers in single-cell RNA sequencing data
title_full_unstemmed MultiK: an automated tool to determine optimal cluster numbers in single-cell RNA sequencing data
title_sort multik: an automated tool to determine optimal cluster numbers in single-cell rna sequencing data
publisher BMC
series Genome Biology
issn 1474-760X
publishDate 2021-08-01
description Abstract Single-cell RNA sequencing (scRNA-seq) provides new opportunities to characterize cell populations, typically accomplished through some type of clustering analysis. Estimation of the optimal cluster number (K) is a crucial step but often ignored. Our approach improves most current scRNA-seq cluster methods by providing an objective estimation of the number of groups using a multi-resolution perspective. MultiK is a tool for objective selection of insightful Ks and achieves high robustness through a consensus clustering approach. We demonstrate that MultiK identifies reproducible groups in scRNA-seq data, thus providing an objective means of estimating the number of possible groups or cell-type populations present.
topic Single-cell RNA-seq
Clustering
Multi-scale
Multi-resolution
Genomics
Reproducibility
url https://doi.org/10.1186/s13059-021-02445-5
work_keys_str_mv AT siyaoliu multikanautomatedtooltodetermineoptimalclusternumbersinsinglecellrnasequencingdata
AT aatishthennavan multikanautomatedtooltodetermineoptimalclusternumbersinsinglecellrnasequencingdata
AT josephpgaray multikanautomatedtooltodetermineoptimalclusternumbersinsinglecellrnasequencingdata
AT jsmarron multikanautomatedtooltodetermineoptimalclusternumbersinsinglecellrnasequencingdata
AT charlesmperou multikanautomatedtooltodetermineoptimalclusternumbersinsinglecellrnasequencingdata
_version_ 1721199370003546112