Evaluating Structural Variation Detection Tools for Long-Read Sequencing Datasets in Saccharomyces cerevisiae

Structural variation (SV) is a major form of genetic variation that contributes to polymorphism, human diseases, and phenotypes in many organisms. Long-read sequencing has been successfully used to identify novel and complex SVs. However, a comparison of SV detection tools for long-r...

Bibliographic Details
Main Authors: Mei-Wei Luan, Xiao-Ming Zhang, Zi-Bin Zhu, Ying Chen, Shang-Qian Xie
Format: Article
Language: English
Published: Frontiers Media S.A. 2020-03-01
Series: Frontiers in Genetics
Subjects: structural variation; long-read sequencing; PacBio and ONT; SV caller; Saccharomyces cerevisiae
Online Access: https://www.frontiersin.org/article/10.3389/fgene.2020.00159/full
id doaj-fd59dc0c9ed44ae395510dd503c29158
record_format Article
spelling doaj-fd59dc0c9ed44ae395510dd503c29158
2020-11-25T01:55:19Z
eng
Frontiers Media S.A.
Frontiers in Genetics
1664-8021
2020-03-01
11
10.3389/fgene.2020.00159
510642
Evaluating Structural Variation Detection Tools for Long-Read Sequencing Datasets in Saccharomyces cerevisiae
Mei-Wei Luan 0
Xiao-Ming Zhang 1
Zi-Bin Zhu 2
Ying Chen 3
Shang-Qian Xie 4
Key Laboratory of Genetics and Germplasm Innovation of Tropical Special Forest Trees and Ornamental Plants (Ministry of Education), Hainan Key Laboratory for Biology of Tropical Ornamental Plant Germplasm, College of Forestry, Hainan University, Haikou, China
College of Grassland, Resources and Environment, Inner Mongolia Agricultural University, Huhhot, China
Key Laboratory of Genetics and Germplasm Innovation of Tropical Special Forest Trees and Ornamental Plants (Ministry of Education), Hainan Key Laboratory for Biology of Tropical Ornamental Plant Germplasm, College of Forestry, Hainan University, Haikou, China
State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangzhou, China
Key Laboratory of Genetics and Germplasm Innovation of Tropical Special Forest Trees and Ornamental Plants (Ministry of Education), Hainan Key Laboratory for Biology of Tropical Ornamental Plant Germplasm, College of Forestry, Hainan University, Haikou, China
Structural variation (SV) is a major form of genetic variation that contributes to polymorphism, human diseases, and phenotypes in many organisms. Long-read sequencing has been successfully used to identify novel and complex SVs. However, a comparison of SV detection tools for long-read sequencing datasets has not been reported. Therefore, we developed an analysis workflow that combined two alignment tools (NGMLR and minimap2) and five callers (Sniffles, Picky, smartie-sv, PBHoney, and NanoSV) to evaluate SV detection in six datasets of Saccharomyces cerevisiae. The accuracy of SV regions was validated by re-aligning raw reads across diverse alignment tools, SV callers, experimental conditions, and sequencing platforms. The results showed that the difference in SV detection between NGMLR and minimap2 was not significant when using the same caller. PBHoney had the highest average accuracy (89.04%) and Picky had the lowest (35.85%). The accuracy of NanoSV, Sniffles, and smartie-sv was 68.67%, 60.47%, and 57.67%, respectively. In addition, smartie-sv and NanoSV detected the most and the fewest SVs, respectively, and significantly more SVs were detected from the PacBio sequencing platform than from ONT (p = 0.000173).
https://www.frontiersin.org/article/10.3389/fgene.2020.00159/full
structural variation
long-read sequencing
PacBio and ONT
SV caller
Saccharomyces cerevisiae
collection DOAJ
language English
format Article
sources DOAJ
author Mei-Wei Luan
Xiao-Ming Zhang
Zi-Bin Zhu
Ying Chen
Shang-Qian Xie
spellingShingle Mei-Wei Luan
Xiao-Ming Zhang
Zi-Bin Zhu
Ying Chen
Shang-Qian Xie
Evaluating Structural Variation Detection Tools for Long-Read Sequencing Datasets in Saccharomyces cerevisiae
Frontiers in Genetics
structural variation
long-read sequencing
PacBio and ONT
SV caller
Saccharomyces cerevisiae
author_facet Mei-Wei Luan
Xiao-Ming Zhang
Zi-Bin Zhu
Ying Chen
Shang-Qian Xie
author_sort Mei-Wei Luan
title Evaluating Structural Variation Detection Tools for Long-Read Sequencing Datasets in Saccharomyces cerevisiae
title_short Evaluating Structural Variation Detection Tools for Long-Read Sequencing Datasets in Saccharomyces cerevisiae
title_full Evaluating Structural Variation Detection Tools for Long-Read Sequencing Datasets in Saccharomyces cerevisiae
title_fullStr Evaluating Structural Variation Detection Tools for Long-Read Sequencing Datasets in Saccharomyces cerevisiae
title_full_unstemmed Evaluating Structural Variation Detection Tools for Long-Read Sequencing Datasets in Saccharomyces cerevisiae
title_sort evaluating structural variation detection tools for long-read sequencing datasets in saccharomyces cerevisiae
publisher Frontiers Media S.A.
series Frontiers in Genetics
issn 1664-8021
publishDate 2020-03-01
description Structural variation (SV) is a major form of genetic variation that contributes to polymorphism, human diseases, and phenotypes in many organisms. Long-read sequencing has been successfully used to identify novel and complex SVs. However, a comparison of SV detection tools for long-read sequencing datasets has not been reported. Therefore, we developed an analysis workflow that combined two alignment tools (NGMLR and minimap2) and five callers (Sniffles, Picky, smartie-sv, PBHoney, and NanoSV) to evaluate SV detection in six datasets of Saccharomyces cerevisiae. The accuracy of SV regions was validated by re-aligning raw reads across diverse alignment tools, SV callers, experimental conditions, and sequencing platforms. The results showed that the difference in SV detection between NGMLR and minimap2 was not significant when using the same caller. PBHoney had the highest average accuracy (89.04%) and Picky had the lowest (35.85%). The accuracy of NanoSV, Sniffles, and smartie-sv was 68.67%, 60.47%, and 57.67%, respectively. In addition, smartie-sv and NanoSV detected the most and the fewest SVs, respectively, and significantly more SVs were detected from the PacBio sequencing platform than from ONT (p = 0.000173).
topic structural variation
long-read sequencing
PacBio and ONT
SV caller
Saccharomyces cerevisiae
url https://www.frontiersin.org/article/10.3389/fgene.2020.00159/full
work_keys_str_mv AT meiweiluan evaluatingstructuralvariationdetectiontoolsforlongreadsequencingdatasetsinsaccharomycescerevisiae
AT xiaomingzhang evaluatingstructuralvariationdetectiontoolsforlongreadsequencingdatasetsinsaccharomycescerevisiae
AT zibinzhu evaluatingstructuralvariationdetectiontoolsforlongreadsequencingdatasetsinsaccharomycescerevisiae
AT yingchen evaluatingstructuralvariationdetectiontoolsforlongreadsequencingdatasetsinsaccharomycescerevisiae
AT shangqianxie evaluatingstructuralvariationdetectiontoolsforlongreadsequencingdatasetsinsaccharomycescerevisiae
_version_ 1724983976255291392