Comparison of three assembly strategies for a heterozygous seedless grapevine genome assembly

Abstract. Background: De novo heterozygous assembly is an ongoing challenge requiring improved assembly approaches. In this study, three strategies were used to develop de novo Vitis vinifera ‘Sultanina’ genome assemblies for comparison with the inbred V. vinifera (PN40024 12X.v2) reference genome and...


Bibliographic Details
Main Authors: Sagar Patel, Zhixiu Lu, Xiaozhu Jin, Padmapriya Swaminathan, Erliang Zeng, Anne Y. Fennell
Format: Article
Language: English
Published: BMC 2018-01-01
Series: BMC Genomics
Subjects: De novo genome assembly, Heterozygous, Vitis vinifera, Seedless grape, Sultanina, PLATANUS
Online Access: http://link.springer.com/article/10.1186/s12864-018-4434-2
id doaj-c639a0d26de14d04a3ee230fa60e0c5e
record_format Article
spelling doaj-c639a0d26de14d04a3ee230fa60e0c5e
2020-11-24T21:43:38Z
eng
BMC
BMC Genomics
1471-2164
2018-01-01
Vol 19, Iss 1, Pp 1-12
10.1186/s12864-018-4434-2
Comparison of three assembly strategies for a heterozygous seedless grapevine genome assembly
Sagar Patel (0), Zhixiu Lu (1), Xiaozhu Jin (2), Padmapriya Swaminathan (3), Erliang Zeng (4), Anne Y. Fennell (5)
0, 2, 3, 5: Agronomy, Horticulture and Plant Science Department and BioSNTR, 247 McFadden BioStress Laboratory, South Dakota State University
1, 4: Department of Computer Science, University of South Dakota
collection DOAJ
language English
format Article
sources DOAJ
author Sagar Patel
Zhixiu Lu
Xiaozhu Jin
Padmapriya Swaminathan
Erliang Zeng
Anne Y. Fennell
title Comparison of three assembly strategies for a heterozygous seedless grapevine genome assembly
publisher BMC
series BMC Genomics
issn 1471-2164
publishDate 2018-01-01
description Abstract. Background: De novo heterozygous assembly is an ongoing challenge requiring improved assembly approaches. In this study, three strategies were used to develop de novo Vitis vinifera ‘Sultanina’ genome assemblies for comparison with the inbred V. vinifera (PN40024 12X.v2) reference genome and a published Sultanina ALLPATHS-LG assembly (AP). The strategies were: 1) a default PLATANUS assembly (PLAT_d) for direct comparison with the AP assembly, 2) an iterative merging strategy using METASSEMBLER to combine the PLAT_d and AP assemblies (MERGE), and 3) PLATANUS parameter modifications plus GapCloser (PLAT*_GC). Results: The three new assemblies were larger than the AP assembly. PLAT*_GC had the greatest number of scaffolds aligning to the V. vinifera (PN40024 12X.v2) reference genome with at least 95% identity and an alignment length of ≥1000 bp. SNP analysis also identified additional high-quality SNPs. A greater number of sequence reads mapped back with zero mismatches to PLAT_d, MERGE, and PLAT*_GC (>94%) than to the AP assembly (87%), indicating greater fidelity to the original sequence data in the new assemblies than in the AP assembly. A de novo gene prediction conducted using seedless RNA-seq data predicted >30,000 coding sequences for the three new de novo assemblies, with the greatest number (30,544) in PLAT*_GC and only 26,515 for the AP assembly. Transcription factor analysis indicated good family coverage, but some genes found in the VCOST.v3 annotation were not identified in any of the de novo assemblies, particularly some from the MYB and ERF families. Conclusions: PLAT_d and PLAT*_GC had a greater number of synteny blocks with the V. vinifera (PN40024 12X.v2) reference genome than AP or MERGE. PLAT*_GC provided the most contiguous assembly, with only 1.2% scaffold N, in contrast to AP (10.7% N), PLAT_d (6.6% N) and MERGE (6.4% N). A PLAT*_GC pseudo-chromosome assembly with chromosome alignment to the V. vinifera (PN40024 12X.v2) reference genome provides new information for use in seedless grape genetic mapping studies. An annotated de novo gene prediction for the PLAT*_GC assembly, aligned with VitisNet pathways, provides a new seedless grapevine-specific transcriptomic resource that has excellent fidelity with the seedless short-read sequence data.
topic De novo genome assembly
Heterozygous
Vitis vinifera
Seedless grape
Sultanina
PLATANUS
url http://link.springer.com/article/10.1186/s12864-018-4434-2
_version_ 1725912974783152128
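
The abstract compares assemblies by their percentage of scaffold N (for example 1.2% for PLAT*_GC against 10.7% for AP), that is, the share of assembled bases that remain unresolved. The following is a minimal sketch of how such a figure could be computed from a FASTA file. It is not the authors' pipeline: the function name percent_n and the toy sequences are hypothetical and serve only to illustrate the metric.

```python
import io


def percent_n(fasta_handle):
    """Return the percentage of bases that are N (or n) across all FASTA records.

    Hypothetical helper for illustration; not taken from the article.
    """
    total = 0
    n_count = 0
    for line in fasta_handle:
        line = line.strip()
        if not line or line.startswith(">"):
            continue  # skip blank lines and record headers
        total += len(line)
        n_count += line.upper().count("N")
    if total == 0:
        raise ValueError("no sequence data found")
    return 100.0 * n_count / total


# Toy input: two 10-base scaffolds holding 4 and 2 unresolved bases.
toy = io.StringIO(">scaffold_1\nACGTNNNNAC\n>scaffold_2\nGGCCTTAANN\n")
print(f"{percent_n(toy):.1f}% N")  # 6 of 20 bases are N -> 30.0% N
```

For a real assembly the same function could be given an open file handle, for example open("assembly.fasta"), where the file name is a placeholder.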