Comparison of three assembly strategies for a heterozygous seedless grapevine genome assembly
Abstract Background De novo heterozygous assembly is an ongoing challenge requiring improved assembly approaches. In this study, three strategies were used to develop de novo Vitis vinifera ‘Sultanina’ genome assemblies for comparison with the inbred V. vinifera (PN40024 12X.v2) reference genome and...
Main Authors: | Sagar Patel, Zhixiu Lu, Xiaozhu Jin, Padmapriya Swaminathan, Erliang Zeng, Anne Y. Fennell |
---|---|
Format: | Article |
Language: | English |
Published: | BMC 2018-01-01 |
Series: | BMC Genomics |
Subjects: | De novo genome assembly, Heterozygous, Vitis vinifera, Seedless grape, Sultanina, PLATANUS |
Online Access: | http://link.springer.com/article/10.1186/s12864-018-4434-2 |
id |
doaj-c639a0d26de14d04a3ee230fa60e0c5e |
---|---|
record_format |
Article |
spelling |
doaj-c639a0d26de14d04a3ee230fa60e0c5e; 2020-11-24T21:43:38Z; eng; BMC; BMC Genomics; 1471-2164; 2018-01-01; 19; 1; 1; 12; 10.1186/s12864-018-4434-2; Comparison of three assembly strategies for a heterozygous seedless grapevine genome assembly; Sagar Patel (Agronomy, Horticulture and Plant Science Department and BioSNTR, 247 McFadden BioStress Laboratory, South Dakota State University); Zhixiu Lu (Department of Computer Science, University of South Dakota); Xiaozhu Jin (Agronomy, Horticulture and Plant Science Department and BioSNTR, 247 McFadden BioStress Laboratory, South Dakota State University); Padmapriya Swaminathan (Agronomy, Horticulture and Plant Science Department and BioSNTR, 247 McFadden BioStress Laboratory, South Dakota State University); Erliang Zeng (Department of Computer Science, University of South Dakota); Anne Y. Fennell (Agronomy, Horticulture and Plant Science Department and BioSNTR, 247 McFadden BioStress Laboratory, South Dakota State University) |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Sagar Patel Zhixiu Lu Xiaozhu Jin Padmapriya Swaminathan Erliang Zeng Anne Y. Fennell |
author_sort |
Sagar Patel |
title |
Comparison of three assembly strategies for a heterozygous seedless grapevine genome assembly |
publisher |
BMC |
series |
BMC Genomics |
issn |
1471-2164 |
publishDate |
2018-01-01 |
description |
Abstract. Background: De novo assembly of heterozygous genomes is an ongoing challenge requiring improved assembly approaches. In this study, three strategies were used to develop de novo Vitis vinifera ‘Sultanina’ genome assemblies for comparison with the inbred V. vinifera (PN40024 12X.v2) reference genome and a published Sultanina ALLPATHS-LG assembly (AP). The strategies were: 1) a default PLATANUS assembly (PLAT_d) for direct comparison with the AP assembly, 2) an iterative merging strategy using METASSEMBLER to combine the PLAT_d and AP assemblies (MERGE), and 3) PLATANUS parameter modifications plus GapCloser (PLAT*_GC). Results: The three new assemblies were greater in size than the AP assembly. PLAT*_GC had the greatest number of scaffolds aligning with a minimum of 95% identity and ≥1000 bp alignment length to the V. vinifera (PN40024 12X.v2) reference genome. SNP analysis also identified additional high-quality SNPs. A greater number of sequence reads mapped back with zero mismatches to PLAT_d, MERGE, and PLAT*_GC (>94%) than to the AP assembly (87%), indicating greater fidelity to the original sequence data in the new assemblies than in the AP assembly. A de novo gene prediction conducted using seedless RNA-seq data predicted >30,000 coding sequences for the three new de novo assemblies, with the greatest number (30,544) in PLAT*_GC and only 26,515 for the AP assembly. Transcription factor analysis indicated good family coverage, but some genes found in the VCOST.v3 annotation were not identified in any of the de novo assemblies, particularly some from the MYB and ERF families. Conclusions: PLAT_d and PLAT*_GC had a greater number of synteny blocks with the V. vinifera (PN40024 12X.v2) reference genome than AP or MERGE. PLAT*_GC provided the most contiguous assembly, with only 1.2% scaffold N, in contrast to AP (10.7% N), PLAT_d (6.6% N), and MERGE (6.4% N). A PLAT*_GC pseudo-chromosome assembly with chromosome alignment to the V. vinifera (PN40024 12X.v2) reference genome provides new information for use in seedless grape genetic mapping studies. An annotated de novo gene prediction for the PLAT*_GC assembly, aligned with VitisNet pathways, provides a new seedless grapevine-specific transcriptomic resource that has excellent fidelity with the seedless short-read sequence data. |
topic |
De novo genome assembly Heterozygous Vitis vinifera Seedless grape Sultanina PLATANUS |
url |
http://link.springer.com/article/10.1186/s12864-018-4434-2 |