On the use of algebraic topology concepts to check the consistency of genome assembly
This paper presents a preliminary work consisting of two contributions. The first one is the design of a very efficient algorithm based on an “Overlap-Layout-Consensus” (OLC) graph to assemble the long reads provided by 3rd generation technologies. The second concerns the analysis of this graph usin...
Main Author: | Jean-François Gibrat |
---|---|
Format: | Article |
Language: | English |
Published: | The Biophysical Society of Japan, 2019-11-01 |
Series: | Biophysics and Physicobiology |
Subjects: | ngs technologies; olc assembly graphs; genomic repetitions; homology groups; betti numbers |
Online Access: | https://doi.org/10.2142/biophysico.16.0_444 |
id |
doaj-f4ddaa2d898e40be9523a06b658781f0 |
---|---|
record_format |
Article |
spelling |
doaj-f4ddaa2d898e40be9523a06b658781f0; 2020-11-25T02:58:41Z; eng; The Biophysical Society of Japan; Biophysics and Physicobiology; 2189-4779; 2019-11-01; 16; 10.2142/biophysico.16.0_444; On the use of algebraic topology concepts to check the consistency of genome assembly; Jean-François Gibrat (MaIAGE, INRA, Université Paris-Saclay, Jouy-en-Josas 78350, France); This paper presents a preliminary work consisting of two contributions. The first one is the design of a very efficient algorithm based on an “Overlap-Layout-Consensus” (OLC) graph to assemble the long reads provided by 3rd generation technologies. The second concerns the analysis of this graph using algebraic topology concepts to determine, in advance, whether the assembly of the genome will be straightforward, i.e., whether it will lead to a pseudo-Hamiltonian path or cycle, or whether the results will need to be scrutinized. In the latter case, it will be necessary to look for “loops” in the OLC assembly graph caused by unresolved repeated genomic regions, and then try to untie the “knots” created by these regions.; https://doi.org/10.2142/biophysico.16.0_444; ngs technologies; olc assembly graphs; genomic repetitions; homology groups; betti numbers |
collection |
DOAJ |
sources |
DOAJ |
author |
Jean-François Gibrat |
title |
On the use of algebraic topology concepts to check the consistency of genome assembly |
publisher |
The Biophysical Society of Japan |
series |
Biophysics and Physicobiology |
issn |
2189-4779 |
publishDate |
2019-11-01 |
description |
This paper presents a preliminary work consisting of two contributions. The first one is the design of a very efficient algorithm based on an “Overlap-Layout-Consensus” (OLC) graph to assemble the long reads provided by 3rd generation technologies. The second concerns the analysis of this graph using algebraic topology concepts to determine, in advance, whether the assembly of the genome will be straightforward, i.e., whether it will lead to a pseudo-Hamiltonian path or cycle, or whether the results will need to be scrutinized. In the latter case, it will be necessary to look for “loops” in the OLC assembly graph caused by unresolved repeated genomic regions, and then try to untie the “knots” created by these regions. |
topic |
ngs technologies; olc assembly graphs; genomic repetitions; homology groups; betti numbers |
url |
https://doi.org/10.2142/biophysico.16.0_444 |
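
The description above links “loops” in the OLC assembly graph to Betti numbers. The record does not give the paper's actual procedure, so the following is only a minimal sketch of the standard graph-theoretic fact it appears to rely on: for an undirected graph treated as a 1-dimensional complex, b0 is the number of connected components and b1 = |E| - |V| + b0 is the number of independent cycles. The function name `betti_numbers`, the integer read identifiers, and the choice to ignore edge direction, parallel edges, and self-loops are all assumptions made for illustration.

```python
def betti_numbers(num_vertices, edges):
    """Return (b0, b1) for an undirected simple graph.

    Hypothetical helper, not taken from the paper.
    b0 = number of connected components,
    b1 = |E| - |V| + b0 (number of independent cycles).
    """
    parent = list(range(num_vertices))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Assumption: treat overlaps as undirected, drop duplicates and self-loops.
    unique_edges = {tuple(sorted(e)) for e in edges if e[0] != e[1]}

    components = num_vertices
    for u, v in unique_edges:
        root_u, root_v = find(u), find(v)
        if root_u != root_v:
            parent[root_u] = root_v
            components -= 1

    b0 = components
    b1 = len(unique_edges) - num_vertices + b0
    return b0, b1


# Four reads overlapping in a chain: a plain path, no loop.
linear = betti_numbers(4, [(0, 1), (1, 2), (2, 3)])

# Same reads closing on themselves, as for a circular genome: one cycle.
circular = betti_numbers(4, [(0, 1), (1, 2), (2, 3), (3, 0)])

# An extra overlap, such as one caused by a repeat, adds a second loop.
tangled = betti_numbers(4, [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)])
```

Under these assumptions, a connected graph with b1 = 0 is consistent with a linear layout and b1 = 1 with a single circular one, while a larger b1 would flag additional loops of the kind the description attributes to unresolved repeats.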