HELLO: improved neural network architectures and methodologies for small variant calling

Background: Modern Next Generation- and Third Generation- Sequencing methods such as Illumina and PacBio Circular Consensus Sequencing platforms provide accurate sequencing data. Parallel developments in Deep Learning have enabled the application of Deep Neural Networks to variant calling, surpassin...

Bibliographic Details
Main Authors:	Chen, D. (Author), Klee, E.W. (Author), Lumetta, S.S. (Author), Ramachandran, A. (Author)
Format:	Article
Language:	English
Published:	BioMed Central Ltd 2021
Subjects:	article; Classical approach; deep learning; Deep learning; deep neural network; Deep neural networks; diagnostic test accuracy study; high throughput sequencing; High-accuracy; High-Throughput Nucleotide Sequencing; human; Humans; Hybrid variant calling; Illumina; Image recognition; indel mutation; INDEL Mutation; Inference functions; Method development; molecular recognition; Network architecture; Neural networks; Neural Networks, Computer; PacBio; Parallel development; pipeline; Pipelines; Sequencing method; Third generation; Variant calling
Online Access:	View Fulltext in Publisher (https://doi.org/10.1186/s12859-021-04311-4)


LEADER	03462nam a2200565Ia 4500
001	10.1186-s12859-021-04311-4
008	220427s2021 CNT 000 0 und d
020			\|a 14712105 (ISSN)
245	1	0	\|a HELLO: improved neural network architectures and methodologies for small variant calling
260		0	\|b BioMed Central Ltd \|c 2021
856			\|z View Fulltext in Publisher \|u https://doi.org/10.1186/s12859-021-04311-4
520	3		\|a Background: Modern Next Generation- and Third Generation- Sequencing methods such as Illumina and PacBio Circular Consensus Sequencing platforms provide accurate sequencing data. Parallel developments in Deep Learning have enabled the application of Deep Neural Networks to variant calling, surpassing the accuracy of classical approaches in many settings. DeepVariant, arguably the most popular among such methods, transforms the problem of variant calling into one of image recognition where a Deep Neural Network analyzes sequencing data that is formatted as images, achieving high accuracy. In this paper, we explore an alternative approach to designing Deep Neural Networks for variant calling, where we use meticulously designed Deep Neural Network architectures and customized variant inference functions that account for the underlying nature of sequencing data instead of converting the problem to one of image recognition. Results: Results from 27 whole-genome variant calling experiments spanning Illumina, PacBio and hybrid Illumina-PacBio settings suggest that our method allows vastly smaller Deep Neural Networks to outperform the Inception-v3 architecture used in DeepVariant for indel and substitution-type variant calls. For example, our method reduces the number of indel call errors by up to 18%, 55% and 65% for Illumina, PacBio and hybrid Illumina-PacBio variant calling respectively, compared to a similarly trained DeepVariant pipeline. In these cases, our models are between 7 and 14 times smaller. Conclusions: We believe that the improved accuracy and problem-specific customization of our models will enable more accurate pipelines and further method development in the field. HELLO is available at https://github.com/anands-repo/hello © 2021, The Author(s).
650	0	4	\|a article
650	0	4	\|a Classical approach
650	0	4	\|a deep learning
650	0	4	\|a Deep learning
650	0	4	\|a deep neural network
650	0	4	\|a Deep neural networks
650	0	4	\|a diagnostic test accuracy study
650	0	4	\|a high throughput sequencing
650	0	4	\|a High-accuracy
650	0	4	\|a High-Throughput Nucleotide Sequencing
650	0	4	\|a human
650	0	4	\|a Humans
650	0	4	\|a Hybrid variant calling
650	0	4	\|a Illumina
650	0	4	\|a Image recognition
650	0	4	\|a indel mutation
650	0	4	\|a INDEL Mutation
650	0	4	\|a Inference functions
650	0	4	\|a Method development
650	0	4	\|a molecular recognition
650	0	4	\|a Network architecture
650	0	4	\|a Neural networks
650	0	4	\|a Neural Networks, Computer
650	0	4	\|a PacBio
650	0	4	\|a Parallel development
650	0	4	\|a pipeline
650	0	4	\|a Pipelines
650	0	4	\|a Sequencing method
650	0	4	\|a Third generation
650	0	4	\|a Variant calling
700	1		\|a Chen, D. \|e author
700	1		\|a Klee, E.W. \|e author
700	1		\|a Lumetta, S.S. \|e author
700	1		\|a Ramachandran, A. \|e author
773			\|t BMC Bioinformatics
