Genotyping and Copy Number Analysis of Immunoglobin Heavy Chain Variable Genes Using Long Reads

Summary: One of the remaining challenges to describing an individual's genetic variation lies in the highly heterogeneous and complex genomic regions that impede the use of classical reference-guided mapping and assembly approaches. One such region is the Immunoglobulin heavy chain locus (IGH)...

Bibliographic Details
Main Authors:	Michael Ford, Ehsan Haghshenas, Corey T. Watson, S. Cenk Sahinalp
Format:	Article
Language:	English
Published:	Elsevier 2020-03-01
Series:	iScience
Online Access:	http://www.sciencedirect.com/science/article/pii/S2589004220300675

id	doaj-8015ce18506541f8bbcb679a4cc47fda
record_format	Article
spelling	doaj-8015ce18506541f8bbcb679a4cc47fda | 2020-11-25T02:40:45Z | eng | Elsevier | iScience | 2589-0042 | 2020-03-01 | 23 | 3 | Genotyping and Copy Number Analysis of Immunoglobin Heavy Chain Variable Genes Using Long Reads | Michael Ford (School of Computing Science, Simon Fraser University, Burnaby V5A 1S6, Canada) | Ehsan Haghshenas (School of Computing Science, Simon Fraser University, Burnaby V5A 1S6, Canada) | Corey T. Watson (Department of Biochemistry and Molecular Genetics, University of Louisville, Louisville 40292, USA) | S. Cenk Sahinalp (Cancer Data Science Laboratory, National Cancer Institute, Bethesda, MD 20892, USA; Corresponding author) | Summary: One of the remaining challenges to describing an individual's genetic variation lies in the highly heterogeneous and complex genomic regions that impede the use of classical reference-guided mapping and assembly approaches. One such region is the Immunoglobulin heavy chain locus (IGH), which is critical for the development of antibodies and the adaptive immune system. We describe ImmunoTyper, the first PacBio-based genotyping and copy number calling tool specifically designed for IGH V genes (IGHV). We demonstrate that ImmunoTyper's multi-stage clustering and combinatorial optimization approach represents the most comprehensive IGHV genotyping approach published to date, through validation using gold-standard IGH reference sequence. This preliminary work establishes the feasibility of fine-grained genotype and copy number analysis using error-prone long reads in complex multi-gene loci and opens the door for in-depth investigation into IGHV heterogeneity using accessible and increasingly common whole-genome sequence. | Subject Areas: Biological Sciences, Bioinformatics, Computational Bioinformatics, Genomic Analysis | http://www.sciencedirect.com/science/article/pii/S2589004220300675
collection	DOAJ
language	English
format	Article
sources	DOAJ
author	Michael Ford Ehsan Haghshenas Corey T. Watson S. Cenk Sahinalp
spellingShingle	Michael Ford Ehsan Haghshenas Corey T. Watson S. Cenk Sahinalp Genotyping and Copy Number Analysis of Immunoglobin Heavy Chain Variable Genes Using Long Reads iScience
author_facet	Michael Ford Ehsan Haghshenas Corey T. Watson S. Cenk Sahinalp
author_sort	Michael Ford
title	Genotyping and Copy Number Analysis of Immunoglobin Heavy Chain Variable Genes Using Long Reads
title_short	Genotyping and Copy Number Analysis of Immunoglobin Heavy Chain Variable Genes Using Long Reads
title_full	Genotyping and Copy Number Analysis of Immunoglobin Heavy Chain Variable Genes Using Long Reads
title_fullStr	Genotyping and Copy Number Analysis of Immunoglobin Heavy Chain Variable Genes Using Long Reads
title_full_unstemmed	Genotyping and Copy Number Analysis of Immunoglobin Heavy Chain Variable Genes Using Long Reads
title_sort	genotyping and copy number analysis of immunoglobin heavy chain variable genes using long reads
publisher	Elsevier
series	iScience
issn	2589-0042
publishDate	2020-03-01
description	Summary: One of the remaining challenges to describing an individual's genetic variation lies in the highly heterogeneous and complex genomic regions that impede the use of classical reference-guided mapping and assembly approaches. One such region is the Immunoglobulin heavy chain locus (IGH), which is critical for the development of antibodies and the adaptive immune system. We describe ImmunoTyper, the first PacBio-based genotyping and copy number calling tool specifically designed for IGH V genes (IGHV). We demonstrate that ImmunoTyper's multi-stage clustering and combinatorial optimization approach represents the most comprehensive IGHV genotyping approach published to date, through validation using gold-standard IGH reference sequence. This preliminary work establishes the feasibility of fine-grained genotype and copy number analysis using error-prone long reads in complex multi-gene loci and opens the door for in-depth investigation into IGHV heterogeneity using accessible and increasingly common whole-genome sequence. Subject Areas: Biological Sciences, Bioinformatics, Computational Bioinformatics, Genomic Analysis
url	http://www.sciencedirect.com/science/article/pii/S2589004220300675
work_keys_str_mv	AT michaelford genotypingandcopynumberanalysisofimmunoglobinheavychainvariablegenesusinglongreads AT ehsanhaghshenas genotypingandcopynumberanalysisofimmunoglobinheavychainvariablegenesusinglongreads AT coreytwatson genotypingandcopynumberanalysisofimmunoglobinheavychainvariablegenesusinglongreads AT scenksahinalp genotypingandcopynumberanalysisofimmunoglobinheavychainvariablegenesusinglongreads
_version_	1724779971561390080

Similar Items