ExpansionHunter Denovo: a computational method for locating known and novel repeat expansions in short-read sequencing data

Abstract Repeat expansions are responsible for over 40 monogenic disorders, and undoubtedly more pathogenic repeat expansions remain to be discovered. Existing methods for detecting repeat expansions in short-read sequencing data require predefined repeat catalogs. Recent discoveries emphasize the n...


Bibliographic Details
Main Authors: Egor Dolzhenko, Mark F. Bennett, Phillip A. Richmond, Brett Trost, Sai Chen, Joke J. F. A. van Vugt, Charlotte Nguyen, Giuseppe Narzisi, Vladimir G. Gainullin, Andrew M. Gross, Bryan R. Lajoie, Ryan J. Taft, Wyeth W. Wasserman, Stephen W. Scherer, Jan H. Veldink, David R. Bentley, Ryan K. C. Yuen, Melanie Bahlo, Michael A. Eberle
Format: Article
Language: English
Published: BMC 2020-04-01
Series: Genome Biology
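Subjects: Repeat expansions; Short tandem repeats; Whole-genome sequencing data; Genome-wide analysis; Friedreich ataxia; Myotonic dystrophy type 1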
Online Access: http://link.springer.com/article/10.1186/s13059-020-02017-z
id doaj-e6a0e5f3dd2f430489a30306e1ab55af
record_format Article
spelling doaj-e6a0e5f3dd2f430489a30306e1ab55af 2020-11-25T02:54:23Z eng BMC Genome Biology 1474-760X 2020-04-01 211114 10.1186/s13059-020-02017-z
Egor Dolzhenko, Illumina Inc.
Mark F. Bennett, Population Health and Immunity Division, The Walter and Eliza Hall Institute of Medical Research
Phillip A. Richmond, Centre for Molecular Medicine and Therapeutics, BC Children’s Hospital, University of British Columbia
Brett Trost, Genetics and Genome Biology, The Hospital for Sick Children
Sai Chen, Illumina Inc.
Joke J. F. A. van Vugt, Department of Neurology, UMC Utrecht Brain Center, Utrecht University
Charlotte Nguyen, Genetics and Genome Biology, The Hospital for Sick Children
Giuseppe Narzisi, New York Genome Center
Vladimir G. Gainullin, Illumina Inc.
Andrew M. Gross, Illumina Inc.
Bryan R. Lajoie, Illumina Inc.
Ryan J. Taft, Illumina Inc.
Wyeth W. Wasserman, Centre for Molecular Medicine and Therapeutics, BC Children’s Hospital, University of British Columbia
Stephen W. Scherer, Genetics and Genome Biology, The Hospital for Sick Children
Jan H. Veldink, Department of Neurology, UMC Utrecht Brain Center, Utrecht University
David R. Bentley, Illumina Cambridge Ltd, Illumina Centre
Ryan K. C. Yuen, Genetics and Genome Biology, The Hospital for Sick Children
Melanie Bahlo, Population Health and Immunity Division, The Walter and Eliza Hall Institute of Medical Research
Michael A. Eberle, Illumina Inc.
collection DOAJ
language English
format Article
sources DOAJ
author Egor Dolzhenko
Mark F. Bennett
Phillip A. Richmond
Brett Trost
Sai Chen
Joke J. F. A. van Vugt
Charlotte Nguyen
Giuseppe Narzisi
Vladimir G. Gainullin
Andrew M. Gross
Bryan R. Lajoie
Ryan J. Taft
Wyeth W. Wasserman
Stephen W. Scherer
Jan H. Veldink
David R. Bentley
Ryan K. C. Yuen
Melanie Bahlo
Michael A. Eberle
title ExpansionHunter Denovo: a computational method for locating known and novel repeat expansions in short-read sequencing data
publisher BMC
series Genome Biology
issn 1474-760X
publishDate 2020-04-01
description Abstract Repeat expansions are responsible for over 40 monogenic disorders, and undoubtedly more pathogenic repeat expansions remain to be discovered. Existing methods for detecting repeat expansions in short-read sequencing data require predefined repeat catalogs. Recent discoveries emphasize the need for methods that do not require pre-specified candidate repeats. To address this need, we introduce ExpansionHunter Denovo, an efficient catalog-free method for genome-wide repeat expansion detection. Analysis of real and simulated data shows that our method can identify large expansions of 41 out of 44 pathogenic repeats, including nine recently reported non-reference repeat expansions not discoverable via existing methods.
topic Repeat expansions
Short tandem repeats
Whole-genome sequencing data
Genome-wide analysis
Friedreich ataxia
Myotonic dystrophy type 1
url http://link.springer.com/article/10.1186/s13059-020-02017-z