A validated lineage-derived somatic truth data set enables benchmarking in cancer genome analysis

Existing cancer benchmark data sets for human sequencing data use germline variants, synthetic methods, or expensive validations, none of which are satisfactory for providing a large collection of true somatic variation across a whole genome. Here we propose a data set, Lineage derived Somatic Truth...

Bibliographic Details
Main Authors: Shand, Megan (Author), Soto, Jose (Author), Lichtenstein, Lee (Author), Benjamin, David (Author), Farjoun, Yossi (Author), Brody, Yehuda (Author), Maruvka, Yosef (Author), Blainey, Paul C. (Author), Banks, Eric (Author)
Other Authors: Massachusetts Institute of Technology. Department of Biological Engineering (Contributor), Koch Institute for Integrative Cancer Research at MIT (Contributor)
Format: Article
Language: English
Published: Springer Science and Business Media LLC, 2022-02-09T16:06:34Z.
Online Access: Get fulltext (https://hdl.handle.net/1721.1/133024.2)
LEADER 01486 am a22002653u 4500
001 133024.2
042 |a dc 
100 1 0 |a Shand, Megan  |e author 
100 1 0 |a Massachusetts Institute of Technology. Department of Biological Engineering  |e contributor 
100 1 0 |a Koch Institute for Integrative Cancer Research at MIT  |e contributor 
700 1 0 |a Soto, Jose  |e author 
700 1 0 |a Lichtenstein, Lee  |e author 
700 1 0 |a Benjamin, David  |e author 
700 1 0 |a Farjoun, Yossi  |e author 
700 1 0 |a Brody, Yehuda  |e author 
700 1 0 |a Maruvka, Yosef  |e author 
700 1 0 |a Blainey, Paul C.  |e author 
700 1 0 |a Banks, Eric  |e author 
245 0 0 |a A validated lineage-derived somatic truth data set enables benchmarking in cancer genome analysis 
260 |b Springer Science and Business Media LLC,   |c 2022-02-09T16:06:34Z. 
856 |z Get fulltext  |u https://hdl.handle.net/1721.1/133024.2 
520 |a Existing cancer benchmark data sets for human sequencing data use germline variants, synthetic methods, or expensive validations, none of which are satisfactory for providing a large collection of true somatic variation across a whole genome. Here we propose a data set, Lineage derived Somatic Truth (LinST), of short somatic mutations in the HT115 colon cancer cell line that are validated using a known cell lineage that includes thousands of mutations and a high-confidence region covering 2.7 gigabases per sample. 
546 |a en 
655 7 |a Article 
773 |t Communications Biology