High-throughput whole-genome sequencing of E14 mouse embryonic stem cells

Mouse E14 embryonic stem cells (ESCs) are the most widely used ESC line and are often employed for genome-wide studies involving next-generation sequencing analysis [1–5]. More than 2 × 10^9 sequences generated on the Illumina platform from the genome of E14 embryonic stem cells cultured in our laboratory were us...

Bibliographic Details
Main Authors: Danny Incarnato, Francesco Neri
Format: Article
Language: English
Published: Elsevier 2015-03-01
Series: Genomics Data
Subjects:
ESC
NGS
Online Access: http://www.sciencedirect.com/science/article/pii/S2213596014001123
id doaj-d0027d1964524834ba8cdb4b4ef90224
record_format Article
spelling doaj-d0027d1964524834ba8cdb4b4ef90224 | 2020-11-25T02:56:28Z | eng | Elsevier | Genomics Data | 2213-5960 | 2015-03-01 | 3 | C | 6 | 7 | 10.1016/j.gdata.2014.10.023 | High-throughput whole-genome sequencing of E14 mouse embryonic stem cells | Danny Incarnato | Francesco Neri | http://www.sciencedirect.com/science/article/pii/S2213596014001123 | ESC | NGS | Whole-genome E14
collection DOAJ
language English
format Article
sources DOAJ
author Danny Incarnato
Francesco Neri
author_facet Danny Incarnato
Francesco Neri
author_sort Danny Incarnato
title High-throughput whole-genome sequencing of E14 mouse embryonic stem cells
publisher Elsevier
series Genomics Data
issn 2213-5960
publishDate 2015-03-01
description Mouse E14 embryonic stem cells (ESCs) are the most widely used ESC line and are often employed for genome-wide studies involving next-generation sequencing analysis [1–5]. More than 2 × 10^9 sequences generated on the Illumina platform from the genome of E14 embryonic stem cells cultured in our laboratory were used to build a database of about 2.7 × 10^6 single-nucleotide variants [6]. The database was validated using two other sequencing datasets generated outside our laboratory, and a high overlap was observed. The identified variants are enriched in intergenic regions, but several thousand reside in gene exons and regulatory regions, such as promoters, enhancers, splice sites and untranslated regions of RNA, indicating a high probability of an important functional impact on the molecular biology of these cells. We created a new E14 genome assembly that includes the newly identified variants and used it to map reads from next-generation sequencing data generated on the E14 cell line in our laboratory or in others. We observed an increase of about 5% in the number of mapped reads. CpG dinucleotides showed the highest variation frequency, probably because they can be a target of DNA methylation. Data were deposited in GEO datasets under reference GSM1283021 and here: http://epigenetics.hugef-research.org/data.php.
topic ESC
NGS
Whole-genome E14
url http://www.sciencedirect.com/science/article/pii/S2213596014001123