Summary: | <p>Abstract</p> <p>Background</p> <p>We performed large-scale bacterial artificial chromosome (BAC) end-sequencing of two BAC libraries (an <it>Eco</it>RI- and a <it>Bam</it>HI-digested library) and conducted an <it>in silico</it> analysis to characterize the resulting sequence data and make them a useful resource for genomic research on the silkworm (<it>Bombyx mori</it>).</p> <p>Results</p> <p>We obtained more than 94,000 BAC end sequences (BESs), comprising more than 55 Mbp and covering about 10.4% of the silkworm genome. Repeat-sequence analysis with known repeat sequences indicated that long interspersed nuclear elements (LINEs) were abundant in <it>Bam</it>HI BESs, whereas DNA-type elements were abundant in <it>Eco</it>RI BESs. Further repeat-sequence analysis suggested that this abundance of LINEs might be due to a GC bias of the restriction sites and that the GC content of silkworm LINEs was higher than that of mammalian LINEs. In a BLAST-based sequence analysis of the BESs against two available whole-genome shotgun sequence data sets, more than 70% of the BESs had a BLAST hit with an identity of ≥ 99%. About 14% of <it>Eco</it>RI BESs and about 8% of <it>Bam</it>HI BESs were paired-end clones with unique sequences at both ends. Cluster analysis of the BESs clarified the proportion of BESs containing protein-coding regions.</p> <p>Conclusion</p> <p>As a result of this characterization, the BESs will be a valuable resource for genomic research on <it>Bombyx mori</it>, for example, as a basis for constructing a BAC-based physical map. The use of multiple complementary BAC libraries constructed with different restriction enzymes makes the BESs an even more valuable genomic resource. The GenBank accession numbers of the obtained end sequences are <ext-link ext-link-type="gen" ext-link-id="DE283657">DE283657</ext-link>–<ext-link ext-link-type="gen" ext-link-id="DE378560">DE378560</ext-link>.</p>
|