NSDPY: A python package to download DNA sequences from NCBI

Bibliographic Details
Published in: SoftwareX
Main Authors: Raphaël Hebert, Emese Meglécz
Format: Article
Language: English
Published: Elsevier 2022-06-01
Online Access: http://www.sciencedirect.com/science/article/pii/S235271102200036X
Description
Summary: Downloading large batches of DNA sequences can be useful for creating custom databases containing, for example, sequences of a particular genomic region or group of organisms. These sequences can be found in NCBI databases and accessed via a web browser (GUI) or directly via the NCBI API. While the GUI is user-friendly, it lacks certain functionalities. At the other extreme, the API is flexible but requires coding knowledge. NSDPY is a Python package that combines flexibility and ease of use for downloading large numbers of DNA sequences. It includes several taxonomic and filtering options, such as batch downloading sequences for a list of taxa, downloading sequences together with their taxonomic lineage, or filtering CDS sequences for a specific gene. NSDPY is available on PyPI. It is written to minimize dependencies on other packages and to be run directly from the terminal with simple command lines, so that most users can use it without prior coding experience.
ISSN: 2352-7110
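
To illustrate what the summary means by accessing sequences "directly via the NCBI API", the sketch below queries the public NCBI E-utilities endpoints (`esearch` and `efetch`) using only the Python standard library. It is not NSDPY's own code and does not reflect how the package is implemented. The function names, the default `retmax` value, and the example query are assumptions made for illustration.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Base URL of the NCBI E-utilities web API.
EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def esearch_url(term, db="nucleotide", retmax=20):
    """Build an esearch URL that returns matching record IDs as JSON."""
    params = {"db": db, "term": term, "retmax": retmax, "retmode": "json"}
    return f"{EUTILS}/esearch.fcgi?{urlencode(params)}"


def efetch_url(ids, db="nucleotide"):
    """Build an efetch URL that returns the given records in FASTA format."""
    params = {"db": db, "id": ",".join(ids), "rettype": "fasta", "retmode": "text"}
    return f"{EUTILS}/efetch.fcgi?{urlencode(params)}"


def download_fasta(term, retmax=20):
    """Search the nucleotide database and return the hits as FASTA text.

    Hypothetical helper: performs two network requests and has no retry,
    rate limiting, or batching, all of which a real tool would need.
    """
    with urlopen(esearch_url(term, retmax=retmax)) as resp:
        ids = json.load(resp)["esearchresult"]["idlist"]
    if not ids:
        return ""
    with urlopen(efetch_url(ids)) as resp:
        return resp.read().decode("utf-8")
```

Calling `download_fasta("COI[Gene] AND Apis mellifera[Organism]", retmax=5)` would request up to five records over the network. Even this small case needs URL construction, JSON parsing, and two round trips, which is the kind of coding overhead the summary says a command-line tool is meant to remove.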