SparkRA: Enabling Big Data Scalability for the GATK RNA-seq Pipeline with Apache Spark
The rapid proliferation of low-cost RNA-seq data has resulted in a growing interest in RNA analysis techniques for various applications, ranging from identifying genotype-phenotype relationships to validating discoveries of other analysis results. However, many practical applications in th...
Main Authors: | Zaid Al-Ars, Saiyi Wang, Hamid Mushtaq |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2020-01-01 |
Series: | Genes |
Subjects: | |
Online Access: | https://www.mdpi.com/2073-4425/11/1/53 |
Similar Items
- Optimizing performance of GATK workflows using Apache Arrow In-Memory data framework
  by: Tanveer Ahmad, et al.
  Published: (2020-11-01)
- GeoSparkSim: A Scalable Microscopic Road Network Traffic Simulator Based on Apache Spark
  Published: (2019)
- Implementing Apache Spark jobs execution and Apache Spark cluster creation for Openstack Sahara[1]
  by: A. Aleksiyants, et al.
  Published: (2018-10-01)
- Mining Formal Concepts in Large Binary Datasets using Apache Spark
  by: Rayabarapu, Varun Raj
  Published: (2021)
- Recommendations for performance optimizations when using GATK3.8 and GATK4
  by: Jacob R Heldenbrand, et al.
  Published: (2019-11-01)