The rainfall plot: its motivation, characteristics and pitfalls

Abstract Background A visualization referred to as rainfall plot has recently gained popularity in genome data analysis. The plot is mostly used for illustrating the distribution of somatic cancer mutations along a reference genome, typically aiming to identify mutation hotspots. In general terms, t...

Bibliographic Details
Main Authors: Diana Domanska, Daniel Vodák, Christin Lund-Andersen, Stefania Salvatore, Eivind Hovig, Geir Kjetil Sandve
Format: Article
Language: English
Published: BMC 2017-05-01
Series: BMC Bioinformatics
Subjects: Rainfall plot, Visualization, Mutation, Genomics
Online Access: http://link.springer.com/article/10.1186/s12859-017-1679-8
id doaj-2821b5624fec4663b026ddf667a8cac6
record_format Article
spelling doaj-2821b5624fec4663b026ddf667a8cac6 2020-11-24T21:25:19Z eng BMC BMC Bioinformatics 1471-2105 2017-05-01 18 1 1 11 10.1186/s12859-017-1679-8 The rainfall plot: its motivation, characteristics and pitfalls Diana Domanska 0 Daniel Vodák 1 Christin Lund-Andersen 2 Stefania Salvatore 3 Eivind Hovig 4 Geir Kjetil Sandve 5 Department of Informatics, University of Oslo; Department of Tumor Biology, Institute for Cancer Research, Oslo University Hospital; Department of Tumor Biology, Institute for Cancer Research, Oslo University Hospital; Department of Informatics, University of Oslo; Department of Informatics, University of Oslo; Department of Informatics, University of Oslo Abstract. Background: A visualization referred to as rainfall plot has recently gained popularity in genome data analysis. The plot is mostly used for illustrating the distribution of somatic cancer mutations along a reference genome, typically aiming to identify mutation hotspots. In general terms, the rainfall plot can be seen as a scatter plot showing the location of events on the x-axis versus the distance between consecutive events on the y-axis. Despite its frequent use, the motivation for applying this particular visualization and the appropriateness of its usage have never been critically addressed in detail. Results: We show that the rainfall plot allows visual detection even for events occurring at high frequency over very short distances. In addition, event clustering at multiple scales may be detected as distinct horizontal bands in rainfall plots. At the same time, due to the limited size of standard figures, rainfall plots might suffer from inability to distinguish overlapping events, especially when multiple datasets are plotted in the same figure. We demonstrate the consequences of plot congestion, which results in obscured visual data interpretations. Conclusions: This work provides the first comprehensive survey of the characteristics and proper usage of rainfall plots. We find that the rainfall plot is able to convey a large amount of information without any need for parameterization or tuning. However, we also demonstrate how plot congestion and the use of a logarithmic y-axis may result in obscured visual data interpretations. To aid the productive utilization of rainfall plots, we demonstrate their characteristics and potential pitfalls using both simulated and real data, and provide a set of practical guidelines for their proper interpretation and usage. http://link.springer.com/article/10.1186/s12859-017-1679-8 Rainfall plot Visualization Mutation Genomics
collection DOAJ
language English
format Article
sources DOAJ
author Diana Domanska
Daniel Vodák
Christin Lund-Andersen
Stefania Salvatore
Eivind Hovig
Geir Kjetil Sandve
title The rainfall plot: its motivation, characteristics and pitfalls
publisher BMC
series BMC Bioinformatics
issn 1471-2105
publishDate 2017-05-01
description Abstract. Background: A visualization referred to as rainfall plot has recently gained popularity in genome data analysis. The plot is mostly used for illustrating the distribution of somatic cancer mutations along a reference genome, typically aiming to identify mutation hotspots. In general terms, the rainfall plot can be seen as a scatter plot showing the location of events on the x-axis versus the distance between consecutive events on the y-axis. Despite its frequent use, the motivation for applying this particular visualization and the appropriateness of its usage have never been critically addressed in detail. Results: We show that the rainfall plot allows visual detection even for events occurring at high frequency over very short distances. In addition, event clustering at multiple scales may be detected as distinct horizontal bands in rainfall plots. At the same time, due to the limited size of standard figures, rainfall plots might suffer from inability to distinguish overlapping events, especially when multiple datasets are plotted in the same figure. We demonstrate the consequences of plot congestion, which results in obscured visual data interpretations. Conclusions: This work provides the first comprehensive survey of the characteristics and proper usage of rainfall plots. We find that the rainfall plot is able to convey a large amount of information without any need for parameterization or tuning. However, we also demonstrate how plot congestion and the use of a logarithmic y-axis may result in obscured visual data interpretations. To aid the productive utilization of rainfall plots, we demonstrate their characteristics and potential pitfalls using both simulated and real data, and provide a set of practical guidelines for their proper interpretation and usage.
topic Rainfall plot
Visualization
Mutation
Genomics
url http://link.springer.com/article/10.1186/s12859-017-1679-8
_version_ 1725983501955629056
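
The abstract defines the rainfall plot as a scatter plot of event location (x-axis) against the distance between consecutive events (y-axis). A minimal sketch of how those coordinates could be computed is given below. It is an illustration only, not the authors' implementation: the function name `rainfall_points`, the sample positions, and the choice to measure each distance back to the previous event (rather than forward to the next) are all assumptions.

```python
from typing import List, Tuple


def rainfall_points(positions: List[int]) -> List[Tuple[int, int]]:
    """Return (position, distance to previous event) pairs for one sequence.

    Assumption: the distance is measured to the preceding event, so the
    first event has no predecessor and produces no point.
    """
    ordered = sorted(positions)
    return [(cur, cur - prev) for prev, cur in zip(ordered, ordered[1:])]


# Hypothetical event positions: a tight cluster near 1,000 followed by
# two isolated events much further along the sequence.
events = [1000, 1003, 1010, 1012, 50000, 120000]
points = rainfall_points(events)

for position, distance in points:
    print(position, distance)
```

In this sketch the clustered events produce small y-values and the isolated events produce large ones, which is the contrast the plot relies on to make hotspots visible. Plotting the pairs with any scatter-plot tool would complete the figure.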