Mercator: A pipeline for multi-method, unsupervised visualization and distance generation

Bibliographic Details
Main Authors: Abrams, Z.B (Author), Coombes, C.E (Author), Coombes, K.R (Author), Li, S. (Author)
Format: Article
Language: English
Published: Oxford University Press 2021
Description
Summary: Unsupervised machine learning provides tools for researchers to uncover latent patterns in large-scale data, based on calculated distances between observations. Methods to visualize high-dimensional data based on these distances can elucidate subtypes and interactions within multi-dimensional and high-throughput data. However, researchers can select from a vast number of distance metrics and visualizations, each with its own strengths and weaknesses. The Mercator R package facilitates selection of a biologically meaningful distance from 10 metrics, which together are appropriate for binary, categorical and continuous data, and visualization with 5 standard and high-dimensional graphics tools. Mercator provides a user-friendly pipeline for informaticians or biologists to perform unsupervised analyses, from exploratory pattern recognition to production of publication-quality graphics. © 2021 The Author(s). Published by Oxford University Press.
ISSN: 1367-4803
DOI: 10.1093/bioinformatics/btab037
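
Illustrative note: the summary's point that the choice of distance metric matters can be seen in a small sketch. The code below is not Mercator's API and is written in Python rather than R. It uses Jaccard and Hamming distances as two familiar choices for binary data; whether these correspond to the package's 10 metrics or their implementation is not stated in this record. The sample vectors and all function names are hypothetical.

```python
# Minimal sketch (assumption: not taken from the Mercator package).
# Shows that two common binary-data distances can disagree about
# which observation is the nearest neighbor of another.

def jaccard_distance(a, b):
    """1 - |shared ones| / |positions where either is one|."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return 0.0 if either == 0 else 1.0 - both / either

def hamming_distance(a, b):
    """Fraction of positions at which the two vectors differ."""
    return sum(1 for x, y in zip(a, b) if x != y) / len(a)

def distance_matrix(rows, metric):
    """Full pairwise distance matrix as a list of lists."""
    return [[metric(r, s) for s in rows] for r in rows]

def nearest_neighbor(matrix, i):
    """Index of the observation closest to observation i (excluding i)."""
    candidates = [(d, j) for j, d in enumerate(matrix[i]) if j != i]
    return min(candidates)[1]

# Hypothetical binary feature vectors (e.g. presence/absence of 8 features).
samples = [
    [1, 1, 0, 0, 0, 0, 0, 0],  # sample 0: two features present
    [1, 1, 1, 1, 1, 1, 0, 0],  # sample 1: shares both of sample 0's features
    [0, 0, 0, 0, 0, 0, 0, 1],  # sample 2: shares none, but is also sparse
]

jaccard = distance_matrix(samples, jaccard_distance)
hamming = distance_matrix(samples, hamming_distance)

print("Jaccard nearest to sample 0:", nearest_neighbor(jaccard, 0))
print("Hamming nearest to sample 0:", nearest_neighbor(hamming, 0))
```

Jaccard ignores positions where both vectors are zero, so it pairs sample 0 with the sample that shares its features. Hamming counts shared absences as agreement, so it pairs sample 0 with the other sparse sample. A visualization built on either matrix would inherit that difference.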