CoSTA: unsupervised convolutional neural network learning for spatial transcriptomics analysis

Background: The rise of spatial transcriptomics technologies is leading to new insights about how gene regulation happens in a spatial context. Determining which genes are expressed in similar spatial patterns can reveal gene regulatory relationships across cell types in a tissue. However, many curr...

Full description

Bibliographic Details
Main Authors: McCord, R.P (Author), Xu, Y. (Author)
Format: Article
Language:English
Published: BioMed Central Ltd 2021
Subjects:
Online Access:View Fulltext in Publisher
Description
Summary:Background: The rise of spatial transcriptomics technologies is leading to new insights about how gene regulation happens in a spatial context. Determining which genes are expressed in similar spatial patterns can reveal gene regulatory relationships across cell types in a tissue. However, many current analysis methods do not take full advantage of the spatial organization of the data, instead treating pixels as independent features. Here, we present CoSTA: a novel approach to learn spatial similarities between gene expression matrices via convolutional neural network (ConvNet) clustering. Results: By analyzing simulated and previously published spatial transcriptomics data, we demonstrate that CoSTA learns spatial relationships between genes in a way that emphasizes broader spatial patterns rather than pixel-level correlation. CoSTA provides a quantitative measure of expression pattern similarity between each pair of genes rather than only classifying genes into categories. We find that CoSTA identifies narrower, but biologically relevant, sets of significantly related genes as compared to other approaches. Conclusions: The deep learning CoSTA approach provides a different angle to spatial transcriptomics analysis by focusing on the shape of expression patterns, using more information about the positions of neighboring pixels than would an overlap or pixel correlation approach. CoSTA can be applied to any spatial transcriptomics data represented in matrix form and may have future applications to datasets such as histology in which images of different genes are from similar but not identical biological sections. © 2021, The Author(s).
ISBN:14712105 (ISSN)
DOI:10.1186/s12859-021-04314-1