Welcome to RETROFIT, a Bioconductor package for reference-free learning of cell-type composition and cell-type-specific gene expression in spatial transcriptomics (ST).
If you find this R package or any part of this repository useful for your work, please kindly cite the following research article:
Roopali Singh, Xi He, Adam Keebum Park, Ross Cameron Hardison, Xiang Zhu, Qunhua Li. RETROFIT: Reference-free deconvolution of cell-type mixtures in spatial transcriptomics. bioRxiv (2023) https://doi.org/10.1101/2023.06.07.544126.
Correspondence should be addressed to X.Z. (xiangzhu[at]psu.edu) and Q.L. (qunhua.li[at]psu.edu).
ST profiles gene expression in intact tissues. However, ST data measured at each spatial location may represent gene expression of multiple cell types, making it difficult to identify cell-type-specific transcriptional variation across spatial contexts. Existing cell-type deconvolution methods for ST data often require single-cell transcriptomic references, which can be limited by the availability, completeness and platform effects of such references. We present RETROFIT, a reference-free Bayesian method that produces sparse and interpretable solutions to deconvolve the cell types underlying each location, independent of single-cell transcriptomic references.
Built on a Bayesian hierarchical model, RETROFIT deconvolves the ST data matrix (
The following figure shows the method schematic of RETROFIT. First, RETROFIT takes an ST data matrix as the only input and decomposes this matrix into latent components in an unsupervised manner, without using any single-cell transcriptomic information. Second, RETROFIT matches these latent components to known cell types using either a cell-type-specific gene expression reference or a list of cell-type-specific marker genes for the cell types present in the ST sample. Lastly, RETROFIT outputs a cell-type-specific gene expression matrix and a cell-type proportion matrix.
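To make the matching step concrete, here is a minimal base-R sketch of matching latent components to cell types by correlation on toy data. It does not call the retrofit package and is not necessarily the rule that RETROFIT applies internally: the toy matrices (`sc_ref_toy`, `W_toy`), the sizes, and the per-cell-type argmax are all assumptions made for illustration.

```r
set.seed(1)

G <- 50  # genes
K <- 3   # known cell types in the (hypothetical) reference
L <- 6   # latent components, deliberately more than K

# Hypothetical reference: genes x cell types
sc_ref_toy <- matrix(rgamma(G * K, shape = 1, rate = 0.2), nrow = G,
                     dimnames = list(paste0("gene", 1:G), paste0("type", 1:K)))

# Stand-in for a decomposition result (genes x components):
# the first K columns are noisy, shuffled copies of the reference columns,
# the remaining L - K columns are unrelated noise.
perm <- sample(K)
noisy_copies <- sc_ref_toy[, perm] * matrix(runif(G * K, 0.9, 1.1), G, K)
noise_cols <- matrix(rgamma(G * (L - K), shape = 1, rate = 0.2), G, L - K)
W_toy <- cbind(noisy_copies, noise_cols)
colnames(W_toy) <- paste0("component", 1:L)

# Correlate every reference cell type with every latent component (K x L)
cor_mat <- cor(sc_ref_toy, W_toy)

# For each cell type, keep the component with the highest correlation
best <- apply(cor_mat, 1, which.max)

# Reorder and relabel the selected components by cell type (genes x K)
W_matched <- W_toy[, best]
colnames(W_matched) <- colnames(sc_ref_toy)

print(best)
```

A per-row argmax does not enforce a one-to-one assignment, so on real data two cell types could in principle select the same component; a stricter matching rule would be needed to rule that out.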
To install retrofit from Bioconductor, please start R (version "4.3") and enter:
if (!require("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("retrofit")

Alternatively, please follow these steps to install retrofit from GitHub:
install.packages("devtools")
devtools::install_github("qunhualilab/retrofit")

Please follow the code below to run RETROFIT on your ST data.
library(retrofit)
## load the built-in demo data
utils::data(testSimulationData)
x <- testSimulationData$extra5_x
sc_ref <- testSimulationData$sc_ref
marker_ref <- testSimulationData$marker_ref
## decompose the ST data matrix into latent components
res <- retrofit::decompose(x, L=16, iterations=100, verbose=TRUE)
W <- res$w
H <- res$h
TH <- res$th
## match the latent components to known cell types using a cell-type-specific gene expression reference
res <- retrofit::annotateWithCorrelations(sc_ref, K=8, decomp_w=W, decomp_h=H)
W_annotated <- res$w
H_annotated <- res$h
cells <- res$ranked_cells
## alternatively, match the latent components to known cell types using a list of cell-type-specific marker genes
## (this overwrites the results of the correlation-based annotation above)
res <- retrofit::annotateWithMarkers(marker_ref, K=8, decomp_w=W, decomp_h=H)
W_annotated <- res$w
H_annotated <- res$h
cells <- res$ranked_cells

Here we provide two vignettes illustrating how to use the RETROFIT R package.
The first and simpler vignette (RMD file; HTML file) aims to help users get started with RETROFIT and understand its basic usage on a simulated ST dataset. Running the code in this vignette will give users an overall picture of what RETROFIT can do.
The second and slightly more advanced vignette (RMD file; HTML file) aims to showcase how to use RETROFIT in a real-world ST study. This vignette uses ST data from a human fetal intestine sample, generated on the 10x Genomics Visium Spatial Gene Expression platform by Fawkner-Corbett et al. (Cell 2021).
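As a small follow-up to the usage code above, the sketch below shows one way an annotated H-like matrix could be turned into per-spot proportions and a dominant cell type per spot. It runs on a hand-written toy matrix rather than on RETROFIT output, and it assumes a layout of one row per cell type and one column per spot; `H_toy` and its labels are hypothetical, so please check the dimensions of your own `H_annotated` before adapting it.

```r
# Toy stand-in for an annotated matrix: cell types (rows) x spots (columns).
# matrix() fills column by column, so each group of three values is one spot.
H_toy <- matrix(c(5.0, 1.0, 0.5,
                  0.2, 3.0, 1.0,
                  1.0, 1.0, 6.0,
                  2.0, 2.0, 2.5),
                nrow = 3,
                dimnames = list(c("typeA", "typeB", "typeC"),
                                paste0("spot", 1:4)))

# Divide each column by its sum so that every spot's values add up to 1
prop <- sweep(H_toy, 2, colSums(H_toy), "/")

# Label each spot with the cell type that has the largest proportion
dominant <- rownames(prop)[apply(prop, 2, which.max)]
names(dominant) <- colnames(prop)

print(round(prop, 2))
print(table(dominant))
```

Column-wise normalization is a simple convention chosen here for illustration; the vignettes remain the reference for how proportions are derived and interpreted in practice.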
