nature methods

Article https://doi.org/10.1038/s41592-024-02316-4

Deciphering spatial domains from spatial multi-omics with SpatialGlue

Yahui Long 1, Kok Siong Ang 1, Raman Sethi 2, Sha Liao 3,4, Yang Heng 3,4, Lynn van Olst 5, Shuchen Ye 2, Chengwei Zhong 1, Hang Xu 2, Di Zhang 6, Immanuel Kwok 7, Nazihah Husna 7,8,9, Min Jian 3,10, Lai Guan Ng 11, Ao Chen 3,4,12, Nicholas R. J. Gascoigne 8,9,13, David Gate 5, Rong Fan 6, Xun Xu 3 & Jinmiao Chen 1,2,8,9,14

Received: 10 May 2023
Accepted: 20 May 2024
Published online: 21 June 2024

Advances in spatial omics technologies now allow multiple types of data to
be acquired from the same tissue slice. To realize the full potential of such
data, we need spatially informed methods for data integration. Here, we
introduce SpatialGlue, a graph neural network model with a dual-attention
mechanism that deciphers spatial domains by intra-omics integration
of spatial location and omics measurement followed by cross-omics
integration. We demonstrated SpatialGlue on data acquired from different
tissue types using different technologies, including spatial epigenome–
transcriptome and transcriptome–proteome modalities. Compared to
other methods, SpatialGlue captured more anatomical details and more
accurately resolved spatial domains such as the cortex layers of the brain.
Our method also identified cell types like spleen macrophage subsets
located at three different zones that were not available in the original data
annotations. SpatialGlue scales well with data size and can be used to
integrate three modalities. Our spatial multi-omics analysis tool combines
the information from complementary omics modalities to obtain a holistic
view of cellular and tissue properties.

Spatial transcriptomics is the next major development in analyzing biological samples since the advent of single-cell transcriptomics. Currently, spatial technologies are expanding to spatial multi-omics with the simultaneous profiling of different omics on a single tissue section. These technologies can be roughly divided into two categories, sequencing based and imaging based. Sequencing-based techniques include DBiT-seq1, spatial-CITE-seq2, spatial assay for transposase-accessible chromatin and RNA using sequencing (spatial ATAC–RNA-seq) and CUT&Tag-RNA-seq3, SPOTS4, SM-Omics5, Stereo-CITE-seq6, spatial RNA-TCR-seq7 and 10x Genomics Xenium8, while imaging-based techniques include DNA seqFISH+9, DNA-MERFISH-based DNA and RNA profiling10, MERSCOPE11 and Nanostring CosMx12. With these technologies, we can now acquire multiple complementary views of each cell within their spatial context. This offers the potential for developing deeper insights into cellular and emergent tissue properties.

To fully utilize spatial multi-omics data to construct a coherent picture of the tissue under study, spatially aware integration of heterogeneous data modalities is required. Such multi-omics data integration poses a major challenge as different modalities have feature counts that can vary enormously (for example, number of proteins versus transcripts measured) and possess different statistical distributions. This challenge is deepened when integrating spatial information with feature counts within each data modality. To our knowledge, there is no tool designed specifically for spatial multi-omics acquired from the same tissue section. Existing methods either are unimodal or do not use spatial information, except for one tool with functionality for spatial multi-omics integration, MEFISTO, which has only been previously demonstrated on single-cell multi-omics or spatial transcriptomics separately. For the non-spatial multi-omics data integration methods, a wide range of algorithms are available. These include Seurat WNN13,

A full list of affiliations appears at the end of the paper.

Nature Methods | Volume 21 | September 2024 | 1658–1667 1658



MOFA+14, StabMap15, totalVI16, MultiVI17 and scMM18. Moreover, some of these methods are designed for specific data modalities, which can be restrictive. For example, totalVI is designed for CITE-seq data of RNA and protein modalities, while MultiVI is optimized for gene expression and chromatin accessibility. For spatial omics tools, examples include STAGATE19, SpaGCN20 and GraphST21, which integrate spatial information and single omics modalities. Such single omics methods can only handle spatial multi-omics data by concatenating the feature count data from different omics modalities. This approach assumes that features across different omics have the same importance, which may not be true. Therefore, tools specifically tailored for spatial multi-omics data are needed to handle the challenges of integrating spatial multi-omics data for downstream analyses. In particular, we need new methods that are capable of spatially aware cross-omics integration.

Here we present SpatialGlue for spatial multi-omics analysis. Specifically, SpatialGlue is a spatially aware method that integrates multiple spatial omics data modalities, acquired from the same tissue slice, to decipher spatial domains of tissue samples at a higher spatial resolution. SpatialGlue uses graph neural networks to learn a low-dimensional embedding for each data modality, followed by data integration across modalities. To integrate spatial information with individual omics data and integrate across omics, we adopted a dual-attention aggregation mechanism to adaptively capture the importance of different modalities, resulting in more accurate integration. We first tested SpatialGlue on simulated and experimentally acquired data of the human lymph node with ground truth to benchmark its performance with other methods. SpatialGlue achieved better quantitative performance than the other methods and captured more anatomical details. We then tested SpatialGlue and competing methods on integrating the spatial epigenome and transcriptome of the mouse brain, and spatial transcriptome and proteome data acquired from the mouse thymus and spleen. SpatialGlue leveraged the epigenome–transcriptome data to differentiate more cortex layers than the original data annotation, and the transcriptome–proteome to distinguish macrophage subsets within the spleen. These results highlight the power of multimodal spatial omics for analyzing biological complexity.

Results
SpatialGlue model structure
SpatialGlue deciphers spatial domains of tissue samples at a higher resolution by effectively integrating multi-omics modalities data with spatial information (Fig. 1a). SpatialGlue is a graph neural network (GNN)-based deep learning model (Fig. 1b). The input data to SpatialGlue can be feature matrices of segmented cells or capture locations (beads, voxels, pixels, bins or spots), with accompanying spatial coordinates. We refer to the cells and the capture locations as spots hereafter for brevity and not to restrict SpatialGlue to any specific technological platform or resolution. For data integration, SpatialGlue uses a dual-attention mechanism at two levels, within-modality spatial information and measurement feature integration first, and then between-modality multi-omics integration.

SpatialGlue first learns a low-dimensional embedding within each modality using the spatial coordinates and omics data. Within each modality, SpatialGlue constructs a spatial proximity graph and a feature similarity graph, which are used separately to encode the preprocessed feature count data into a common low-dimensional embedding space. Here the spatial proximity graph captures spatial relationships between spots, while the feature graph captures feature similarities. These constructed graphs possess unique semantic information that can be integrated to better capture cellular heterogeneity. However, the different graphs can contribute differential importance to each spot, posing a challenge to capture this difference. Therefore, we adopted a within-modality attention aggregation layer to adaptively integrate the spatial and feature graph-specific representations and derive modality-specific representations. Specifically, the model learns graph-specific weights to assign importance to each graph. Similarly, the different omics modalities can have distinct and complementary contributions to each spot. Thus, we further designed a between-modality attention aggregation layer that learns modality-specific importance weights and adaptively integrates the modality-specific representations to generate the final cross-modality integrated latent representation. The learned weights illustrate the contribution of each modality to the learned latent representation of each spot and, consequently, the demarcation of different spatial domains or cell types. After obtaining SpatialGlue’s integrated multi-omics representation, we can then use clustering to identify biologically relevant spatial domains, which consist of cells that are coherent spatially and across the measured omics. Such spatial domains can range from local clusters of distinct cell states to functionally distinct anatomical structures. We believe this attention-based approach enables more accurate integration than summation or concatenation of the feature matrices.

To evaluate the effectiveness of the proposed SpatialGlue model, we initially validated the importance of attention and other components with a series of ablation studies using simulated data (Supplementary Fig. 1; full experimental details are in the Supplementary Information). Subsequently, we characterized SpatialGlue’s sensitivity to the number of neighbors and principal component analysis (PCA) dimensions of the input, and the number of GNN layers (Extended Data Fig. 1; full results are in the Supplementary Information).

Benchmarking SpatialGlue and existing methods on simulated and experimental spatial multi-omics data
We first benchmarked SpatialGlue with competing methods using simulated data and experimentally acquired data with ground truth labels. With the ground truth available, we could assess performance with supervised metrics, namely homogeneity, mutual information, V-measure, AMI (adjusted mutual information), NMI (normalized mutual information) and ARI (adjusted Rand index). We generated a set of simulated data consisting of two modalities that contained unique and complementary information of the ground truth (Fig. 2a). Specifically, factors 1, 3 and 4 were determined by modality 1, while factor 2 was uniquely identified through modality 2. The modalities were designed to simulate the transcriptome and proteome, respectively, with the first modality following the zero-inflated negative binomial distribution and the second following the negative binomial distribution (Fig. 2b). For comparison, we tested seven competing methods: Seurat, totalVI, MultiVI, MOFA+, MEFISTO, scMM and StabMap, alongside SpatialGlue. Visually, SpatialGlue was able to clearly recover all four spatial factors to closely match the ground truth (Fig. 2a). Seurat and MEFISTO were able to clearly recover two factors (factors 2 and 4 for Seurat, 3 and 4 for MEFISTO). Other methods were able to recover some of the factors but with much higher levels of noise (factor 2 for totalVI, 1 and 2 for MOFA+, MultiVI and scMM, 2 and 3 for StabMap). The metrics confirmed the visuals with SpatialGlue scoring top in all metrics, followed by Seurat and MEFISTO (Fig. 2c). We further tested all methods with four more datasets generated with modified distribution parameters (Supplementary Figs. 2–4) and measured their performance with the same metrics, summarizing the results with box plots (Fig. 2d). Again, SpatialGlue performed the best with little variance between different datasets. Lastly, we have also demonstrated using simulated data that the SpatialGlue framework is extensible to three or more modalities (Extended Data Fig. 2).

For the second example, we benchmarked SpatialGlue and the same competing methods with an in-house human lymph node dataset generated using 10x Genomics Visium RNA and protein co-profiling technology (section A1). Here we used the hematoxylin and eosin (H&E)-based annotation as the ground truth (Fig. 2e). In the annotation, the major structures include the pericapsular adipose tissue and capsule that form the outer layers of the bulb, while the cortex and medulla (sinuses, cords and vessels) form the core internal structures.
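To make the supervised comparison against ground truth labels concrete, the adjusted Rand index (one of the six metrics named in this section) can be sketched in a few lines. This is a minimal illustration using only the Python standard library; the function name `adjusted_rand_index` and the toy labels are hypothetical, and the text here does not state which implementation the authors used (scikit-learn, for instance, offers `adjusted_rand_score` for the same quantity).

```python
from collections import Counter
from math import comb


def adjusted_rand_index(truth, pred):
    """Adjusted Rand index between two labelings of the same spots.

    truth, pred: equal-length sequences of hashable cluster labels.
    Returns 1.0 for identical partitions (up to label renaming) and
    values near 0 for chance-level agreement.
    """
    n = len(truth)
    if n != len(pred):
        raise ValueError("label sequences must have the same length")
    if n < 2:
        return 1.0  # no pairs to compare; treat as trivially identical

    # Pairs of spots that share a cell of the contingency table,
    # a ground truth class, or a predicted cluster.
    same_both = sum(comb(c, 2) for c in Counter(zip(truth, pred)).values())
    same_truth = sum(comb(c, 2) for c in Counter(truth).values())
    same_pred = sum(comb(c, 2) for c in Counter(pred).values())

    expected = same_truth * same_pred / comb(n, 2)
    max_index = 0.5 * (same_truth + same_pred)
    if max_index == expected:
        return 1.0  # degenerate case, e.g. a single cluster in both labelings
    return (same_both - expected) / (max_index - expected)


# Toy labels (hypothetical): four spots, two domains.
ground_truth = [0, 0, 1, 1]
renamed_clusters = [1, 1, 0, 0]   # same partition, different label names
crossed_clusters = [0, 1, 0, 1]   # partition that cuts across the truth

ari_renamed = adjusted_rand_index(ground_truth, renamed_clusters)
ari_crossed = adjusted_rand_index(ground_truth, crossed_clusters)
```

Because the index depends only on which spots are grouped together, it is unaffected by how clusters are numbered, which matters when comparing methods whose cluster labels do not correspond to one another.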


[Figure 1. Panel a: spatial transcriptome, spatial epigenome and spatial proteome data from a single tissue slice, together with spatial coordinates (x, y), are integrated by SpatialGlue into fine-grained spatial domains. Panel b: data preprocessing, encoding within individual modality, within-modality attention aggregation layer, between-modality attention aggregation layer, decoding and training.]

Fig. 1 | Interpretable deep dual-attention model for spatial multi-omics data analysis. a, SpatialGlue overview. SpatialGlue is designed to integrate the different omics modalities with spatial information to obtain a comprehensive molecular view of the tissue under study. b, SpatialGlue model structure. SpatialGlue first uses the k-nearest-neighbor (KNN) algorithm to construct a spatial neighbor graph using the spatial coordinates and a feature neighbor graph with the normalized expression data for each omics modality. Then for each modality, a GNN encoder takes in the normalized expressions and the neighbor graph to learn two graph-specific representations by iteratively aggregating representations of neighbors. To capture the importance of different graphs, we designed a within-modality attention aggregation layer to adaptively integrate graph-specific representations and obtain a modality-specific representation. Finally, to preserve the importance of different modalities, SpatialGlue uses a between-modality attention aggregation layer to adaptively integrate modality-specific representations and output the final integrated representation of spots.
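The two attention aggregation steps described for Fig. 1b can be illustrated with a small sketch. It assumes a plain softmax attention in which each candidate representation is scored against a fixed vector; the names `attention_aggregate` and `q` and all numbers are hypothetical and are not taken from the SpatialGlue implementation, where the attention parameters are learned during training.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_aggregate(reps, q):
    """Combine several representations of one spot into a single vector.

    reps: list of equal-length vectors (one per graph or per modality).
    q: scoring vector (hypothetical stand-in for learned parameters).
    Returns (combined_vector, attention_weights).
    """
    # Score each representation, then normalize scores into weights.
    scores = [sum(math.tanh(x) * w for x, w in zip(rep, q)) for rep in reps]
    weights = softmax(scores)
    dim = len(reps[0])
    combined = [sum(w * rep[d] for w, rep in zip(weights, reps))
                for d in range(dim)]
    return combined, weights


# Toy example for one spot (all values hypothetical).
q = [0.5, -0.25, 1.0]
rna_spatial, rna_feature = [0.2, 0.1, 0.9], [0.4, 0.3, 0.1]
prot_spatial, prot_feature = [0.7, 0.2, 0.3], [0.6, 0.1, 0.5]

# Level 1: within-modality aggregation of the two graph-specific views.
rna_rep, rna_graph_w = attention_aggregate([rna_spatial, rna_feature], q)
prot_rep, prot_graph_w = attention_aggregate([prot_spatial, prot_feature], q)

# Level 2: between-modality aggregation into the integrated representation.
joint_rep, modality_w = attention_aggregate([rna_rep, prot_rep], q)
```

In this sketch `modality_w` plays the role of the per-spot modality weights: it sums to 1 and indicates how much each modality contributes to the integrated vector for that spot.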

For comparison, we also plotted the single modality PCA-based clustering of RNA and protein (Fig. 2f). All the methods were able to isolate the paracortex (T cell zone, SpatialGlue cluster 1) that more resembled the RNA-specific and protein-specific modalities than the H&E annotation, which is unsurprising because T cells can be better identified by protein and gene markers such as CD8A, CD3E and CCR7 (Supplementary Fig. 5). The methods were also unable to differentiate capsule layers from the pericapsular adipose tissue, which were not well captured in the RNA and protein modalities either. Among the tested methods, SpatialGlue, Seurat, totalVI and MOFA+ were able to identify the follicle regions, while MultiVI, scMM, MEFISTO and StabMap could not. The hilum, which normally accumulates fat, is only visible in the RNA modality, and only MOFA+ and SpatialGlue could separate it from the pericapsular layer. To assess performance quantitatively, we used both unsupervised and supervised metrics. We first used the unsupervised Moran’s I score and Jaccard similarity coefficient to assess spatial autocorrelation of clusters and preservation of distance in the joint latent space, respectively. The Moran’s I score was computed for each cluster and visualized as a box plot for each method. SpatialGlue outperformed all other methods with a median score of 0.62 (Fig. 2g). We computed


[Figure 2, panels a–i. a: spatial plots of the ground truth (factors 1–4 and background), modality 1, modality 2 and the eight methods. b: density versus expression for modality 1 and modality 2. c, d: metric value per method for homogeneity, mutual information, V-measure, AMI, NMI and ARI. e: ground truth annotation (cortex, medulla sinuses, follicle, medulla cords, pericapsular adipose tissue, hilum, capsule, medulla vessels, subcapsular sinus, trabeculae). f: spatial plots of RNA, protein and the eight methods with clusters 1–6. g: Moran’s I score per method. h: Jaccard similarity (RNA-joint and protein-joint) per method. i: supervised metric values per method.]

Fig. 2 | SpatialGlue accurately identified spatial domains in simulated and real data. a, Spatial plots of the simulated data, from left to right: ground truth, generated raw data of individual modalities, and clustering results by single-cell and spatial multi-omics integration methods: Seurat, totalVI, MultiVI, MOFA+, MEFISTO, scMM, StabMap and SpatialGlue. ‘backgr’ means background. b, Density distribution of the simulated data modalities. c, Quantitative evaluation of methods with six supervised metrics. d, Box plots of the six metrics with the scores from five simulated datasets. In the box plot, the center line denotes the median, box limits denote the upper and lower quartiles, and whiskers denote 1.5 times the interquartile range. n = 5 simulated samples. e, Manual annotation of human lymph node sample A1. f, Spatial plots of lymph node sample A1, clustering of individual RNA and protein modalities (left), clustering results (right) from single and spatial multi-omics integration methods: Seurat, totalVI, MultiVI, MOFA+, MEFISTO, scMM, StabMap and SpatialGlue. Note that the colors of clusters do not directly correspond to the same captured structures across different methods. g, Box plots of Moran’s I score of the eight methods. h, Jaccard similarity scores of the eight methods. In the box plot, the center line denotes the median, box limits denote the upper and lower quartiles, and whiskers denote 1.5 times the interquartile range. n = 6 clusters. i, Box plots of six supervised metrics with scores of clustering results with the number of clusters ranging from 4 to 11. In the box plot, the center line denotes the median, box limits denote the upper and lower quartiles, and whiskers denote 1.5 times the interquartile range. n = 12 clustering results with different resolutions for each method.
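The per-cluster Moran’s I score reported in panel g measures how spatially coherent a cluster is. The sketch below shows one way such a score could be computed. It assumes a binary membership indicator for the cluster and binary k-nearest-neighbor spatial weights; both choices, the function names and the toy grid are illustrative assumptions, not details confirmed by the text.

```python
import math


def knn_weights(coords, k):
    """Binary spatial weights: w[i][j] = 1 if j is one of i's k nearest spots."""
    n = len(coords)
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        others = sorted((j for j in range(n) if j != i),
                        key=lambda j: math.dist(coords[i], coords[j]))
        for j in others[:k]:
            w[i][j] = 1.0
    return w


def morans_i(values, w):
    """Moran's I of `values` under spatial weight matrix `w`."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    denom = sum(d * d for d in dev)
    w_sum = sum(sum(row) for row in w)
    if denom == 0 or w_sum == 0:
        return float("nan")  # undefined for constant values or empty graphs
    num = sum(w[i][j] * dev[i] * dev[j]
              for i in range(n) for j in range(n))
    return (n / w_sum) * num / denom


# Toy 4 x 4 grid of spots (hypothetical data).
coords = [(x, y) for y in range(4) for x in range(4)]
w = knn_weights(coords, k=4)

# Membership indicator of a compact cluster (left half of the grid) ...
halves = [1.0 if x < 2 else 0.0 for x, y in coords]
# ... and of a maximally fragmented "cluster" (checkerboard pattern).
checker = [float((x + y) % 2) for x, y in coords]

score_compact = morans_i(halves, w)
score_fragmented = morans_i(checker, w)
```

A compact cluster yields a positive score and a fragmented one a negative score, which is why noisier clustering outputs are expected to score lower on this metric.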


a Jaccard similarity score to quantify the overlap of neighbor sets between the joint space and each modality. Summed together, the total Jaccard similarity of SpatialGlue also outperformed the other methods with MOFA+ as a close second (Fig. 2h). For the supervised metrics computed with respect to the ground truth, SpatialGlue likewise outperformed all other methods with six clusters (Extended Data Fig. 3c). We further generated different numbers of clusters and the resulting box plots of supervised metrics showed stable results regardless of clustering resolution (Fig. 2i). To ensure that the results were not predicated on a specific tissue section, we applied the same methods to another human lymph node section (D1). With this data, SpatialGlue showed comparable scores with six clusters, but achieved more stable performance across different clustering resolutions than other methods (Extended Data Fig. 3d–i).

Capturing mouse brain anatomy from spatial epigenome–transcriptome data at higher resolution than with individual modalities
Next, we applied SpatialGlue to mouse brain epigenome–transcriptome datasets to showcase its ability to resolve spatial domains at a higher resolution than methods used in the original study. We first tested SpatialGlue on a postnatal day (P)22 mouse brain coronal section dataset acquired using spatial ATAC–RNA-seq3 to measure mRNA and open chromatin regions. We used the Allen brain atlas reference to annotate anatomical regions such as the cortex layers (ctx), genu of corpus callosum (ccg), lateral septal nucleus (ls) and nucleus accumbens (acb; Fig. 3a). For benchmarking, we tested SpatialGlue against Seurat, MultiVI, MOFA+, scMM and StabMap. We did not include MEFISTO and totalVI because we could not finish running MEFISTO within 12 h, and totalVI was designed only for CITE-seq. We first visualized the individual modalities (Fig. 3b), where we see that they captured various regions with differing accuracy. While both modalities captured the lateral ventricle (vl) and the lateral preoptic area (lpo), the RNA modality clearly captured the ccg but was unable to differentiate the ctx layers. Meanwhile, the ATAC modality was able to isolate the caudoputamen (cp) as well as some of the ctx layers. SpatialGlue captured all the aforementioned anatomical regions (2-acb, 4-cp/13-cp, 9-vl, 11-ccg/aco, 12-ls and 18-lpo) and produced better defined layers in the ctx and anterior cingulate area (aca) regions. Notably, SpatialGlue was able to differentiate more ctx layers than all other methods including the original analysis by Zhang et al.3. Seurat was able to capture the vl, acb, cp and ctx layers, making it the second-best method, while the other methods could only capture the ccg and a few of the other structures. In general, the outputs of competing methods presented more noise than SpatialGlue, which was quantitatively confirmed by the Moran’s I score (Fig. 3c). For the Jaccard similarity metric, SpatialGlue again ranked top (Fig. 3d). We next examined the cross-modality and intra-modality weights learned by SpatialGlue in the aggregation layers. These weights denote the contribution of individual modality’s features and spatial information toward the integrated output (Fig. 3e and Extended Data Fig. 4c). For the cross-modality weights, the RNA modality better segregated the ccg region and thus was assigned a heavier weight. On the other hand, for the ctx and vl, the ATAC modality showed more contribution and thus a heavier weight was assigned. It should be noted that the cross-modality weights are calculated based on the latent embeddings of each modality, rather than raw feature matrices. Therefore, observing some degree of discrepancies between the cross-modality weights and unimodal clusters is expected.

We extended the analysis to another P22 mouse brain dataset of a highly similar coronal section but with RNA-seq and CUT&Tag (acetylated histone H3 Lys27 (H3K27ac) histone modification) modalities. This dataset also does not have an annotated ground truth; therefore, we used the Allen brain atlas reference again for annotation. In this dataset, SpatialGlue captured the major structures of the ctx layers (clusters 1, 2, 5, 6, 11 and 12), 8-aca, 10-ccg/aco, cp (7,14), vl (9,16), 3-ls and 4-acb (Fig. 3f). By contrast, all other methods were unable to clearly capture many structures such as the acb and ls. The output of SpatialGlue also had the least noise, which was reflected in Moran’s I score (Fig. 3g). For the Jaccard similarity, SpatialGlue achieved the highest score, highlighting that SpatialGlue’s integrated output was able to best preserve the between-spot distance from the original individual data modalities (Fig. 3h). We also examined the modality weights for the contribution of the different modalities toward each cluster (Fig. 3i). For most clusters, the histone modification modality made a similar or greater contribution. For example, clusters ctx-1, ctx-5, ctx-6 and ctx-12 had heavier weights assigned to the histone modification modality. To ensure that the results were not contingent on dataset selection, we again tested on two other P22 mouse brain datasets with RNA-seq and CUT&Tag (H3K4me3 and H3K27me3 histone modification) modalities. SpatialGlue was again the top method in both Moran’s I score and Jaccard similarity for these datasets (Extended Data Figs. 5 and 6).

We further analyzed the differentially expressed genes (DEGs) of each cluster (Fig. 3j) and found known markers for the different brain regions such as myelin-related genes, Tspan2, Cldn11 and Ugt8a, expressed in the postnatal developing corpus callosum (10-ccg/aco), and Olfm1, Cux2 and Rorb in the cortex layers. We next examined the differentially expressed peaks in the H3K27ac histone modification modality (Fig. 3k), where we found strong peaks in the clusters 12-ctx, 10-ccg/aco, 4-acb and 7-cp. Finally, following the original study3, we plotted the peak-to-gene links heat map (Fig. 3l). Here, there are two major groups appearing in both data modalities, the first primarily consisting of acb/cp structures (4-acb, 7-cp and 14-cp), and the second comprising ctx-related clusters (6-ctx, 11-ctx and 12-ctx). This illustrated SpatialGlue’s success in combining information from both modalities into the latent space to enable biologically relevant clusters. We believe such information combination has also contributed to the detection of the four cortical layers (clusters 5, 12, 1 and 6). Within the cortex layers 5 and 6 (cluster 6), Tle4, Fezf2, Foxp2 and Ntsr1 have been reported in the literature as markers. However, we only found Tle4 and Fezf2 expression to spatially coincide with the cluster. Conversely,

Fig. 3 | SpatialGlue dissects spatial epigenome–transcriptome mouse brain samples at higher resolution. a, Annotated reference of the mouse brain coronal section from the Allen Mouse Brain Atlas. b, Spatial plots of the RNA-seq and ATAC-seq data with unimodal clustering (left) and clustering results (right) from single-cell and spatial multi-omics integration methods: Seurat, MultiVI, MOFA+, scMM, StabMap and SpatialGlue. The annotated labels correspond to SpatialGlue’s results and the clustering colors do not necessarily match the same structures for the other methods. ctx, cerebral cortex; cp, caudoputamen; vl, lateral ventricle; lpo, lateral preoptic area; aca, anterior cingulate area; ls, lateral septal nucleus; aco, anterior commissure (olfactory limb); acb, nucleus accumbens; cc, corpus callosum. c, Box plots of Moran’s I score of the six methods. In the box plot, the center line denotes the median, box limits denote the upper and lower quartiles, and whiskers denote 1.5 times the interquartile range. n = 18 clusters. d, Comparison of Jaccard similarity scores of the six methods. e, Modality weights of different modalities, denoting their importance to the integrated output of SpatialGlue. f, Spatial plots of the RNA-seq and CUT&Tag-seq (H3K27ac) data with unimodal clustering (left), and clustering results (right) from single-cell and spatial multi-omics integration methods: Seurat, MultiVI, MOFA+, scMM, StabMap and SpatialGlue. The annotated labels correspond to SpatialGlue’s results and the clustering colors do not necessarily match the same structures for the other methods. g, Box plots of Moran’s I score of the six methods. In the box plot, the center line denotes the median, box limits denote the upper and lower quartiles, and whiskers denote 1.5 times the interquartile range. n = 16 clusters. h, Comparison of Jaccard similarity scores of the six methods. i, Modality weights of different modalities, denoting their importance to the integrated output of SpatialGlue. j, Heat map of DEGs for each cluster. k, Heat map of differentially expressed peaks for each cluster. l, Heat map of peak-to-gene links.
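The Jaccard similarity scores in panels d and h compare each spot’s neighbor set in the joint latent space with its neighbor set in an individual modality. A minimal sketch of that idea follows. It assumes k-nearest-neighbor sets under Euclidean distance, averaged over spots; the function names, the choice of k and the toy coordinates are hypothetical and not taken from the paper’s implementation.

```python
import math


def knn_sets(points, k):
    """For each point, the set of indices of its k nearest other points."""
    result = []
    for i, p in enumerate(points):
        others = sorted((j for j in range(len(points)) if j != i),
                        key=lambda j: math.dist(p, points[j]))
        result.append(set(others[:k]))
    return result


def mean_jaccard(space_a, space_b, k):
    """Average Jaccard overlap of k-nearest-neighbor sets in two spaces.

    space_a, space_b: embeddings of the same spots, in the same order.
    Requires 1 <= k < number of spots so that neighbor sets are non-empty.
    """
    sets_a, sets_b = knn_sets(space_a, k), knn_sets(space_b, k)
    scores = [len(a & b) / len(a | b) for a, b in zip(sets_a, sets_b)]
    return sum(scores) / len(scores)


# Toy embeddings of six spots (hypothetical values).
modality = [(0, 0), (1, 0), (2, 0), (10, 0), (11, 0), (12, 0)]
# A joint space that only rescales the modality keeps every neighbor set.
joint_faithful = [(2 * x, 2 * y) for x, y in modality]
# A joint space that mixes the two groups changes the neighbor sets.
joint_mixed = [(0, 0), (10, 0), (1, 0), (11, 0), (2, 0), (12, 0)]

score_faithful = mean_jaccard(joint_faithful, modality, k=2)
score_mixed = mean_jaccard(joint_mixed, modality, k=2)
```

A higher score means the integrated representation preserves more of the local neighborhood structure of the original modality, which is the sense in which the text speaks of preserving between-spot distance.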


Ntsr1’s gene activity score inferred from the histone marks matched the cluster spatially (Extended Data Fig. 7). This illustrated the different information within each modality that SpatialGlue can leverage to better demarcate different spatial domains.

Accurately resolving mouse thymus structures and spleen macrophage subsets with SpatialGlue
Lastly, we demonstrated that SpatialGlue is broadly applicable to a wide spectrum of technology platforms by further applying it to

[Figure 3, panels a–l. a: reference annotation (aca L6-L1, ctx L6-L1, ccg, ls, cp, vl, lpo, acb, aco). b: spatial plots of RNA, ATAC and the six methods; SpatialGlue clusters 1-ctx, 2-acb, 3-ctx, 4-cp, 5-ctx/aca, 6-ctx, 7-ctx, 8-aca, 9-vl, 10-ctx, 11-ccg/aco, 12-ls, 13-cp, 14-ctx, 15-aca, 16-ctx, 17-ctx, 18-lpo. c, g: Moran’s I score per method. d, h: Jaccard similarity per method (RNA-joint with ATAC-joint or H3K27ac-joint). e, i: modality weight per cluster (RNA and ATAC; RNA and H3K27ac). f: spatial plots of RNA, H3K27ac and the six methods; SpatialGlue clusters 1-ctx, 2-ctx, 3-ls, 4-acb, 5-ctx, 6-ctx, 7-cp, 8-aca, 9-vl, 10-ccg/aco, 11-ctx, 12-ctx, 13-unconfirmed, 14-cp, 15-unconfirmed, 16-vl. j: heat map of gene expression by cluster. k: heat map of H3K27ac peak z-scores by cluster. l: peak-to-gene link heat maps (62012 P2GLinks) of H3K27ac and RNA z-scores.]


Stereo-CITE-seq and SPOTS-acquired data. The Stereo-CITE-seq⁶ was used to analyze a mouse thymus section, capturing mRNA and protein at subcellular resolution (Fig. 4a). The thymus is a small gland surrounded by a capsule of fibers and collagen. It is divided into two lobes connected by a connective isthmus with each lobe being broadly divided into a central medulla surrounded by an outer cortex layer. In each data modality, broad outlines of the medulla regions and the surrounding cortex could be seen (Fig. 4a). We tested eight methods: Seurat, totalVI, MultiVI, MOFA+, MEFISTO, scMM, StabMap and SpatialGlue. MultiVI and StabMap were unable to find coherent clusters that resembled the medulla and cortex regions within the thymus. This was clearly reflected in the Moran's I score and Jaccard similarity with these two methods scoring the lowest (Fig. 4b,c). For MultiVI, its poor performance on protein + RNA data (Figs. 2 and 4) was probably due to it being optimized for RNA + ATAC data. Seurat, totalVI, scMM and SpatialGlue were more successful in capturing the internal structures by separating the medulla from the cortex, with SpatialGlue and scMM better demarcating the corticomedullary junction and the inner, middle and outer cortex (clusters 2–5; Fig. 4a and Extended Data Fig. 8e). Overall, SpatialGlue scored the highest in Jaccard similarity and second in Moran's I score. This superior performance was further replicated with three additional mouse thymus sections (Supplementary Figs. 6–8). For most clusters, the RNA modality made greater contributions than the protein (Fig. 4d), except for the middle cortex (cluster 4), where the protein modality contributed more.

In our final example, we benchmarked SpatialGlue's capabilities with mouse spleen spatial profiling data consisting of protein and transcript measurements⁴. The spleen is an important organ within the lymphatic system with functions including B cell maturation in germinal centers formed within B cell follicles (Fig. 4e). These are complex structures with an array of immune cells present. The data were generated with SPOTS, which uses the 10x Genomics Visium technology to capture whole transcriptomes and extracellular proteins via polyadenylated antibody-derived tag (ADT)-conjugated antibodies. The protein detection panel used for this experiment was designed to detect the surface markers of B cells, T cells and macrophages, which are well represented in the spleen. After preprocessing, we performed clustering of each data modality and plotted the clusters on the tissue slide to examine their correspondence between modalities (Fig. 4f). The clusters clearly did not align, indicating that each modality possessed different information content (Extended Data Fig. 9f,g). Using the protein markers and DEGs, clusters of spots enriched with B cells, T cells and macrophage subsets were annotated²²⁻²⁴. Specifically, we identified macrophage subsets (RpMΦ, MZMΦ, MMMΦ) that were not annotated in the original study. We then tested Seurat, totalVI, MultiVI, MOFA+, MEFISTO, scMM, StabMap and SpatialGlue (Fig. 4f). MultiVI and StabMap again did not capture coherent clusters. This was also reflected in their Moran's I scores and Jaccard similarity (Fig. 4g,h). The remaining methods captured clusters with similar Moran's I scores but SpatialGlue scored the highest in terms of Jaccard similarity. We then examined SpatialGlue's learned modality weights (Fig. 4i). The protein modality made the bigger contribution to the MMMΦ cluster, which was mainly found in the protein modality plot. Conversely, SpatialGlue relied more on the RNA modality to capture the T cell cluster. To verify SpatialGlue's performance, we used another SPOTS-acquired dataset of a murine spleen section as a replicate. Here, SpatialGlue achieved comparable or greater Moran's I scores than baseline methods and scored the highest in terms of Jaccard similarity (Extended Data Fig. 10e,f).

To annotate the clusters found with SpatialGlue, we visualized the cell types' protein markers (Fig. 4j,k) and RNA expression of selected markers (Fig. 4l). Within the white pulp zone, the T cell spots were known to concentrate in small clusters known as T cell zones, while the B cell-enriched spots were mainly found in areas adjacent to the T cell clusters. The RpMΦ spots in the red pulp zone were easily identified by the strong expression of markers like F4_80 and CD163. To differentiate MZMΦ and MMMΦ, the RNA expression of Cd209a (MZMΦ) and Siglec1 (MMMΦ) and protein expression of CD169 (MMMΦ) were used to guide the annotation.

From the cluster and marker visualization, we observed cell types that were spatially adjoining. Thus, we quantified the spatial relationships by computing neighborhood enrichment (Extended Data Fig. 9d) and co-occurrence probabilities with respect to the T cell and B cell clusters (Extended Data Fig. 9e). In general, we observed neighborhood enrichments that matched known biology such as the high correlation between the B cells and T cells, indicating that they were most likely to be found together at the closest distance. This was followed by MMMΦ, which surrounded T cell and B cell clusters in the white zone. These correlations reflected the layers of cell types that form the follicles and their surroundings. Between the macrophages, we observed a positive correlation between RpMΦ and MZMΦ, and MZMΦ and MMMΦ. This is a result of the red pulp (RpMΦ) forming the spleen's outer layer and the MZMΦ being positioned within the marginal zone surrounding white pulp, which in turn was enriched with MMMΦ.

Discussion
SpatialGlue is a new deep learning model incorporating graph neural networks with a dual-attention mechanism that enables the integration of multi-omics data in a spatially aware manner. With the presented examples, we demonstrated SpatialGlue's ability to effectively integrate multiple data modalities with their respective spatial context to reveal histologically relevant structures of tissue samples. Furthermore, our quantitative benchmarking demonstrated that SpatialGlue exhibits superior performance to 10 state-of-the-art unimodal and non-spatial methods on 5 simulated datasets and 12 real datasets, highlighting the importance of spatial information and cross-omics

Fig. 4 | SpatialGlue accurately integrates multimodal data from the mouse thymus (RNA and protein acquired with Stereo-CITE-seq) and mouse spleen (RNA and protein acquired using SPOTS). a, Spatial plots of RNA and protein data (mouse thymus acquired with Stereo-CITE-seq) with unimodal clustering (left), and comparison of clustering results (right) from single-cell and spatial multi-omics integration methods: Seurat, totalVI, MultiVI, MOFA+, MEFISTO, scMM, StabMap and SpatialGlue. The annotated labels correspond to SpatialGlue's results and the clustering colors do not necessarily match to the same structures for the other methods. b, Box plots of Moran's I score of the eight methods. In the box plot, the center line denotes the median, box limits denote the upper and lower quartiles, and whiskers denote 1.5 times the interquartile range. n = 8 clusters. c, Comparison of Jaccard similarity scores of the eight methods. d, Modality weights of different modalities, denoting their importance to the integrated output of SpatialGlue. e, Histology image of the mouse spleen replicate 1 sample. f, Spatial plots of RNA and protein data (mouse spleen acquired using SPOTS) with unimodal clustering (left), and clustering results (right) from single-cell and spatial multi-omics integration methods: Seurat, totalVI, MultiVI, MOFA+, MEFISTO, scMM, StabMap and SpatialGlue. RpMΦ, MMMΦ and MZMΦ are red pulp macrophages, CD169+ MMM and CD209a+ MZM, respectively. The annotated labels correspond to SpatialGlue's results and the clustering colors do not necessarily match to the same structures for the other methods. g, Box plots of Moran's I score of the eight methods. In the box plot, the center line denotes the median, box limits denote the upper and lower quartiles, and whiskers denote 1.5 times the interquartile range. n = 5 clusters. h, Comparison of Jaccard similarity scores of the eight methods. i, Modality weights of different modalities, denoting their importance to the integrated output of SpatialGlue. j, Heat map of differentially expressed ADTs for each cluster. k, Normalized ADT levels of key surface markers for T cells (CD3, CD4, CD8), B cells (IgD, B220, CD19) and RpMΦ (F4_80, CD68, CD163). l, Violin plots of two marker genes in the MMMΦ, MZMΦ and RpMΦ clusters.
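Panels b and g summarize Moran's I per cluster. As a reminder of what that statistic measures, the following is a minimal sketch of global Moran's I on a toy grid. The choice of a binary cluster indicator as the variable and of binary rook-adjacency weights are assumptions made for illustration; this section does not state how the scores in the figure were computed in detail.

```python
import numpy as np

def morans_i(values, weights):
    """Global Moran's I for one variable given a spatial weight matrix.

    values  : length-N array (here a 0/1 cluster indicator, an assumption)
    weights : N x N non-negative spatial weights with a zero diagonal
    """
    x = np.asarray(values, dtype=float)
    w = np.asarray(weights, dtype=float)
    z = x - x.mean()                      # deviations from the mean
    num = (w * np.outer(z, z)).sum()      # spatially weighted cross-products
    den = (z ** 2).sum()                  # total squared deviation
    return x.size * num / (w.sum() * den)

# Toy 4 x 4 grid of spots; two spots are neighbors if they share an edge.
side = 4
grid = np.array([(i, j) for i in range(side) for j in range(side)])
manhattan = np.abs(grid[:, None, :] - grid[None, :, :]).sum(axis=-1)
weights = (manhattan == 1).astype(float)

# A cluster occupying one contiguous half of the tissue: positive autocorrelation.
two_domains = (grid[:, 0] < side // 2).astype(float)
# A checkerboard "cluster": neighbors always differ, so I is strongly negative.
checkerboard = (grid.sum(axis=1) % 2).astype(float)

score_domains = morans_i(two_domains, weights)
score_checker = morans_i(checkerboard, weights)
```

Spatially coherent clusters yield values closer to 1, while fragmented, salt-and-pepper clusters yield values near or below 0, which is why a higher score is read as a better-defined spatial domain.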


integration. We also demonstrated SpatialGlue's ability to resolve finer-grained tissue structures, which can facilitate new biological findings in future studies. For example, its application to a set of mouse brain epigenome–transcriptome data revealed finer cortical layers compared to the original study, which can allow further investigation of gene regulation at a higher spatial resolution. Our examples also

[Fig. 4 image content (panels a–l) is not recoverable from the extracted text. Recoverable labels: thymus cluster legend (panel a): 1, Medulla (SP T, mTEC, DC); 2, Corticomedullary junction (CMJ); 3, Inner cortex region 1 (DN T, DP T, cTEC); 4, Middle cortex region 2 (DN T, DP T, cTEC); 5, Outer cortex region 3 (DN T, DP T, cTEC); 6, Connective tissue capsule (fibroblast); 7, Subcapsular zone (DN T); 8, Connective tissue capsule (fibroblast, RBC, myeloid). Spleen cluster legend (panel f): MZMΦ, MMMΦ, RpMΦ, B cell, T cell. Axes: Moran's I score, Jaccard similarity (RNA-joint, Protein-joint) and modality weight (RNA, Protein). ADTs listed in the heat map (panel j): F4-80, CD163, CD68, CD29, CD105, Ly6G, CD20, EpCAM, Ly6C, CD11b, IgM, CD45R, CD19, CD169, CD38, IgD, CD3, CD4, CD8. Marker genes in the violin plots (panel l): Cd209a and Siglec1.]


spanned four different tissue types and four technology platforms to show its broad applicability. Despite having demonstrated its application only on sequencing-based spatial omics data, its design allows seamless extension to image-based omics data from technology platforms like 10x Genomics Xenium and Nanostring CosMx, exhibiting a technology-agnostic nature.

As a GNN-based method, SpatialGlue bears some similarity to other GNN methods such as GraphST and SpaGCN that were designed for spatial unimodal omics. Naturally, as a method tailored for spatial multi-omics, SpatialGlue is different as it is explicitly designed to take in multiple data modalities as input and use attention to integrate data, as opposed to concatenation at the data preprocessing step. Unlike other existing multimodal methods such as Seurat, totalVI, MultiVI, MOFA+, scMM and StabMap, our model is spatially informed and it employs attention mechanisms to adaptively learn the relative importance between omics modalities, and between spatial location and omics features within each modality. Currently, all input modalities to SpatialGlue are preprocessed and dimensionally reduced to the same number of dimensions. When one data modality has substantially fewer dimensions than the rest, the resulting input dimension restriction on all modalities may result in some information loss. At present, the protein modality is most likely to be the most restrictive owing to acquisition technology limitations such as in surface protein detection. For example, SPOTS can simultaneously capture up to 32 surface proteins. While a larger number of proteins and consequently dimensions is preferred, 32 is not overly restrictive in our opinion. We believe that the number of proteins captured will increase with technological advances and consequently eliminate this restriction. Moreover, we plan to upgrade our model to work with full feature sets and a modality-specific number of latent dimensions to reflect the respective data modality complexity.

We designed SpatialGlue to be computation resource efficient and thus ensure its relevance as data sizes increase. The largest dataset tested contains 9,752 spots (spatial epigenome–transcriptome mouse brain), and it required about 5 min of wall-clock time on a server equipped with an Intel Core i7-8665U CPU and NVIDIA RTX A6000 GPU. Our testing showed that it scales well with the number of features and cells/spots (Extended Data Fig. 1g,h). Therefore, we believe SpatialGlue will be an invaluable analysis tool for present and future spatial multi-omics data. There are also multiple avenues of possible extensions for SpatialGlue. One is to include images as a modality. Most technologies can produce accompanying imaging data such as H&E, which contains essential information of the cell and tissue morphology. Thus, we plan to extend SpatialGlue to incorporate image data at either the intra-modality or inter-modality attention aggregation layer. We also aim to extend SpatialGlue's functionality with integration of multi-omics data acquired from serial tissue sections. Such multi-section integration can involve sections with the same data modalities or even mosaic integration of different modalities captured by different sections.

Online content
Any methods, additional references, Nature Portfolio reporting summaries, source data, extended data, supplementary information, acknowledgements, peer review information; details of author contributions and competing interests; and statements of data and code availability are available at https://doi.org/10.1038/s41592-024-02316-4.

References
1. Liu, Y. et al. High-spatial-resolution multi-omics sequencing via deterministic barcoding in tissue. Cell 183, 1665–1681 (2020).
2. Liu, Y. et al. High-plex protein and whole transcriptome co-mapping at cellular resolution with spatial CITE-seq. Nat. Biotechnol. 41, 1405–1409 (2023).
3. Zhang, D. et al. Spatial epigenome–transcriptome co-profiling of mammalian tissues. Nature 616, 113–122 (2023).
4. Ben-Chetrit, N. et al. Integration of whole transcriptome spatial profiling with protein markers. Nat. Biotechnol. 41, 788–793 (2023).
5. Vickovic, S. et al. SM-Omics is an automated platform for high-throughput spatial multi-omics. Nat. Commun. 13, 795 (2022).
6. Liao, S. et al. Integrated spatial transcriptomic and proteomic analysis of fresh frozen tissue based on Stereo-seq. Preprint at bioRxiv https://doi.org/10.1101/2023.04.28.538364 (2023).
7. Hudson, W. H. & Sudmeier, L. J. Localization of T cell clonotypes using the Visium spatial transcriptomics platform. STAR Protoc. 3, 101391 (2022).
8. Janesick, A. et al. High resolution mapping of the tumor microenvironment using integrated single-cell, spatial and in situ analysis. Nat. Commun. 14, 8353 (2023).
9. Takei, Y. et al. Integrated spatial genomics reveals global architecture of single nuclei. Nature 590, 344–350 (2021).
10. Su, J.-H., Zheng, P., Kinrot, S. S., Bintu, B. & Zhuang, X. Genome-scale imaging of the 3D organization and transcriptional activity of chromatin. Cell 182, 1641–1659 (2020).
11. Liu, J. et al. Concordance of MERFISH spatial transcriptomics with bulk and single-cell RNA sequencing. Life Sci. Alliance 6, e202201701 (2023).
12. He, S. et al. High-plex imaging of RNA and proteins at subcellular resolution in fixed tissue by spatial molecular imaging. Nat. Biotechnol. 40, 1794–1806 (2022).
13. Hao, Y. et al. Integrated analysis of multimodal single-cell data. Cell 184, 3573–3587 (2021).
14. Argelaguet, R. et al. MOFA+: a statistical framework for comprehensive integration of multi-modal single-cell data. Genome Biol. https://doi.org/10.1186/s13059-020-02015-1 (2020).
15. Ghazanfar, S., Guibentif, C. & Marioni, J. C. Stabilized mosaic single-cell data integration using unshared features. Nat. Biotechnol. 42, 284–292 (2024).
16. Gayoso, A. et al. Joint probabilistic modeling of single-cell multi-omic data with totalVI. Nat. Methods 18, 272–282 (2021).
17. Ashuach, T. et al. MultiVI: deep generative model for the integration of multimodal data. Nat. Methods 20, 1222–1231 (2023).
18. Minoura, K., Abe, K., Nam, H., Nishikawa, H. & Shimamura, T. A mixture-of-experts deep generative model for integrated analysis of single-cell multiomics data. Cell Rep. Methods 1, 100071 (2021).
19. Dong, K. & Zhang, S. Deciphering spatial domains from spatially resolved transcriptomics with an adaptive graph attention auto-encoder. Nat. Commun. 13, 1739 (2022).
20. Hu, J. et al. SpaGCN: Integrating gene expression, spatial location and histology to identify spatial domains and spatially variable genes by graph convolutional network. Nat. Methods 18, 1342–1351 (2021).
21. Long, Y. et al. Spatially informed clustering, integration, and deconvolution of spatial transcriptomics with GraphST. Nat. Commun. 14, 1155 (2023).
22. Alexandre, Y. O. & Mueller, S. N. Splenic stromal niches in homeostasis and immunity. Nat. Rev. Immunol. 23, 705–719 (2023).
23. Borges da Silva, H. et al. Splenic macrophage subsets and their function during blood-borne infections. Front. Immunol. 6, 480 (2015).
24. Backer, R. et al. Effective collaboration between marginal metallophilic macrophages and CD8+ dendritic cells in the generation of cytotoxic T cells. Proc. Natl Acad. Sci. USA 107, 216–221 (2010).


Publisher's note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

© The Author(s) 2024

¹Institute of Molecular and Cell Biology (IMCB), Agency for Science, Technology and Research (A*STAR), Singapore, Singapore. ²Bioinformatics Institute (BII), Agency for Science, Technology and Research (A*STAR), Singapore, Singapore. ³BGI-Shenzhen, Shenzhen, China. ⁴BGI Research-Southwest, BGI, Chongqing, China. ⁵The Ken & Ruth Davee Department of Neurology, Northwestern University Feinberg School of Medicine, Chicago, IL, USA. ⁶Department of Biomedical Engineering, Yale University, New Haven, CT, USA. ⁷Singapore Immunology Network (SIgN), Agency for Science, Technology and Research (A*STAR), Singapore, Singapore. ⁸Immunology Translational Research Programme, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore. ⁹Department of Microbiology and Immunology, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore. ¹⁰BGI Research Asia-Pacific, BGI, Singapore, Singapore. ¹¹Shanghai Immune Therapy Institute, Shanghai Jiao Tong University School of Medicine Affiliated Renji Hospital, Shanghai, China. ¹²JFL-BGI STOmics Center, Jinfeng Laboratory, Chongqing, China. ¹³Cancer Translational Research Programme, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore. ¹⁴Center for Computational Biology and Program in Cancer and Stem Cell Biology, Duke-NUS Medical School, Singapore, Singapore. e-mail: [email protected]


Methods
Data
Human lymph node dataset. For spatial transcriptomics analysis of human tissues, two sequential sections of 5-µm thickness were utilized from formalin-fixed, paraffin-embedded (FFPE) lymph nodes. The sections underwent spatial transcriptomic library construction using the CytAssist Visium platform (10x Genomics). Initially, sections were stained with H&E following the protocol outlined in the Visium CytAssist guide for FFPE samples, which includes steps for deparaffinization, staining, imaging and decrosslinking (CG000658, 10x Genomics). Imaging was performed with a ×20 objective on an EVOS M7000 microscope (Thermo).

Following imaging, spatial gene expression libraries were prepared utilizing probe-based methods, along with spatial protein expression libraries per the guidelines provided in the Visium CytAssist Reagent Kits manual (CG000494, 10x Genomics). We used the Visium Human Transcriptome Probe Set version 2.0 for RNA transcript detection, along with the Human FFPE Immune Profiling Panel, which includes a 35-plex CytAssist Panel of antibodies, both intracellular and extracellular, sourced from BioLegend and Abcam for protein detection. This panel also comprises four isotype controls. Antibody signals were normalized to isotype controls.

Libraries were sequenced on an Illumina NovaSeq S2 PE50 platform, allocating 2,000 million reads per lane at the NUSeq Core Facility, Northwestern University. The resultant FASTQ files were processed using Space Ranger (v2.1.0) software, referencing the GRCh38 human genome (GENCODE v32/Ensembl 98). For precise anatomical context, we conducted manual annotation of the lymph node structures, utilizing the high-resolution images captured by the EVOS M7000 microscope within the Loupe Browser software (10x Genomics).

Spatial epigenome–transcriptome mouse brain dataset. Brain tissue sections from a juvenile (P22) mouse were analyzed for the epigenome and transcriptome using spatial ATAC–RNA-seq and CUT&Tag-RNA-seq by Zhang et al.³. Microfluidic barcoding was used to capture spatial location and combined with in situ Tn5 transposition chemistry to capture chromatin accessibility. We used four datasets, one spatial ATAC–RNA-seq dataset and three spatial CUT&Tag-RNA-seq datasets. The number of pixels ranged from 9,215 to 9,752, the number of genes ranged from 22,731 to 25,881 and the number of peaks ranged from 35,270 to 121,068.

To preprocess the transcriptomic data, pixels expressing fewer than 200 genes and genes expressed in fewer than 200 pixels were filtered out. Next, the gene expression counts were log-transformed and normalized by library size via the SCANPY package²⁵. The top 3,000 highly variable genes (HVGs) were selected and used as input to PCA for dimensionality reduction. For consistency with the chromatin peak data, the first 50 principal components were retained and used as input to the encoder. For the chromatin peak data, we used latent semantic indexing to reduce the raw chromatin peak counts data to 50 dimensions.

Stereo-CITE-seq mouse thymus dataset. Murine thymus tissue samples were investigated with Stereo-CITE-seq for spatial multi-omics by Liao et al.⁶. For our study, we used data from four sections. The number of bins ranged from 4,228 to 4,697, the number of genes ranged from 23,221 to 23,960 and the sample included 19 or 51 proteins. For the transcriptomic data, we first filtered out genes expressed in fewer than 10 bins and bins with fewer than 80 genes expressed. The filtered gene expression counts were next log-transformed and normalized by library size via the SCANPY package²⁵. Finally, to reduce the dimensionality of the data, the top 3,000 HVGs were selected and used as input for PCA. To ensure a consistent input dimension with the ADT data, the first 22 principal components were retained and used as the input of the encoder. For the ADT data, we first filtered out proteins expressed in fewer than 50 bins, resulting in 22 proteins being retained. The protein expression counts were then normalized using centered log ratio across each bin. PCA was then performed on the normalized data, and all 22 principal components were used as the input of the encoder.

SPOTS mouse spleen dataset. Ben-Chetrit et al.⁴ processed fresh frozen mouse spleen tissue samples and analyzed them using the 10x Genomics Visium system supplemented with DNA-barcoded antibody staining. The antibodies (ADTs) enabled protein measurement alongside the transcriptome profiling using 10x Genomics Visium. The panel of 21 ADTs was designed to capture the markers of immune cells found in the spleen, including B cells, T cells and macrophages. We used two datasets (replicates 1 and 2) from the original study. The data contained 2,568 and 2,768 spots for replicates 1 and 2, respectively, with 32,285 genes captured per spot. For data preprocessing, we first filtered out genes expressed in fewer than 10 spots. The filtered gene expression counts were then log-transformed and normalized by library size using the SCANPY package²⁵. Finally, the top 3,000 HVGs were selected and used as input for PCA. We used the first 21 principal components as the input of the encoder to ensure a consistent input dimension with the ADT data. For the ADT data, we applied centered log ratio normalization to the raw protein expression counts. PCA was then performed on the normalized data and the top 21 principal components were used as input to the encoder.

The SpatialGlue framework
SpatialGlue is a new graph-based model with a dual-attention mechanism that aims to learn a unified representation by fully exploiting the spatial location information and expression data from different omics modalities. Within each modality, SpatialGlue first learns a modality-specific representation using both spatial and omics data. Subsequently, it synthesizes an integrated cross-modality representation by aggregating these modality-specific representations. Compared to cross-omics integration first followed by spatial integration, our approach allows us to capture modality-specific spatial correlations between spots and integrate the spatial information in a modality-specific manner.

We first consider a spatial multi-omics dataset with two different omics modalities, each with a distinct feature set $X_1 \in \mathbb{R}^{N \times d_1}$ and $X_2 \in \mathbb{R}^{N \times d_2}$. $N$ denotes the number of spots in the tissue. $d_1$ and $d_2$ are the numbers of features for the two omics modalities, respectively. For example, in spatial epigenome–transcriptome data, $X_1$ and $X_2$ refer to the sets of genes and chromatin regions, respectively, while in Stereo-CITE-seq, $X_1$ and $X_2$ refer to the sets of genes and proteins, respectively. The primary objective of spatial multi-omics data integration is to learn a mapping function that can project the original individual modality data into a uniform latent space and then integrate the resulting representations. As shown in Fig. 1a, the SpatialGlue framework consists of four major modules: (1) modality-specific graph convolution network (GCN) encoder, (2) within-modality attention aggregation layer, (3) between-modality attention aggregation layer, and (4) modality-specific GCN decoder. The details of each module are described next. Notably, here we demonstrate the SpatialGlue framework with two modalities. Benefiting from the modular design, SpatialGlue readily extends to spatial multi-omics data with more than two modalities.

Construction of neighbor graph
Assuming spots that are spatially adjacent in a tissue usually have similar cell types or cell states, we convert the spatial information to an undirected neighbor graph $G_s = (V, E)$ with $V$ denoting the set of $N$ spots and $E$ denoting the set of connected edges between spots. Let $A_s \in \mathbb{R}^{N \times N}$ be the adjacency matrix of graph $G_s$, where $A_s(i, j) = 1$ if and only if spot $j$ is among the $r$ nearest neighbors of spot $i$ by Euclidean distance, and 0 otherwise. In our examples, we selected the $r = 3$ nearest spots as neighbors of a given spot for all datasets, based on experimental results.
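The spatial neighbor graph described above can be sketched in a few lines of NumPy. This is an illustrative sketch, not the authors' implementation: the function name `spatial_neighbor_graph` is hypothetical, the dense pairwise-distance computation is chosen for readability rather than scalability, and the final symmetrization step is an assumption made so that the resulting graph is undirected as the text requires.

```python
import numpy as np

def spatial_neighbor_graph(coords: np.ndarray, r: int = 3) -> np.ndarray:
    """Binary adjacency matrix linking each spot to its r nearest spots.

    coords : (N, 2) array of spot coordinates
    r      : number of nearest neighbors per spot (the text uses r = 3)
    """
    n = coords.shape[0]
    # Pairwise Euclidean distances between all spots (O(N^2) memory).
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)  # a spot is never its own neighbor

    # Column indices of the r closest spots for every row.
    nearest = np.argsort(dist, axis=1)[:, :r]
    adj = np.zeros((n, n), dtype=np.int8)
    rows = np.repeat(np.arange(n), r)
    adj[rows, nearest.ravel()] = 1

    # Assumption: keep an edge if either spot lists the other as a neighbor,
    # which makes the graph undirected.
    return np.maximum(adj, adj.T)

# Example on synthetic coordinates (placeholder data, not from the paper).
rng = np.random.default_rng(0)
coords = rng.uniform(0.0, 10.0, size=(50, 2))
adj_spatial = spatial_neighbor_graph(coords, r=3)
```

Because of the symmetrization, a spot can end up with more than `r` neighbors when it is a nearest neighbor of spots that are not among its own nearest `r`.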

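The ADT preprocessing described in the Data section (centered log ratio normalization within each bin or spot, followed by PCA) can be illustrated as follows. This is a hedged sketch on synthetic counts: the function names are hypothetical, the pseudocount of 1 inside the logarithm is an assumption made to handle zero counts (the section does not specify the exact CLR variant), and PCA is done with a plain SVD rather than any particular library routine.

```python
import numpy as np

def clr_normalize(counts: np.ndarray) -> np.ndarray:
    """Centered log ratio per row (one row per bin/spot, one column per protein).

    Assumption: log(1 + x) is used so that zero counts are defined.
    """
    log_counts = np.log1p(counts.astype(float))
    return log_counts - log_counts.mean(axis=1, keepdims=True)

def pca_scores(x: np.ndarray, n_components: int) -> np.ndarray:
    """Project rows of x onto the leading principal components via SVD."""
    centered = x - x.mean(axis=0, keepdims=True)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :n_components] * s[:n_components]

# Synthetic ADT counts: 100 spots x 22 proteins (placeholder data).
rng = np.random.default_rng(0)
adt_counts = rng.poisson(lam=5.0, size=(100, 22))

adt_clr = clr_normalize(adt_counts)
# Keeping as many components as there are proteins mirrors the text, where the
# RNA embedding is truncated to the same dimension as the ADT embedding.
adt_embedding = pca_scores(adt_clr, n_components=22)
```

The matching RNA input would be obtained by running the same `pca_scores` step on the log-normalized HVG matrix with the same `n_components`, so that both modalities enter the encoder with equal dimensionality.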

In a complex tissue sample, it is possible for spots with the same cell types/states to be spatially nonadjacent to each other, or even far away. To capture the proximity of such spots in a latent space, we explicitly model the relationship between them using a feature graph. Specifically, we apply the KNN algorithm on the PCA embeddings and construct the feature graph $G_f^m = (V_m, E_m)$, where $V_m$ and $E_m$ denote the sets of N spots and connected edges between spots in the m-th modality ($m \in \{1, 2\}$), respectively. For a given spot, we choose the top k-nearest spots as its neighbors. By default, we set k to 20 for all datasets. We use $A_f^m \in \mathbb{R}^{N \times N}$ to denote the adjacency matrix of the feature graph $G_f^m$. If spot $j \in V_m$ is a neighbor of spot $i \in V_m$, then $A_f^m(i, j) = 1$, otherwise 0.

Graph convolutional encoder for individual modality

Each modality (for example, mRNA or protein) contains a unique feature distribution. To encode each modality in a low-dimensional embedding space, we use the GCN (ref. 26), an unsupervised deep graph network, as the encoder of our framework. The main advantage of the GCN is that it can capture the cell expression patterns and neighborhood microenvironment while preserving the high-level global patterns. For each modality, using the preprocessed features as inputs, we separately implement a GCN encoder on the spatial adjacency graph $G_s$ and the feature graph $G_f$ to learn graph-specific representations H. These two neighbor graphs reflect distinct topological semantic relationships between spots. The semantic information in the spatial graph denotes the physical proximity between spots, while that in the feature graph denotes the phenotypic proximity of spots that have the same cell types/states but are spatially nonadjacent to each other. This enables the encoder to capture different local patterns and dependencies of each spot by iteratively aggregating the representations from its neighbors. Specifically, the l-th ($l \in \{1, 2, \ldots, L-1, L\}$) layer representations in the encoder are formulated according to equations (1–4):

$$H_{s1}^{l} = \sigma(\tilde{A}_s H_{s1}^{l-1} W_{e1}^{l-1} + b_{e1}^{l-1}) \tag{1}$$

$$H_{f1}^{l} = \sigma(\tilde{A}_f^1 H_{f1}^{l-1} W_{e1}^{l-1} + b_{e1}^{l-1}) \tag{2}$$

$$H_{s2}^{l} = \sigma(\tilde{A}_s H_{s2}^{l-1} W_{e2}^{l-1} + b_{e2}^{l-1}) \tag{3}$$

$$H_{f2}^{l} = \sigma(\tilde{A}_f^2 H_{f2}^{l-1} W_{e2}^{l-1} + b_{e2}^{l-1}) \tag{4}$$

where $\tilde{A} = D^{-\frac{1}{2}} A D^{-\frac{1}{2}}$ represents the normalized adjacency matrix of a specific graph and D is a diagonal matrix with diagonal elements $D_{ii} = \sum_{j=1}^{N} A_{ij}$. In particular, $\tilde{A}_s$, $\tilde{A}_f^1$ and $\tilde{A}_f^2$ are the normalized adjacency matrices of the spatial graph and of the feature graphs of modalities 1 and 2, respectively. $W_{e\bullet}$ and $b_{e\bullet}$ denote a trainable weight matrix and a bias vector, respectively. $\sigma(\bullet)$ is a nonlinear activation function such as the rectified linear unit. $H_{\bullet}^{l}$ denotes the l-th layer output representation, and $H_{s1}^{0} = H_{f1}^{0}$ and $H_{s2}^{0} = H_{f2}^{0}$ are set as the input PCA embeddings of the original features $X_1$ and $X_2$, respectively. We also specify $H_{\bullet}^{L} \in \mathbb{R}^{d_3}$, the output at the L-th layer, as the final latent representation of the encoder with $d_3$ as the hidden dimension. $H_s^m$ and $H_f^m$ represent the latent representations derived from the spatial and feature graphs within modality m, respectively.

Within-modality attention aggregation layer

For an individual modality, taking its preprocessed features and two graphs (that is, spatial and feature graphs) as input, we can derive two graph-specific spot representations via the graph convolutional encoder, such as $H_{s1}$ and $H_{f1}$. To integrate the graph-specific representations, we design a within-modality attention aggregation layer following the encoder such that its output representation preserves expression similarity and spatial proximity. Given that different neighbor graphs can provide unique semantic information for each spot (as mentioned above), the aggregation layer is designed to integrate graph-specific representations in an adaptive manner by capturing the importance of each graph. As a result, we derive a modality-specific representation for each modality. Specifically, for a given spot i, we first subject its graph-specific representation $h_i^t$ to a linear transformation (that is, a fully connected neural network), and then evaluate the importance of each graph by the similarity of the transformed representation and a trainable weight vector q. Formally, the attention coefficient $e_i^t$, representing the importance of graph t to the spot i, is calculated by equation (5):

$$e_i^t = q^{T} \cdot \tanh(W_i^{intra} h_i^t + b_i^{intra}) \tag{5}$$

where $W_i^{intra}$ and $b_i^{intra}$ are the trainable weight matrix and bias vector, respectively. To reduce the number of parameters in the model, all the trainable parameters are shared by the different graph-specific representations within each modality. To make the attention coefficient comparable across different graphs, a softmax function is applied to the attention coefficient to derive the attention score $\alpha_i^t$, according to equation (6):

$$\alpha_i^t = \frac{\exp(e_i^t)}{\sum_{t=1}^{T} \exp(e_i^t)} \tag{6}$$

where T denotes the number of neighbor graphs (set to 2 here). $\alpha_i^t$ represents the semantic contribution of the t-th neighbor graph to the representation of spot i. A higher value of $\alpha_i^t$ means a greater contribution.

Subsequently, the final representation $Y_m$ in the m-th modality can be generated by aggregating the graph-specific representations weighted by their attention scores, as in equation (7):

$$y_i^m = \sum_{t=1}^{T} \alpha_i^t \cdot h_i^t \tag{7}$$

such that $y_i^m \in \mathbb{R}^{d_3}$ preserves the raw spot expressions, spot expression similarity and spatial proximity within modality m.

Between-modality attention aggregation layer

Each individual omics modality provides a partial view of a complex tissue sample, thus requiring an integrated analysis to obtain a comprehensive picture. These views can contain both complementary and contradictory elements, and thus different importance should be assigned to each modality to achieve coherent data integration. Here we use a between-modality attention aggregation layer to adaptively integrate the different data modalities. This attention aggregation layer will focus on the more important omics modality by assigning greater weight values to the corresponding representation. Like the within-modality layer, we first learn the importance of modality m by calculating the following coefficient $g_i^m$ according to equation (8):

$$g_i^m = v^{T} \cdot \tanh(W_i^{inter} y_i^m + b_i^{inter}) \tag{8}$$

where $g_i^m$ is the attention coefficient that represents the importance of the modality m to the representation of spot i. $W^{inter}$, $b^{inter}$ and v are learnable weight and bias variables, respectively. Similarly, we further normalize the attention coefficients using the softmax function according to equation (9):

$$\beta_i^m = \frac{\exp(g_i^m)}{\sum_{m=1}^{M} \exp(g_i^m)} \tag{9}$$

where $\beta_i^m$ is the normalized attention score that represents the contribution of the modality m to the representation of spot i. M is the number of modalities.
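The encoder propagation and attention aggregation steps described above can be illustrated with a small forward pass. This is a sketch assuming plain NumPy, random untrained weights and toy graphs; the helper names (`normalize_adjacency`, `gcn_layer`, `attention_aggregate`) and all array shapes are hypothetical choices for the example and do not reproduce the SpatialGlue code base.

```python
import numpy as np

def normalize_adjacency(adj):
    """Symmetric normalization D^{-1/2} A D^{-1/2}; isolated nodes keep zero rows."""
    adj = np.asarray(adj, dtype=float)
    deg = adj.sum(axis=1)
    inv_sqrt = np.zeros_like(deg)
    mask = deg > 0
    inv_sqrt[mask] = deg[mask] ** -0.5
    return adj * inv_sqrt[:, None] * inv_sqrt[None, :]

def gcn_layer(adj_norm, h, weight, bias):
    """One propagation step: ReLU(A_norm @ H @ W + b)."""
    return np.maximum(adj_norm @ h @ weight + bias, 0.0)

def attention_aggregate(reps, weight, bias, query):
    """Softmax-weighted sum over the first axis of reps.

    reps has shape (T, N, d): T graph-specific (or modality-specific)
    representations of N spots. Returns the fused (N, d) representation
    and the (T, N) attention scores.
    """
    scores = np.tanh(reps @ weight + bias) @ query      # coefficients, shape (T, N)
    scores = scores - scores.max(axis=0, keepdims=True)  # numerical stability
    alpha = np.exp(scores)
    alpha /= alpha.sum(axis=0, keepdims=True)            # softmax over the T inputs
    fused = (alpha[:, :, None] * reps).sum(axis=0)
    return fused, alpha

rng = np.random.default_rng(0)
n_spots, d_in, d_hid = 6, 4, 3

# Toy graphs: a ring as the "spatial" graph, a fully connected "feature" graph.
ring = np.roll(np.eye(n_spots), 1, axis=1)
A_s = ring + ring.T
A_f = np.ones((n_spots, n_spots)) - np.eye(n_spots)

X = rng.normal(size=(n_spots, d_in))        # stand-in for PCA embeddings
W_e = rng.normal(size=(d_in, d_hid))        # encoder weights shared by both graphs
b_e = np.zeros(d_hid)

H_s = gcn_layer(normalize_adjacency(A_s), X, W_e, b_e)
H_f = gcn_layer(normalize_adjacency(A_f), X, W_e, b_e)

W_att = rng.normal(size=(d_hid, d_hid))
b_att = np.zeros(d_hid)
q = rng.normal(size=d_hid)
Y, alpha = attention_aggregate(np.stack([H_s, H_f]), W_att, b_att, q)
```

Because the between-modality layer has the same functional form, the same `attention_aggregate` helper could be called a second time on the stacked modality-specific representations, with its own weight, bias and query vector.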

Finally, we derive the final representation matrix Z by aggregating each modality-specific representation according to the attention score β, as given by equation (10):

$$z_i = \sum_{m=1}^{M} \beta_i^m \cdot y_i^m \tag{10}$$

After model training, the latent representation $z_i \in \mathbb{R}^{d_3}$ can be used in various downstream analyses, including clustering, visualization and DEG detection.

Model training of SpatialGlue

The resulting model is trained jointly with two different loss functions, that is, reconstruction loss and correspondence loss. Each loss function is described as follows.

Reconstruction loss. To enforce the learned latent representation to preserve the expression profiles from different modalities, we design an individual decoder for each modality to reverse the integrated representation Z back into the normalized expression space. Specifically, by taking output Z from the between-modality attention aggregation layer as input, the reconstructed representations $\hat{H}_1^l$ and $\hat{H}_2^l$ from the decoder at the l-th ($l \in \{1, 2, \ldots, L-1, L\}$) layer are formulated according to equations (11) and (12):

$$\hat{H}_1^{l} = \sigma(\tilde{A}_s Z_1^{l-1} W_{d1}^{l-1} + b_{d1}^{l-1}) \tag{11}$$

$$\hat{H}_2^{l} = \sigma(\tilde{A}_s Z_1^{l-1} W_{d2}^{l-1} + b_{d2}^{l-1}) \tag{12}$$

where $W_{d1}$, $W_{d2}$, $b_{d1}$ and $b_{d2}$ are trainable weight matrices and bias vectors, respectively. $\hat{H}_1^l$ and $\hat{H}_2^l$ represent the reconstructed expression matrices for the omics modalities 1 and 2, respectively.

SpatialGlue's objective function to minimize the expression reconstruction loss is given by equation (13):

$$\mathcal{L}_{recon} = \gamma_1 \sum_{i=1}^{N} \| x_i^1 - \hat{h}_i^1 \|_F^2 + \gamma_2 \sum_{i=1}^{N} \| x_i^2 - \hat{h}_i^2 \|_F^2 \tag{13}$$

where $x^1$ and $x^2$ represent the original features of the modalities 1 and 2, respectively. $\gamma_1$ and $\gamma_2$ are weight factors that are utilized to balance the contribution of different modalities. Owing to the differences in sequencing technologies and molecular types, the feature distributions of different omics assays can vary substantially. As such, the weight factors also vary between different spatial multi-omics technologies but are fixed for datasets obtained using the same omics technology.

Correspondence loss. While reconstruction loss can enforce the learned latent representation to simultaneously capture the expression information of different modality data, it does not guarantee that the representation manifolds are fully aligned across modalities. To deal with this issue, we add a correspondence loss function. Correspondence loss aims to force consistency between a modality-specific representation Y and its corresponding representation $\hat{Y}$ obtained through the decoder–encoder of another modality. Mathematically, the correspondence loss is defined in equations (14–16):

$$\mathcal{L}_{corr} = \gamma_3 \sum_{i=1}^{N} \| y_i^1 - \hat{y}_i^1 \|_F^2 + \gamma_4 \sum_{i=1}^{N} \| y_i^2 - \hat{y}_i^2 \|_F^2 \tag{14}$$

$$\hat{Y}_1^{l} = \sigma(\tilde{A}_s (\sigma(\tilde{A}_s Y_1^{l-1} W_{d2}^{l-1} + b_{d2}^{l-1})) W_{e2}^{l-1} + b_{e2}^{l-1}) \tag{15}$$

$$\hat{Y}_2^{l} = \sigma(\tilde{A}_s (\sigma(\tilde{A}_s Y_2^{l-1} W_{d1}^{l-1} + b_{d1}^{l-1})) W_{e1}^{l-1} + b_{e1}^{l-1}) \tag{16}$$

where $\gamma_3$ and $\gamma_4$ are hyper-parameters controlling the influences of different modality data. We set $\hat{Y}_1^0 = Y_1$ and $\hat{Y}_2^0 = Y_2$. $\sigma(\bullet)$ is a nonlinear activation function, that is, the rectified linear unit.

Therefore, the overall loss function used for model training is defined according to equation (17):

$$\mathcal{L}_{total} = \mathcal{L}_{recon} + \mathcal{L}_{corr} \tag{17}$$

Implementation details of SpatialGlue

For all datasets, a learning rate of 0.0001 was used. To account for differences in feature distribution across the datasets, a tailored group of weight factors [γ1, γ2, γ3, γ4] was empirically assigned to each one. The weight factors were [1, 5] for the SPOTS mouse spleen dataset, [1, 5, 10] for the 10x Genomics Visium human lymph node dataset, [1, 10] for the Stereo-CITE-seq mouse thymus dataset, [1, 5] for the spatial epigenome–transcriptome mouse brain dataset. We also provided a default parameter set that would work for most users on most data types. The training epochs used for the SPOTS mouse spleen, 10x Genomics Visium human lymph node, Stereo-CITE-seq mouse thymus and spatial epigenome–transcriptome mouse brain datasets were 600, 200, 1,500 and 1,600, respectively.

Data and detailed methods

Details on the datasets, downstream analyses, competing methods and metrics used are available in the Supplementary Information.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Data availability

The SPOTS mouse spleen data were obtained from the Gene Expression Omnibus (GEO) repository (accession no. GSE198353; ref. 4), the Stereo-CITE-seq mouse thymus data from BGI and the spatial epigenome–transcriptome mouse brain data from AtlasXplore (https://web.atlasxomics.com/visualization/Fan/; ref. 3). GRCh38.p13 human genome data were obtained from the GENCODE repository (accession no. GENCODE v32/Ensembl 98, https://www.gencodegenes.org/human/release_32.html). The 10x Visium human lymph node data were obtained from the GEO (accession no. GSE263617). The details of all datasets used are available in the Methods. The data used as input to the methods tested in this study, inclusive of the Stereo-CITE-seq and the in-house human lymph node data, have been uploaded to Zenodo and are freely available at https://doi.org/10.5281/zenodo.7879713 (ref. 27).

Code availability

An open-source Python implementation of the SpatialGlue toolkit is accessible at https://github.com/JinmiaoChenLab/SpatialGlue/. A separate Python implementation of the SpatialGlue_3M toolkit designed for triple omics integration is accessible at https://github.com/JinmiaoChenLab/SpatialGlue_3M/. The Jupyter notebooks for reproducing the results in this paper are available at https://github.com/JinmiaoChenLab/SpatialGlue_notebook/.

References

25. Wolf, F. A., Angerer, P. & Theis, F. J. SCANPY: large-scale single-cell gene expression data analysis. Genome Biol. 19, 15 (2018).
26. Kipf, T. N. & Welling, M. Semi-supervised classification with graph convolutional networks. In 5th International Conference on Learning Representations (ICLR, 2017).
27. longyahui. JinmiaoChenLab/SpatialGlue: SpatialGlue. Zenodo https://doi.org/10.5281/zenodo.7879713 (2023).

Acknowledgements

We thank Y. Tan for assistance in interpreting mouse thymus data, M. Wu for comments on the model and T. Watson for assistance in submitting in-house data to the Gene Expression Omnibus (GEO) database. The research was supported by: A*STAR under its BMRC Central Research Fund (CRF, UIBR) Award; AI, Analytics and Informatics (AI3) Horizontal Technology Programme Office (HTPO) seed grant (Spatial transcriptomics ST in conjunction with graph neural networks for cell–cell interaction C211118015) from A*STAR, Singapore; Open Fund Individual Research Grant (mapping hematopoietic lineages of healthy patients and high-risk patients with acute myeloid leukemia with FLT3-ITD mutations using single-cell omics no. OFIRG18nov-0103) from the Ministry of Health, Singapore; National Research Foundation (NRF), award no. NRF-CRP26-2021-0001; the National Research Foundation, Singapore, and Singapore Ministry of Health’s National Medical Research Council under its Open Fund-Large Collaborative Grant (‘OF-LCG’) (MOH-OFLCG18May-0003); and Singapore National Medical Research Council (NMRC/OFLCG/003/2018). L.G.N. was supported by the National Natural Science Foundation of China (grant 32270956) and Shanghai Science and Technology Commission (grant 20JC1410100).

Author contributions

J.C. conceptualized and supervised the project. Y.L. designed the model. Y.L. developed the SpatialGlue software. Y.L., K.S.A., and J.C. wrote the manuscript. Y.L., J.C., R.F., D.Z., R.S., S.C.Y., C.Z., H.X. and K.S.A. performed the data analysis. C.Z. and R.S. ran the Seurat WNN algorithm. Y.L. prepared the figures. J.C., N.R.J.G., L.G.N. and N.H. annotated and interpreted the mouse thymus dataset. L.V.O. and I.K. annotated the human lymph node datasets. D.G. and L.V.O. generated the human lymph node dataset. S.L., Y.H., M.J., A.C. and X.X. generated the mouse thymus dataset.

Competing interests

The authors declare no competing interests.

Additional information

Extended data is available for this paper at https://doi.org/10.1038/s41592-024-02316-4.

Supplementary information The online version contains supplementary material available at https://doi.org/10.1038/s41592-024-02316-4.

Correspondence and requests for materials should be addressed to Jinmiao Chen.

Peer review information Nature Methods thanks the anonymous reviewer(s) for their contribution to the peer review of this work. Primary Handling Editor: Rita Strack, in collaboration with the Nature Methods team. Peer reviewer reports are available.

Reprints and permissions information is available at www.nature.com/reprints.

Extended Data Fig. 1 | Analysis of parameter sensitivity and time complexity with simulated and real data. (a) Comparison of clustering results with different numbers of neighbors k to illustrate SpatialGlue’s sensitivity to parameters. (b) Supervised metrics on the clustering results. (c) Comparison of clustering results with different number of PCs to illustrate SpatialGlue’s sensitivity to input dimensionality. (d) Supervised metrics on the clustering results. (e) Comparison of clustering results with different numbers of GNN layers to illustrate SpatialGlue’s sensitivity to input dimensionality. (f) Supervised metrics on the clustering results. (g) Time complexity of SpatialGlue and competing methods (Seurat, totalVI, MultiVI, MOFA+, MEFISTO, scMM, StabMap). In the experiment, we used murine spleen dataset with 4,697 spots for RNA & Protein data, and mouse brain RNA ATAC P22 with 9,215 cells for RNA & ATAC data. (h) Time complexity of SpatialGlue with various principal components as inputs on mouse brain RNA ATAC P22.

Extended Data Fig. 2 | Evaluation of SpatialGlue on simulated triple omics data. (a) Ground truth and spatial plots of modalities 1, 2, 3, and SpatialGlue. (b) Density distribution of simulated data modalities. (c) SpatialGlue’s between-modality weights explaining the importance of each modality to each cluster. (d) Within-modality weights for the importance of spatial and feature graphs.

Extended Data Fig. 3 | Results for lymph node samples A1 (a-c) and D1 (d-k). (a) SpatialGlue’s between-modality weight explaining the importance of each modality to each cluster for the lymph node A1 sample. (b) Within-modality weights for the RNA and protein modalities explaining the contributions of the spatial and feature graphs to each cluster. (c) Quantitative evaluation of SpatialGlue and competing methods. (d) Ground truth for the lymph node D1 sample. (e) Spatial plots of RNA and protein data (left), and clustering results of single-cell and spatial multi-omics methods, Seurat, totalVI, MultiVI, MOFA+, MEFISTO, scMM, StabMap, and SpatialGlue. (f) Comparison of Moran’s I score. In the box plot, the center line denotes the median, box limits denote the upper and lower quartiles, and whiskers denote the 1.5× interquartile range. n = 6 clusters. (g) Comparison of Jaccard Similarity scores. (h) Quantitative evaluation with six supervised metrics. (i) Box plots of six supervised metrics for clustering results with number of clusters ranging from 4 to 11. In the box plot, the center line denotes the median, box limits denote the upper and lower quartiles, and whiskers denote the 1.5× interquartile range. n = 8 clustering results with different resolutions for each method. (j) Between-modality weights explaining the importance of each modality to each cluster. (k) Within-modality weights for the RNA and protein modalities explaining the contributions of the spatial and feature graphs to each cluster.

Extended Data Fig. 4 | Additional results of the mouse brain P22 samples (spatial-ATAC-RNA-seq and spatial-CUT&Tag-RNA-seq, H3K27ac). (a) Separate spatial plots of all clusters identified by SpatialGlue in the mouse brain P22 sample (spatial-ATAC-RNA-seq). (b) Separate spatial plots of all clusters identified by SpatialGlue in the mouse brain P22 sample (spatial-CUT&Tag-RNA-seq, H3K27ac). (c) Within-modality weights for the RNA and ATAC modalities explaining the importance of the spatial and feature graphs to each cluster. (d) Within-modality weights for the RNA and CUT&Tag (H3K27ac) modalities explaining the importance of the spatial and feature graphs to each cluster. (e) Modality weights of Seurat when applied to the spatial-ATAC-RNA-seq P22 sample. (f) Modality weights of Seurat when applied to the spatial-CUT&Tag-RNA-seq (H3K27ac) sample.

Extended Data Fig. 5 | Results of the mouse brain P22 sample acquired with RNA-seq and CUT&Tag-seq (H3K4me3). (a) Spatial plots of data modalities with unimodal clustering (left), and clustering results (right) from single-cell and spatial multi-omics integration methods, Seurat, MultiVI, MOFA+, scMM, StabMap, and SpatialGlue. (b) Comparison of Moran’s I score. In the box plot, the center line denotes the median, box limits denote the upper and lower quartiles, and whiskers denote the 1.5× interquartile range. n = 18 clusters. (c) Comparison of Jaccard Similarity scores. (d) Between-modality weights explaining the importance of each modality to each cluster. (e) Within-modality weights explaining the contributions of the spatial and feature graphs to each cluster for each modality. (f) Comparison of spatial clustering using Seurat with 10 and 50 PC dimensions in the mouse brain P22 spatial-ATAC-RNA-seq sample. (g) Comparison of SpatialGlue and its variants, that is, SpatialGlue without reconstruction loss (‘SpatialGlue w/o recon’) and SpatialGlue without correspondence loss (‘SpatialGlue w/o corr’), in the mouse brain P22 spatial-ATAC-RNA-seq sample. (h) Comparison of Moran’s I score of SpatialGlue and its two variants. In the box plot, the center line denotes the median, box limits denote the upper and lower quartiles, and whiskers denote the 1.5× interquartile range. n = 18 clusters.

Extended Data Fig. 6 | Results of the mouse brain P22 sample acquired with RNA-seq and CUT&Tag-seq (H3K27me3). (a) Spatial plots of data modalities with unimodal clustering (left), and clustering results (right) from single-cell and spatial multi-omics integration methods, Seurat, MultiVI, MOFA+, scMM, StabMap, and SpatialGlue. (b) Comparison of Moran’s I score. In the box plot, the center line denotes the median, box limits denote the upper and lower quartiles, and whiskers denote the 1.5× interquartile range. n = 18 clusters. (c) Comparison of Jaccard Similarity scores. (d) Between-modality weights explaining the importance of each modality to each cluster. (e) Within-modality weights explaining the contributions of the spatial and feature graphs to each cluster for each modality.

Extended Data Fig. 7 | Additional results for the mouse brain P22 sample (spatial-CUT&Tag-RNA-seq, H3K27ac). (a) Intensity plots of marker genes.
(b) Normalized gene activity scores from Zhang et al. (c) Peak-to-gene links plots.

Extended Data Fig. 8 | Additional results for the mouse thymus 1 sample. (a) dsDNA image. (b) Total mRNA counts. (c) Modality weight from Seurat when applied to the sample. (d) Within-modality weights of SpatialGlue explaining the contributions of the spatial and feature graphs to each cluster for each modality. (e) Separate spatial plots of all clusters identified by SpatialGlue. (f) Expression of marker genes and proteins for each cell type.

Extended Data Fig. 9 | Results for the mouse spleen replicate 1 sample. (a) Spatial plots of SpatialGlue’s clusters together (left) and separate (right). (b) UMAP plots of the RNA and protein data modalities (left), and spatial plot of SpatialGlue’s clusters (right). (c) Comparison of fraction of nearest neighbors metric for each annotated cluster calculated by the different modalities (original RNA and protein expression). (d) Neighborhood enrichment of cell type pairs. (e) Cluster co-occurrence scores for each cluster at increasing distances. (f) Spatial plots of the RNA and protein data modalities. (g) Cross tabulation heatmap of the clustering labels between the RNA and protein data.

Extended Data Fig. 10 | Results for the mouse spleen replicate 1 (a-c) and 2 (d-h) samples. (a) Cross tabulation heatmap for the number of clusters between the RNA and protein data. (b) Modality weights from Seurat. (c) Within-modality weights of SpatialGlue explaining the contributions of the spatial and feature graphs to each cluster for each modality. (d) Spatial plots of data modalities with unimodal clustering (left), and clustering results (right) from single-cell and spatial multi-omics integration methods, Seurat, totalVI, MultiVI, MOFA+, MEFISTO, scMM, StabMap, and SpatialGlue. (e) Comparison of Moran’s I score. In the box plot, the center line denotes the median, box limits denote the upper and lower quartiles, and whiskers denote the 1.5× interquartile range. n = 5 clusters. (f) Comparison of Jaccard Similarity scores. (g) Between-modality weight explaining the importance of each modality to each cluster. (h) Within-modality weights explaining the contributions of the spatial and feature graphs to each cluster for each modality.
