Cell type selection of spatial expression data by enrichment analysis
We use an enrichment based weighted least squares approach for deconvolution of spatial expression datasets. First, enrichment analysis using Parametric Analysis of Gene Set Enrichment (PAGE) method [22] is applied on spatial expression dataset as previously reported [17]. The marker genes can be identified via differential expression gene analysis of Giotto based on the scRNAseq data provided by users. Alternatively, users can also provide marker gene expression for each cell type for deconvolution. The number of cell-type specific marker genes is denoted by m. For each gene, we calculate the fold change as the ratio between its expression value at each spot and the mean expression of all spots. The mean fold change of the m marker genes is calculated and denoted as Sm. As background control, the mean and standard deviation of the fold change values across all genes are denoted as μ and δ, respectively. The enrichment score (ES) is defined as follows:
$$ \mathrm{ES}=\frac{\left({S}_m-\mu \right)\ast \sqrt{m}}{\delta } $$
Then, we binarize the enrichment matrix with the cutoff value of ES = 2 to select cell types that are likely to be present at each point.
Estimating cell type composition by using a weighted least squares approach
In previous work, we developed DWLS [9] for deconvolution of scRNAseq data. This method is extended here to deconvolve spatial transcriptomic data using the signature gene identification step described above. In short, DWLS uses a weighted least squares approach to infer cell-type composition, where the weight is selected to minimize the overall relative error rate. In addition, a damping constant d is used to enhance numerical stability, whose value is determined by using a cross-validation procedure. Here, we use the same sets of weights and damping constant across spots within same clusters to reduce technical variation. Finally, since the number of cells present at each spot is generally small, we perform another round deconvolution after removing those cell types that are predicted to present at a low frequency by imposing an additional thresholding (min frequency = 0.02 by default).
Coarse-grained spatial transcriptomic data for model performance evaluation
The somatosensory cortex seqFISH+ data were abstained from https://github.com/CaiGroup/seqFISH-PLUS. To simulate spot-like data, we defined the square with 500 pixels time 500 pixels (~ 51.5 μm) as one spot-like region. Then, average expression level was calculated for each spot-like region. Due to the small sample size, we only considered the 6 major clusters: excitatory neurons (eNeuron), inhibitory neurons (iNeuron), astrocytes, oligodendrocytes (Olig), microglia cells, and endothelial-mural cells (endo_mural).
Benchmark comparison among different methods
Coarse-grained seqFISH+ dataset was used for benchmarking the accuracy of different deconvolution methods, including spatialDWLS, MuSiC, RCTD, SPOTlight, and stereoscope. For each published method, the default parameter setting was used for comparison. If the users are required to set parameters manually, we used the values suggested in the vignettes of the corresponding software. Cell-type annotations for the original, single-cell resolution data were used as the ground-truth. All five methods used the same scRNA-seq dataset as a reference in deconvolution.
For spatialDWLS, we clustered the spot-like regions by using Leiden clustering as implemented in Giotto (Version 1.0.3) by using the following commands createNearestNetwork(dimensions_to_use = 1:10, k = 4) and doLeidenCluster(resolution = 0.4, n_iterations = 1000).
Then, marker genes of major clusters were identified by using the findMarkers_one_vs_all function with parameter setting: method = ‘gini’, expression_values = ‘normalized’. Top 100 ranked genes for each cell type were selected as marker genes. Average marker gene expression was calculated based on the cell type annotation of scRNA-seq. Then, deconvolution was applied by using the runDWLSDeconv function.
MuSiC [8] (version 0.1.1) was used for deconvolution by using whole single cell RNA-seq matrix. ExpressionSet classes were generated for both single cell RNA-seq (SC.eset) and spatial expression datasets (ST.eset). Then, cell type proportion was estimated by using music_prop(bulk.eset = ST.eset, sc.eset = SC.eset).
Then, to perform deconvolution by using SPOTlight [15] (version 0.1.0), signature genes were identified based on the major cell type annotation by using Seurat::FindAllMarkers(logfc.threshold = 1, min.pct = 0.9).
Deconvolution was performed by using spotlight_deconvolution(se_sc = SC, counts_spatial = ST, cluster_markers = cluster_markers_all, clust_vr = “label”).
Next, we used stereoscope [16] (version 0.2.0) for the deconvolution of simulated dataset. Deconvolution was performed with the parameter: stereoscope run -scc SC.tsv -scl cell_labels.tsv -stc ST.tsv -sce 5000.
Finally, we used RCTD [14] (version 1.1.0) to evaluate the cell type composition for simulated seqFISH+ dataset. Signature genes were identified by using “dgeToSeurat,” and then “create.RCTD” and “run.RCTD” were used to decompose the cell type composition. Finally, cell type percentage for each spot was calculated using the “sweep” function.
The computational efficiency of different methods was benchmarked by using the Visium brain dataset. All analyses were done on the same computer, which had Intel Xeon CPU E5-2650 2.00GHz and 380Gb memory. Of note, the Visium data cannot be used to evaluate accuracy because the ground-truth is not known.
Root mean square error (RMSE) calculation
Based on the cell type annotation of seqFISH+ dataset, we calculated the true cell type percentage for simulated spatial expression datasets. For a specific cell type, we divided spot-like regions into two groups based on the presence or absence of this cell type. RMSEs were calculated separately for these two groups.
Analysis of a spatial transcriptomic dataset from the mouse brain
The Visium dataset was obtained from the 10X Genomics website (https://support.10xgenomics.com/spatial-gene-expression/datasets/1.1.0/V1_Adult_Mouse_Brain), which corresponds to a coronal section of the mouse brain. Then, Giotto was used for data analysis as (http://www.spatialgiotto.com/giotto.visium.brain.html). Only spots within tissue were kept for further analysis. Then, we filtered out low quality spots and genes by using filterGiotto with the parameter: expression_threshold = 1, gene_det_in_min_cells = 50, min_det_genes_per_cell = 1000.
After normalization and highly variable gene calculation, we performed neighborhood analysis with parameter: createNearestNetwork (dimensions_to_use = 1:10, k = 15) and clustered spots with the parameter: doLeidenCluster(resolution = 0.4, n_iterations = 1000). Finally, we used marker genes and scRNA-seq reported in Zeisel et al. [18] to deconvolute the Visium dataset.
Analysis of a spatial transcriptomic dataset from developing human heart
The human heart spatial transcriptomics datasets were obtained from [20]. Then, we filtered out low quality spots and genes by using filterGiotto with the parameter: expression_threshold = 1, gene_det_in_min_cells = 10, min_det_genes_per_cell = 200.
After normalization and highly variable gene detection, we performed neighborhood analysis with theparameter: createNearestNetwork(dimensions_to_use = 1:10, k = 10) and clustered spots with the parameter: doLeidenCluster(resolution = 0.4, n_iterations = 1000).
In addition, we use the scRNA-seq data from the same website with spatial transcriptomics datasets. Based on the clusters reported, we re-analyzed signature genes by using Giotto with the parameter: findMarkers_one_vs_all(method = 'scran').
The average expression of marker genes was used for the deconvolution of heart ST datasets.
Assortativity analysis
To evaluate the degree of spatial coherence, we extended the assortativity analysis [21], a method commonly used in the network analysis to evaluate the tendency of similar networks nodes are connected to each other. Here, we generated a spatial network by connecting spots that are immediately next to each other. The assortativity coefficient represents the normalized deviation of edges connecting the same cell type than expected by chance. More precisely, it is defined by the following formula:
$$ Q=\frac{\sum_k{q}_{kk}-{\sum}_k{a}_k^2}{1-{\sum}_k{a}_k^2} $$
where
$$ {q}_{kk}=\frac{\sum_i{\sum}_j{w}_k^i{w}_k^j{e}_{ij}}{\sum_i{\sum}_j{e}_{ij}} $$
and
$$ {a}_k=\frac{1}{N}{\sum}_i{w}_k^i $$
In the above, \( {w}_k^i \) represents the fraction of cell-type k at the ith spot, N represents the total number of spots, and eij is defined as
$$ {e}_{ij}=\left\{\begin{array}{c}1,\mathrm{if}\ i\ \mathrm{and}\ j\ \mathrm{are}\ \mathrm{neighboring}\ \mathrm{spots}\\ {}0,\mathrm{otherwise}\end{array}\right. $$
If the values of \( {w}_k^i \) are binary, then the above definition reduces to the original formula in [21].