Fig. 1 | Genome Biology

Fig. 1

From: SCA: recovering single-cell heterogeneity through information-based dimensionality reduction

a Illustration of SCA’s key conceptual advance. The vertical axis separates a small cellular population (top) from a larger one (bottom). The two horizontal axes have higher variance but cannot separate the two populations. The leading principal components align with the higher-variance horizontal axes and fail to separate the populations. The leading surprisal component aligns with the more informative vertical axis, allowing downstream separation.

b Construction of surprisal scores from gene expression data. For each cell, we compare the gene’s expression in a local neighborhood of the cell to the gene’s global expression using the Wilcoxon rank-sum test. The resulting p-values are negative log-transformed to give the “surprisal” of the observed over- or under-expression, and given a positive sign for over-expression and a negative sign for under-expression.

c Surprisal scores of the ITGAL gene over a set of 3000 PBMCs profiled via Smart-seq 3 [14]. Scores are positive where the gene is locally enriched, near zero where it represents noise, and negative where it is conspicuously absent.

d Construction of surprisal components. Surprisal scores undergo singular value decomposition (SVD) over all genes to yield D loading vectors that capture informative axes in the data. We then linearly project the input transcript count matrix to these axes, producing a D-dimensional representation of each cell for downstream analysis.
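
The two constructions in panels b and d can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation. The function names (`average_ranks`, `signed_surprisal`, `surprisal_scores`, `surprisal_components`) are hypothetical. The local neighborhood is taken to be the k nearest cells by Euclidean distance, and the "global" reference is taken to be all cells outside that neighborhood. The rank-sum p-value uses a normal approximation without tie correction, and the log base (10) is also a choice made here, since the caption specifies none of these details.

```python
import math
import numpy as np


def average_ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    values = np.asarray(values, dtype=float)
    order = np.argsort(values, kind="mergesort")
    _, first, counts = np.unique(values[order], return_index=True, return_counts=True)
    tied_rank = first + (counts + 1) / 2.0  # mean of ranks first+1 .. first+counts
    ranks = np.empty(values.shape[0])
    ranks[order] = np.repeat(tied_rank, counts)
    return ranks


def signed_surprisal(local, reference):
    """Signed -log10 p-value of a two-sided rank-sum test (assumed normal approximation)."""
    local = np.asarray(local, dtype=float)
    reference = np.asarray(reference, dtype=float)
    n1, n2 = len(local), len(reference)
    ranks = average_ranks(np.concatenate([local, reference]))
    w = ranks[:n1].sum()                      # rank sum of the local sample
    mean_w = n1 * (n1 + n2 + 1) / 2.0
    var_w = n1 * n2 * (n1 + n2 + 1) / 12.0    # no tie correction (assumption)
    z = (w - mean_w) / math.sqrt(var_w)
    p = max(math.erfc(abs(z) / math.sqrt(2.0)), 1e-300)  # two-sided, floored to avoid log(0)
    # Positive for local over-expression, negative for under-expression.
    return math.copysign(-math.log10(p), z)


def surprisal_scores(X, k):
    """Cells x genes matrix of signed surprisal scores from a cells x genes matrix X."""
    X = np.asarray(X, dtype=float)
    n_cells, n_genes = X.shape
    sq = (X ** 2).sum(axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)   # squared Euclidean distances
    neighbors = np.argsort(d2, axis=1)[:, :k]          # assumed neighborhood: k nearest cells
    S = np.zeros((n_cells, n_genes))
    for i in range(n_cells):
        in_hood = np.zeros(n_cells, dtype=bool)
        in_hood[neighbors[i]] = True
        for g in range(n_genes):
            S[i, g] = signed_surprisal(X[in_hood, g], X[~in_hood, g])
    return S


def surprisal_components(X, S, D):
    """Project X onto the top-D right singular vectors (gene loadings) of S."""
    _, _, vt = np.linalg.svd(S, full_matrices=False)
    loadings = vt[:D].T                                # genes x D
    return np.asarray(X, dtype=float) @ loadings       # cells x D
```

For a cells-by-genes count matrix `X`, calling `surprisal_components(X, surprisal_scores(X, k), D)` yields one D-dimensional row per cell. The double loop is written for readability rather than speed and would need to be vectorized for datasets of realistic size.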
