Fig. 4 | Genome Biology

Fig. 4

From: Analytic Pearson residuals for normalization of single-cell RNA-seq UMI data

t-SNE embeddings of the organogenesis dataset. All panels show t-SNE embeddings of the organogenesis dataset [39] (2,058,652 cells), colored by the 38 main clusters identified by the original authors. All panels use 2,000 genes with the largest Pearson residual variance. Each panel shows a total of 2,026,641 cells, excluding 32,011 putative doublets identified in the original paper. All t-SNE embeddings were done with exaggeration 4 [36, 40]. a Depth normalization, median scaling, log-transformation and PCA with 50 principal components. b Same as in a, but with an additional standardization step that scales the normalized and log-transformed expression of each gene to mean zero and unit variance, as in the original paper [39]. c GLM-PCA with 50 dimensions (NB model with shared overdispersion as a free parameter, estimated to be \(\hat{\theta}=0.56\)). d Analytic Pearson residuals with \(\theta=100\) and PCA with 50 principal components. The scattered small islands do not belong to single clusters but instead are spuriously enriched in single embryos. e Same as in d, but after removing batch effect genes (“Methods”). Text labels correspond to the developmental trajectories identified in the original paper [39] (uppercase: multi-cluster trajectories, lowercase: single-cluster trajectories)
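
The preprocessing behind panel d can be illustrated with a minimal NumPy sketch. It is not the authors' code: the function names (`pearson_residuals`, `top_variance_genes`, `pca`) and the toy count matrix are hypothetical, and the clipping of residuals to ±√n (n = number of cells) is an assumption taken from the accompanying article rather than from this caption. The sketch also assumes that every cell and every gene has at least one count, and it stops before the t-SNE step.

```python
import numpy as np

def pearson_residuals(counts, theta=100.0):
    """Analytic Pearson residuals for a cells x genes UMI count matrix.

    Assumes every cell and every gene has a nonzero total count.
    """
    counts = np.asarray(counts, dtype=float)
    cell_depth = counts.sum(axis=1, keepdims=True)   # total counts per cell
    gene_total = counts.sum(axis=0, keepdims=True)   # total counts per gene
    mu = cell_depth * gene_total / counts.sum()      # expected counts under the null model
    residuals = (counts - mu) / np.sqrt(mu + mu**2 / theta)  # NB variance: mu + mu^2 / theta
    clip = np.sqrt(counts.shape[0])                  # assumed clipping threshold sqrt(n_cells)
    return np.clip(residuals, -clip, clip)

def top_variance_genes(residuals, n_genes=2000):
    """Indices of the genes with the largest residual variance."""
    variances = residuals.var(axis=0)
    n_genes = min(n_genes, residuals.shape[1])
    return np.sort(np.argsort(variances)[-n_genes:])

def pca(data, n_components=50):
    """PCA scores via SVD of the column-centered matrix."""
    centered = data - data.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    k = min(n_components, s.size)
    return u[:, :k] * s[:k]

# Toy data only: 500 cells x 200 genes of Poisson counts with gene-specific means.
rng = np.random.default_rng(0)
toy_counts = rng.poisson(lam=rng.gamma(2.0, 1.0, size=200), size=(500, 200))
toy_counts = toy_counts[:, toy_counts.sum(axis=0) > 0]  # drop all-zero genes

residuals = pearson_residuals(toy_counts, theta=100.0)
selected = top_variance_genes(residuals, n_genes=2000)
embedding_input = pca(residuals[:, selected], n_components=50)
```

Here `embedding_input` plays the role of the 50 principal components that would be passed to a t-SNE implementation. On a dataset of the size shown in the figure, a dense matrix like this would not fit in memory, so a real pipeline would need sparse or chunked computation.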
