Fig. 1 | Genome Biology

Fig. 1

From: Comparison and evaluation of statistical error models for scRNA-seq

Shallow sequencing masks overdispersion in scRNA-seq data. A Proportion of genes that fail a goodness-of-fit test for a Poisson GLM (see the “Methods” section), as a function of gene abundance, for 59 scRNA-seq datasets. For visual clarity, both the color and diameter of each dot correspond to the fraction of genes that exhibit overdispersion. Y-axis represents non-cumulative gene abundance bins between two consecutive labels (for example, >1 refers to all genes with average abundance >1 UMI and ≤5 UMI). Values are listed in Additional file 1: Table S2. B Relationship between average gene abundance and quantile residual variance, after applying a Poisson GLM (see the “Methods” section). Results are shown for datasets profiling endogenous RNA (“technical controls”), a HEK293 cell line (“biological controls”), and human PBMC (“heterogeneous”). C In datasets profiling cell lines, the fraction of genes that exhibit overdispersion is correlated with average sequencing depth. D Distribution of molecular counts for highly expressed genes in the PBMC Smart-seq3 dataset after downsampling to two different sequencing depths. The expected density assuming a Poisson distribution is shown in red. E Same as (B) but after downsampling the PBMC Smart-seq3 dataset to five different sequencing depths.
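The effect described in panels D and E can be illustrated with a small simulation. The sketch below is not the authors' procedure: their goodness-of-fit test and quantile residuals are defined in the “Methods” section, which is not reproduced here. As a simpler stand-in, it uses the variance-to-mean ratio, which is close to 1 for Poisson counts. The simulated gene (gamma-Poisson counts with mean 20), the helper names, and the chosen depths are all assumptions made for illustration.

```python
import math
import random
from statistics import mean, variance


def poisson_sample(lam, rng):
    """Draw one Poisson(lam) value with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


def downsample_counts(counts, fraction, rng):
    """Binomial thinning: keep each molecule independently with probability `fraction`."""
    return [sum(1 for _ in range(c) if rng.random() < fraction) for c in counts]


def dispersion_index(counts):
    """Variance-to-mean ratio; approximately 1 when counts are Poisson."""
    m = mean(counts)
    return variance(counts) / m if m > 0 else float("nan")


rng = random.Random(0)

# Hypothetical overdispersed gene across 2000 cells: each cell's Poisson rate
# is drawn from a gamma distribution (shape 2, scale 10, so mean 20).
full_depth = [poisson_sample(rng.gammavariate(2.0, 10.0), rng) for _ in range(2000)]

# Illustrative sequencing depths, expressed as the fraction of molecules kept.
ratios = {}
for fraction in (1.0, 0.1, 0.01):
    thinned = downsample_counts(full_depth, fraction, rng)
    ratios[fraction] = dispersion_index(thinned)
    print(f"fraction kept = {fraction:<5} variance/mean = {ratios[fraction]:.2f}")
```

For a negative binomial gene with mean μ and inverse dispersion θ, the variance-to-mean ratio is 1 + μ/θ. Keeping a fraction p of the molecules reduces it to 1 + pμ/θ, so the counts look increasingly Poisson as depth decreases even though the underlying variation is unchanged.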
