Fig. 5 | Genome Biology

Fig. 5

From: Observation weights unlock bulk RNA-seq tools for zero inflation and single-cell applications

False positive control on mock null Usoskin datasets (n = 622 cells). a The scatterplot and GLM fits (R glm function with family=binomial), color-coded by batch (i.e., picking sessions Cold, RT-1, and RT-2), illustrate the association of zero abundance with sequencing depth. The three batches differ in their sequencing depths, causing an attenuated global relationship when pooling cells across batches (blue curve). Adjusting for the batch effect in the ZINB-WaVE model allows us to properly account for the relationship between sequencing depth and zero abundance. b Histogram of ZINB-WaVE weights for zero counts for the original Usoskin dataset, with (white) and without (green) batch included as a covariate in the ZINB-WaVE model. The higher mode near zero with batch adjustment indicates that more zero counts are classified as dropouts, suggesting a more informative discrimination between excess and negative binomial zeros. c Box plot of the per-comparison error rate (PCER) for 30 mock null datasets for each of seven differential expression methods. ZINB-WaVE-weighted methods are highlighted in blue. d Histogram of unadjusted p-values for one of the datasets in (c). ZINB-WaVE was fitted with the intercept, cell-type covariate (actual or mock), and batch covariate (unless specified otherwise) in X, V = 1_J, K = 0 for W, common dispersion, and ε = 10^12. GLM, generalized linear model; PCER, per-comparison error rate.
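
As a rough illustration of the quantity plotted in panel c, the sketch below computes a per-comparison error rate as the fraction of genes declared significant in a mock null dataset, where every rejection is a false positive by construction. The function name pcer, the nominal level of 0.05, the "p <= alpha" rejection rule, and the simulated uniform p-values are assumptions made for this example; they are not taken from the article, and no differential expression method is actually run here.

```python
import random


def pcer(p_values, alpha=0.05):
    """Fraction of tests rejected at level alpha.

    On a mock null dataset no gene is truly differentially expressed,
    so every rejection is a false positive and this fraction is the
    per-comparison error rate. alpha = 0.05 is an assumed nominal level.
    """
    if not p_values:
        raise ValueError("p_values must not be empty")
    return sum(p <= alpha for p in p_values) / len(p_values)


# Hypothetical stand-in for 30 mock null datasets: a well-calibrated
# method yields uniform p-values under the null, so we simulate those
# directly instead of running a real differential expression analysis.
rng = random.Random(0)
n_datasets, n_genes = 30, 2000  # n_genes is an arbitrary choice
mock_pcer = [
    pcer([rng.random() for _ in range(n_genes)])
    for _ in range(n_datasets)
]
mean_pcer = sum(mock_pcer) / len(mock_pcer)
print(f"mean PCER over {n_datasets} mock datasets: {mean_pcer:.3f}")
```

For a method that controls false positives, the values in mock_pcer should scatter around the nominal level; a box plot of such values per method is the kind of summary shown in panel c.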
