Fig. 4 | Genome Biology

Fig. 4

From: pipeComp, a general framework for the evaluation of computational pipelines, reveals performant single cell RNA-seq preprocessing tools

Evaluation of normalization procedures. a Variance in PC1 explained by library size (left) and detection rate (right) after accounting for subpopulation differences. b Average silhouette width per true subpopulation, where a higher silhouette width means higher separability. c Clustering accuracy (Seurat clustering), measured by mutual information (MI), Adjusted Rand Index (ARI), and ARI at the true number of clusters. Gray squares indicate that the true number of clusters could not be reached with the corresponding method, dataset, and parameter set. “feat_regress,” “mt_regress,” and “feat_mt_regress” respectively stand for regressing out the number of features, the proportion of mitochondrial reads, or both during scaling. Throughout this paper, silhouette plots (b) map colors to square-root-transformed values. Other evaluation metric heatmaps (e.g., c here) map colors to the signed square root of the number of (matrix-wise) median absolute deviations from the (column-wise) median. Using this mapping, differences in color have the same meaning across datasets, and the color scale is not primarily capturing outliers or baseline differences between datasets. The numbers printed in the cells represent the raw (i.e., unscaled) metric values and are printed in white or black depending on whether they are above or below the median.
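
The heatmap color mapping described in the caption can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function name `signed_sqrt_mad_scale` is hypothetical, and it assumes that rows are methods, columns are datasets, that the "matrix-wise" median absolute deviation is the median of all absolute deviations from the column medians, and that no consistency constant is applied to it.

```python
import numpy as np


def signed_sqrt_mad_scale(metrics):
    """Map raw metric values to color-scale values (hypothetical helper).

    Assumed layout: rows are methods, columns are datasets.
    """
    x = np.asarray(metrics, dtype=float)

    # Center each column (dataset) on its own median.
    col_median = np.nanmedian(x, axis=0, keepdims=True)
    deviations = x - col_median

    # Assumption: one MAD computed over the whole matrix of deviations,
    # without the 1.4826 consistency constant.
    mad = np.nanmedian(np.abs(deviations))
    if mad == 0:
        # Degenerate case: no spread to scale by.
        return np.zeros_like(x)

    # Number of MADs from the column median, then signed square root.
    n_mads = deviations / mad
    return np.sign(n_mads) * np.sqrt(np.abs(n_mads))


if __name__ == "__main__":
    # Made-up values for illustration only, not taken from the figure.
    raw = np.array([[0.5, 0.9],
                    [0.7, 0.8],
                    [0.9, 0.1]])
    print(signed_sqrt_mad_scale(raw))
```

Because every column is centered on its own median but scaled by a single shared MAD, a given color step corresponds to the same deviation in every dataset, and the square root compresses extreme values so that outliers do not dominate the scale.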