Skip to main content
Fig. 4 | Genome Biology

Fig. 4

From: CellSIUS provides sensitive and specific detection of rare cell populations from complex single-cell RNA-seq data

Fig. 4

CellSIUS benchmarking on cell line data. a Schematic overview of dataset perturbations. Starting from a dataset containing three cell types (abundant cell type 1, abundant cell type 2, and rare cell type), we first generated a defined number of rare cells by subsampling. In addition, we partitioned the type 2 cells in two, leaving out 25 cells from the dataset for later use. Next, we adjusted the subtlety of the transcriptional difference between the rare cells and their closest neighbor (cell type 2) by swapping a fraction of gene expression values in the type 2 cells with the corresponding value in the left-out rare cells. We then pre-defined an initial cluster assignment as cluster 1 = type 1, cluster 2 = the union of type 2 and rare cells and assessed whether different algorithms for detecting rare cell types are able to correctly classify the rare cells as such. b, c Comparison of CellSIUS to GiniClust2 and RaceID3 for varying incidence of the rare cell type and varying subtlety of the transcriptional signature here, we used 100 HEK293 cells as type 1, 100 Ramos cells as type 2, and up to 10 Jurkat cells as the rare cell type and we swapped between 0 and 99.5% of gene expression values. For each algorithm, we assessed the recall (b), i.e., the fraction of correctly identified rare cells, and precision (c), i.e., the probability that a cell which is classified as rare is actually a rare cell. d tSNE projection of subset 2 of the cell line dataset, colored by CellSIUS assignment. Cluster numbers correspond to the main clusters identified by MCL, clusters labeled x.sub indicate the CellSIUS subgroups. Symbols correspond to the cell line annotation. e Violin plot showing the main markers identified by CellSIUS, grouped by cluster

Back to article page