Fig. 3 | Genome Biology

Fig. 3

From: MultiK: an automated tool to determine optimal cluster numbers in single-cell RNA sequencing data

MultiK identified K = 8 and K = 18 as optimal in the FVB3-mammary dataset. a Bar plot of the frequency of runs for each K across a total of 4000 subsampling runs. The number above each bar is the frequency of runs. The dotted line marks the frequency threshold of 100. b Relative PAC (rPAC) score for each K. The x-axis represents K; the y-axis represents the rPAC score. c Scatterplot of (1 - rPAC) against the frequency of runs for each K. Each dot represents a value of K from a and b. The thin dotted line connects the K values in order. The thick solid line connects the candidates for optimal K. d, e Left: 2D UMAP plot of the Louvain clustering from Seurat (version 3.1.2) on the full data matrix at each suggested candidate K. Each dot represents a single cell and each color represents a cluster at the given K. Right: pairwise SigClust results mapped onto the cluster-mean dendrogram at each suggested candidate K. Closed circles on the nodes indicate significant SigClust p values (< 0.05), suggesting a cluster is a class, whereas open circles indicate non-significant SigClust p values (≥ 0.05), suggesting a cluster is a subclass. Abbreviations: Fibro = fibroblast, Epi = epithelial, Endo = endothelial, SM = smooth muscle, Lum Pro = luminal progenitor, Mature Lum = mature luminal
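
The selection logic described for panels a to c, and the class/subclass rule for panels d and e, can be illustrated with a minimal Python sketch. It is not the MultiK implementation. It assumes the threshold is inclusive (frequency ≥ 100) and approximates the "candidates" line in panel c by keeping every K that no other K beats on both frequency and (1 - rPAC), which is a simple non-dominated filter swapped in for illustration. The function names and all numbers are hypothetical and are not read from the figure.

```python
# Minimal sketch, NOT the MultiK implementation.
# Assumptions: inclusive frequency threshold; "candidate" = non-dominated point
# in the (frequency, 1 - rPAC) plane. All data below are made-up toy values.

def candidate_ks(freq, one_minus_rpac, min_freq=100):
    """Return K values that pass the frequency threshold and are not
    dominated by another K on both frequency and (1 - rPAC)."""
    kept = [k for k in freq if freq[k] >= min_freq]
    candidates = []
    for k in kept:
        dominated = any(
            freq[j] >= freq[k]
            and one_minus_rpac[j] >= one_minus_rpac[k]
            and (freq[j] > freq[k] or one_minus_rpac[j] > one_minus_rpac[k])
            for j in kept
            if j != k
        )
        if not dominated:
            candidates.append(k)
    return sorted(candidates)


def label_node(p_value, alpha=0.05):
    """Apply the caption's rule: significant SigClust p value -> class,
    otherwise subclass."""
    return "class" if p_value < alpha else "subclass"


# Hypothetical toy inputs keyed by K (not taken from the figure).
toy_freq = {2: 50, 3: 900, 4: 400, 5: 600, 6: 120}
toy_one_minus_rpac = {2: 0.99, 3: 0.80, 4: 0.70, 5: 0.95, 6: 0.90}

toy_candidates = candidate_ks(toy_freq, toy_one_minus_rpac)
print(toy_candidates)
print(label_node(0.01), label_node(0.20))
```

In the toy data, K = 2 is dropped by the frequency threshold even though it has the highest (1 - rPAC), which mirrors the role of the dotted threshold line in panel a.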