Fig. 2 | Genome Biology

Fig. 2

From: MultiK: an automated tool to determine optimal cluster numbers in single-cell RNA sequencing data

MultiK identified K = 3 and K = 7 as optimal in the 3 cell line mixture dataset. a Bar plot of the frequency of runs for each K across a total of 4000 subsampling runs. The number above each vertical bar is the frequency of runs for that K. The dotted line marks the frequency threshold of 100. b Plot of the relative PAC (rPAC) score for each K. The X-axis represents K and the Y-axis represents the rPAC score. c Scatterplot of (1 - rPAC) against the frequency of K. Each dot represents a level of K from a and b. The thin dotted line connects the K levels to show their ordering. The thick solid line connects the candidates for optimal K. d, e Left: 2-D UMAP plot of the Louvain clustering from Seurat (version 3.1.2) on the full data matrix at the suggested candidate K level. Each dot represents a single cell and each color represents a cluster at the given K level. Right: pairwise SigClust results mapped onto the cluster mean dendrogram at the suggested candidate K level. Closed circles on the nodes indicate significant SigClust p values (< 0.05), suggesting that a cluster is a class, whereas open circles indicate non-significant SigClust p values (≥ 0.05), suggesting that a cluster is a subclass. Fibro, fibroblast
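
The two numeric cutoffs in the caption (the frequency threshold of 100 in panel a and the SigClust p value cutoff of 0.05 in panels d and e) can be illustrated with a minimal Python sketch. This is not the MultiK implementation: the function names are hypothetical, the rule that a K is retained when its frequency strictly exceeds the threshold is an assumption (the caption only says the dotted line marks the threshold), and the example counts are made up rather than read from the figure.

```python
FREQ_THRESHOLD = 100  # dotted line in panel a (from the caption)
ALPHA = 0.05          # SigClust significance cutoff (from the caption)


def ks_above_threshold(run_counts, threshold=FREQ_THRESHOLD):
    """Return the K values whose run frequency exceeds the threshold.

    Assumption: "exceeds" means strictly greater than the threshold.
    """
    return sorted(k for k, n in run_counts.items() if n > threshold)


def label_node(p_value, alpha=ALPHA):
    """Map a SigClust p value to the caption's node annotation.

    p < alpha  -> closed circle, cluster treated as a "class"
    p >= alpha -> open circle, cluster treated as a "subclass"
    """
    return "class" if p_value < alpha else "subclass"


if __name__ == "__main__":
    # Hypothetical run counts per K, NOT values taken from Fig. 2a.
    example_counts = {2: 40, 3: 900, 4: 150, 7: 600}
    print(ks_above_threshold(example_counts))
    print(label_node(0.01), label_node(0.20))
```

The sketch covers only the thresholding steps; choosing the final candidates along the thick solid line in panel c additionally depends on the (1 - rPAC) score, which is not modeled here.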
