Fig. 2

Heatmaps of gene correlation matrices estimated from real data and from synthetic data generated by scDesign2, its variant without copula, ZINB-WaVE, and SPARSim. a, b Pearson correlation matrices. c, d Kendall’s tau matrices. In a and c, training and test data contain goblet cells measured by 10x Genomics [83]. In b and d, training and test data contain cells of dendrocytes subtype 1 (DC1) measured by Smart-Seq2 [4]. For each cell type, the Pearson correlation matrices and Kendall’s tau matrices are shown for the 100 genes with the highest mean expression values in the test data; the rows and columns (i.e., genes) of all the matrices are ordered by the complete-linkage hierarchical clustering of genes in the test data (using Pearson correlation as the similarity in a and b and Kendall’s tau in c and d). We find that the correlation matrices estimated from the synthetic data generated by scDesign2 most closely resemble those of the training and test data.
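
The procedure in this caption (select the most highly expressed genes in the test data, estimate the correlation matrices, and order the genes by complete-linkage clustering) can be sketched as follows. This is a minimal illustration, not the authors' code. The function names, the genes-by-cells NumPy array layout, and the use of 1 minus the correlation as the clustering distance are all assumptions made for the example; the caption states only that the correlation serves as the similarity.

```python
import numpy as np
from scipy.cluster.hierarchy import leaves_list, linkage
from scipy.spatial.distance import squareform
from scipy.stats import kendalltau


def top_gene_indices(counts, n_top=100):
    """Indices of the n_top genes with the highest mean expression (rows = genes)."""
    return np.argsort(-counts.mean(axis=1), kind="stable")[:n_top]


def pearson_matrix(counts):
    """Gene-by-gene Pearson correlation matrix."""
    return np.corrcoef(counts)


def kendall_matrix(counts):
    """Gene-by-gene Kendall's tau matrix (pairwise loop; adequate for ~100 genes)."""
    n_genes = counts.shape[0]
    tau = np.eye(n_genes)
    for i in range(n_genes):
        for j in range(i + 1, n_genes):
            tau_ij, _ = kendalltau(counts[i], counts[j])
            tau[i, j] = tau[j, i] = tau_ij
    return tau


def complete_linkage_order(similarity):
    """Leaf order of a complete-linkage tree.

    Assumption: the similarity is turned into a distance as 1 - similarity.
    Undefined correlations (NaN, e.g. for a constant gene) are treated as 0.
    """
    distance = 1.0 - np.nan_to_num(similarity)
    distance = (distance + distance.T) / 2.0  # enforce exact symmetry
    np.fill_diagonal(distance, 0.0)
    tree = linkage(squareform(distance, checks=False), method="complete")
    return leaves_list(tree)


def reorder(matrix, order):
    """Apply the same gene order to the rows and columns of a matrix."""
    return matrix[np.ix_(order, order)]
```

In this sketch, the gene set and the gene order would both be computed once from the test data (for example, `genes = top_gene_indices(test_counts)` followed by `order = complete_linkage_order(pearson_matrix(test_counts[genes]))`) and then reused via `reorder` for the training data and for each simulator's synthetic data, so that all heatmaps in a row share the same axes.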