Fig. 6 | Genome Biology

Fig. 6

From: EPISCORE: cell type deconvolution of bulk tissue DNA methylomes from single-cell RNA-Seq data

Proof of EPISCORE concept in peripheral blood. a Dimensional reduction and t-SNE embedding of a 10X dataset profiling 68,000 peripheral blood mononuclear cells (PBMCs), depicting the resulting cell clusters, annotated to PBMC cell types using a bulk RNA-Seq reference matrix for PBMCs. b The expression reference matrix over 5 main PBMC cell subtypes and 96 marker genes, as constructed from the scRNA-Seq data in a. c Two examples of marker genes, where differential expression across blood cell subtypes (as assessed using independent bulk RNA-Seq and DNAm data of purified blood cell types) is associated with corresponding differential DNAm at a CpG mapping to a distal regulatory element of the gene. The y-axis shows DNAm at this CpG, the x-axis shows the normalized gene expression value, and datapoints are shown for all matching blood cell subtypes. A fit from a logistic regression is shown, which is used to impute DNAm at this CpG from the expression values in the scRNA-Seq reference matrix, after suitably normalizing the expression data. d Heatmaps displaying the imputed DNAm reference matrix for 71 marker CpGs (top panel) and the corresponding scRNA-Seq reference matrix for X marker genes associated with these 71 CpGs. e Validation of the imputed DNAm reference matrix using in silico mixtures of PBMC cell types, constructed from Illumina 450k profiles from Reinius et al. For each of the 5 PBMC cell types, we plot the estimated fractions (x-axis) vs. the true fractions (y-axis), for each of 100 in silico mixtures. Mixture weights were simulated from a uniform Dirichlet distribution. PCC values are given. f As e, but with mixture weights drawn from a non-uniform Dirichlet distribution, with mean and variance for each cell type matched to observed data.
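The two computational steps described in panels c and e/f can be illustrated with a minimal sketch. It is not the EPISCORE implementation: the function names, the random stand-in reference matrix, the noise level, and the non-uniform Dirichlet parameters are all hypothetical. The logistic curve is fitted here by least squares on the logit scale, and cell-type fractions are estimated by clipped, renormalized least squares. Both are simple stand-ins that may differ from the procedures used in the paper.

```python
import numpy as np


def fit_logit_linear(expr, dnam, eps=1e-3):
    """Fit DNAm = 1 / (1 + exp(-(intercept + slope * expr))).

    Assumption: fitted by ordinary least squares on logit(DNAm),
    a simple stand-in for a logistic regression fit.
    """
    d = np.clip(np.asarray(dnam, dtype=float), eps, 1.0 - eps)
    logit = np.log(d / (1.0 - d))
    slope, intercept = np.polyfit(np.asarray(expr, dtype=float), logit, 1)
    return slope, intercept


def impute_dnam(expr, slope, intercept):
    """Impute DNAm at a CpG from (normalized) expression values."""
    return 1.0 / (1.0 + np.exp(-(intercept + slope * np.asarray(expr))))


def simulate_mixtures(reference, n_mixtures=100, alpha=None,
                      noise_sd=0.0, seed=0):
    """Build in silico mixtures from a (n_cpgs, n_celltypes) DNAm matrix.

    Weights are drawn from Dirichlet(alpha); alpha=None gives the
    uniform Dirichlet (all ones).
    """
    rng = np.random.default_rng(seed)
    n_types = reference.shape[1]
    if alpha is None:
        alpha = np.ones(n_types)
    weights = rng.dirichlet(alpha, size=n_mixtures)   # (n_mixtures, n_types)
    mixtures = reference @ weights.T                  # (n_cpgs, n_mixtures)
    if noise_sd > 0:
        mixtures = mixtures + rng.normal(0.0, noise_sd, size=mixtures.shape)
    return np.clip(mixtures, 0.0, 1.0), weights


def estimate_fractions(mixtures, reference):
    """Estimate fractions by least squares, then clip and renormalize."""
    coef, *_ = np.linalg.lstsq(reference, mixtures, rcond=None)
    coef = np.clip(coef, 0.0, None)
    coef = coef / coef.sum(axis=0, keepdims=True)
    return coef.T                                     # (n_mixtures, n_types)


# Hypothetical stand-in for an imputed reference: 71 CpGs x 5 cell types.
rng = np.random.default_rng(1)
reference = rng.uniform(0.05, 0.95, size=(71, 5))

# Panel e analogue: uniform Dirichlet weights, 100 mixtures.
mixtures, true_w = simulate_mixtures(reference, n_mixtures=100, noise_sd=0.02)
est_w = estimate_fractions(mixtures, reference)
pcc = [np.corrcoef(est_w[:, k], true_w[:, k])[0, 1] for k in range(5)]
print("PCC per cell type (uniform):", np.round(pcc, 3))

# Panel f analogue: non-uniform Dirichlet; these alpha values are made up.
mixtures_nu, true_w_nu = simulate_mixtures(
    reference, n_mixtures=100, alpha=np.array([5.0, 3.0, 2.0, 1.0, 1.0]),
    noise_sd=0.02, seed=2)
est_w_nu = estimate_fractions(mixtures_nu, reference)
```

In this sketch the Pearson correlation is computed separately for each cell type across the 100 mixtures, mirroring the per-cell-type scatterplots described in the caption.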
