Illustration of how blood composition drives observed age differences. (A) Heatmap of the cell-sorted data shows clear and consistent DNAm profiles for each cell type. We show the 600 probes selected for estimating composition proportions, which are used here to demonstrate the differences. (B) To simplify the illustration, we selected a section of (A) displaying only the two most abundant cell types: CD4+ T cells and granulocytes. (C) Heatmap of 30 randomly selected whole blood samples (from the data in Additional file 1) across three age groups (10 per group): between 1 and 5 years of age, between 30 and 40 years, and greater than 60 years. The same probes as in (B) are used. When the samples are ordered by their estimated granulocyte proportion, they roughly cluster by age, and a pattern similar to that in (B) is observed. The estimated cell count proportions for each sample are shown below the heatmap. Note the strong confounding between age and cell composition. (D) For the two samples highlighted with an arrow in (C), we show how a weighted average of the cell type profiles can reconstruct the observed DNAm profiles. The numbers shown are the estimated proportions. Note how different weights (cell count proportions) for old and young samples result in very different observed DNAm patterns, and that differences in CD4+ T cell and granulocyte proportions drive much of the difference in DNAm. NK, CD56+ natural killer cells; CD8T, CD8+ T cells; CD4T, CD4+ T cells; Gran, granulocytes; Bcell, CD19+ B cells; Mono, CD14+ monocytes; DNAm, proportion of DNA methylation at individual CpGs (Illumina 'beta' values, bounded between 0 and 1); Prop, cell count proportion, between 0 and 1 for each component, such that they sum to 1.
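
The weighted-average reconstruction described in (D) can be sketched in a few lines. This is a minimal illustration, not the analysis code behind the figure: the function name `mix_profiles`, the three probes, and all beta values and proportions are hypothetical and chosen only to make the arithmetic easy to follow. It assumes that the DNAm observed in a mixed sample at each probe is the sum of the cell type profiles weighted by their cell count proportions.

```python
import numpy as np

# Hypothetical mean DNAm (beta values) at three probes for two cell types.
# Rows are probes; columns are cell types in the order (CD4T, Gran).
# These values are illustrative only and are not taken from the data.
cell_profiles = np.array([
    [0.9, 0.1],   # probe methylated mainly in CD4+ T cells
    [0.2, 0.8],   # probe methylated mainly in granulocytes
    [0.5, 0.5],   # probe that does not distinguish the two cell types
])


def mix_profiles(profiles, proportions):
    """Return the DNAm profile expected for a mixture of cell types.

    profiles:    array of shape (n_probes, n_cell_types) of beta values
    proportions: cell count proportions, one per cell type, summing to 1
    """
    proportions = np.asarray(proportions, dtype=float)
    if not np.isclose(proportions.sum(), 1.0):
        raise ValueError("proportions must sum to 1")
    # Each probe's expected value is a proportion-weighted average of the profiles.
    return profiles @ proportions


# Two hypothetical samples that differ only in cell composition.
sample_a = mix_profiles(cell_profiles, [0.6, 0.4])
sample_b = mix_profiles(cell_profiles, [0.2, 0.8])

print(sample_a)
print(sample_b)
```

In this sketch the two samples share identical cell type profiles, yet their mixed profiles differ at every probe that distinguishes the cell types, while the non-discriminating probe is unchanged. This is the sense in which composition alone can produce apparent DNAm differences.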