Aging of blood can be tracked by DNA methylation changes at just three CpG sites

Background: Human aging is associated with DNA methylation changes at specific sites in the genome. These epigenetic modifications may be used to track donor age for forensic analysis or to estimate biological age.

Results: We perform a comprehensive analysis of methylation profiles to narrow down 102 age-related CpG sites in blood. We demonstrate that most of these age-associated methylation changes are reversed in induced pluripotent stem cells (iPSCs). Methylation levels at three age-related CpGs - located in the genes ITGA2B, ASPA and PDE4C - were subsequently analyzed by bisulfite pyrosequencing of 151 blood samples. This epigenetic aging signature facilitates age predictions with a mean absolute deviation from chronological age of less than 5 years. This precision is higher than that of age predictions based on telomere length. Variation of age predictions correlates moderately with clinical and lifestyle parameters, supporting the notion that age-associated methylation changes are associated more with biological age than with chronological age. Furthermore, patients with acquired aplastic anemia or dyskeratosis congenita - two diseases associated with progressive bone marrow failure and severe telomere attrition - are predicted to be prematurely aged.

Conclusions: Our epigenetic aging signature provides a simple biomarker to estimate the state of aging in blood. Age-associated DNA methylation changes are counteracted in iPSCs. On the other hand, over-estimation of chronological age in bone marrow failure syndromes is indicative of exhaustion of the hematopoietic cell pool. Thus, epigenetic changes upon aging seem to reflect biological aging of blood.

These data were generated on the HumanMethylation 450k BeadChip, which comprises 99 of our 102 AR-CpGs. Notably, most of the features on this platform are based on a slightly modified DNAm assay (Infinium II beadtype) [9]. (c) When we calculated age predictions based on these 99 CpG sites, they clearly correlated with chronological age (grey dots). However, the linear offset indicated a systematic bias, which might be due to the three missing CpG sites or to the different assay design of the two microarray platforms. A multivariate model based on the 99 CpGs of the 450k BeadChip facilitated reliable age predictions (MAD: 4.12 years; black), indicating that these CpGs are clearly age-associated. MAD = mean absolute deviation.

Figure S6: Flowchart for selection of the Epigenetic-Aging-Signature.
Selection of three CpG sites from 102 age-associated pre-filtered loci for pyrosequencing. IQR = interquartile range.

Figure S7: DNAm level at CpGs in the neighborhood of 5 AR-CpG sites.
For the five top AR-CpGs we analyzed DNAm at 10 CpG sites located up- and downstream on the 450k BeadChip. DNAm was analyzed in blood samples of different age groups [8] and in relation to embryonic stem cells (ESCs) [11]. The distance to the corresponding AR-CpG site is depicted in base pairs. Grey arrows indicate orientation of the genes. AR-DNAm changes were often also observed in surrounding CpGs, but they are not necessarily associated with promoter regions. Notably, the trend of AR-DNAm changes towards younger donors continues in ESCs in each of these genomic regions.

Figure S8: Gene expression of selected genes with age-related CpG sites.
Gene expression profiles of whole blood samples from the Leiden Longevity Study (150 samples; GSE16717) [12] were analyzed for the top 5 AR-CpG loci. ITGA2B and ASPA were hardly expressed, which is indicated in red. Overall, DNAm changes upon aging were hardly reflected in expression of these genes.

Figure S9: DNAm level in different blood subsets.
Results of the Epigenetic-Aging-Signature might be due to myeloid skewing - a phenomenon particularly observed in the elderly [13]. (a) To address this, DNAm profiles of various blood cell types from a dataset of Reinius and coworkers (GSE35069) [14] were analyzed for the three AR-CpG sites of the Epigenetic-Aging-Signature (n = 6; age 38 ± 13.6 years). DNAm at cg02228185 was at a constant level in all blood subsets. Myeloid cells revealed a slightly lower methylation level than lymphoid cells for the CpG sites cg25809905 and cg17861230. As cg17861230 is hypermethylated upon aging (Figure 2d), although myeloid cells show lower methylation at this site, these results indicate that AR-DNAm is not due to myeloid skewing. (b) In addition, age predictions were calculated on available DNAm profiles of different cell types in peripheral blood (PB) and cord blood (CB) (GSE20242 [2] and E-MTAB-487 [15]). For this analysis the multivariate model, based on 3 CpG sites of pyrosequencing data, had to be adapted because the Illumina BeadChips do not cover the neighboring CpG site of cg17861230 (Figure 2c). To this end, we recalculated the multivariate model based on the initial 575 DNAm profiles (Table S1; GSE20242 was also included in this training set, but the results are very similar if it is excluded from this analysis): cg02228185 = α, cg25809905 = β and cg17861230 = γ. The formula for age prediction is: predicted age [in years] = 96.6 - 56.8 α - 64.2 β + 90.5 γ. Overall, age prediction was feasible in all data of isolated hematopoietic subsets that we analyzed, even though the deviation was larger than in pyrosequencing data of whole blood. CB samples were always predicted close to 0. Thus, AR-DNAm changes at the three CpGs cannot be attributed solely to the changing composition of blood upon aging.

Figure S10: Influence of blood cell composition on age prediction.
Lam and coworkers provided DNAm profiles of 99 blood samples (age 24 - 45) including information on blood counts (GSE37008) [6].
(a) In this age range blood counts revealed only very moderate changes upon aging, but there was high variation in the percentage of white blood cells (wbc) between individual samples. (b) To further analyze whether the composition of specific cell types has a major impact on age predictions, we applied the multivariate model based on 3 CpG sites (multivariate model of Figure S9 for Illumina BeadChip data). Overall, ages were overestimated in this dataset (MAD = 7.57 years), which might be due to technical differences between the studies being compared. Increased percentages of monocytes (c), lymphocytes (d), neutrophils (e), basophils (f), or eosinophils (g) did not reveal a systematic effect on age prediction.
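The three-CpG BeadChip model quoted in the legend of Figure S9, and the mean absolute deviation (MAD) used to score it, can be written out as a short Python sketch. The coefficients are taken from the formula as printed; the function names are illustrative, and the sketch assumes that DNAm levels are supplied as fractions between 0 and 1 (BeadChip beta values), which the legend does not state explicitly.

```python
from statistics import mean

# Coefficients as printed in the Figure S9 legend (Illumina BeadChip version of the model).
INTERCEPT = 96.6
COEF_ALPHA = -56.8  # alpha: DNAm level at cg02228185
COEF_BETA = -64.2   # beta:  DNAm level at cg25809905
COEF_GAMMA = 90.5   # gamma: DNAm level at cg17861230


def predict_age(alpha, beta, gamma):
    """Return predicted age in years.

    Assumption: DNAm levels are fractions in [0, 1], not percentages.
    """
    return INTERCEPT + COEF_ALPHA * alpha + COEF_BETA * beta + COEF_GAMMA * gamma


def mean_absolute_deviation(predicted, chronological):
    """Mean absolute deviation (MAD) between predicted and chronological ages."""
    if len(predicted) != len(chronological):
        raise ValueError("predicted and chronological must have the same length")
    return mean(abs(p - c) for p, c in zip(predicted, chronological))


if __name__ == "__main__":
    # Hypothetical DNAm levels, for illustration only (not data from the study).
    print(predict_age(alpha=0.30, beta=0.60, gamma=0.10))
```

Because the model is linear, a constant offset between platforms or studies shifts every prediction by the same amount, which is one way a systematic over-estimation such as the one described for this dataset could arise.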

Figure S11: Effect of clinical and lifestyle parameters on age-predictions.
The deviation of predicted from chronological age (residuals) was correlated with the following parameters in 105 samples from the HNR study: years of education (a; P = 0.37) [16], coronary-artery calcium (CAC) score (b; P = 0.79) [17], sports (c; no sports or at least once a week; P = 0.60) or physical activity (d; P = 0.62) [18], smoker status (e; P = 0.33) and pack-years (f; P = 0.89), cholesterol level (g; P = 0.82) and systolic blood pressure (h; P = 0.34). None of the presented parameters was significantly associated with the deviation of predicted from chronological age.

Telomere length in relation to age-adjusted mean - given as delta telomere length - is presented for 104 healthy controls, 15 patients with aplastic anemia (AA) and 5 patients with dyskeratosis congenita (DKC). Significance was calculated by two-sided Student's t-test. MAD = mean absolute deviation of predicted and chronological age.
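The residual analysis described for Figure S11 can be illustrated with a minimal Python sketch. Two points are assumptions rather than details given in the legend: residuals are taken as predicted minus chronological age, and the association is measured with Pearson's correlation coefficient (the legend does not name the correlation method). Function names are illustrative, and P values are not computed here.

```python
from math import sqrt


def residuals(predicted, chronological):
    """Deviation of predicted from chronological age (assumed sign: predicted - chronological)."""
    if len(predicted) != len(chronological):
        raise ValueError("inputs must have the same length")
    return [p - c for p, c in zip(predicted, chronological)]


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)
```

A coefficient near zero between the residuals and a parameter such as systolic blood pressure would be consistent with the non-significant associations reported in the legend.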