Mapping and validating distal regulatory sites using DNA methylation. (A) Mapping strategy: a model for methylation-expression relationships in gene promoters was applied to VMSs from 1 Mb upstream through 1 Mb downstream of 17,862 genes. (B) Distribution of methylation-versus-expression levels for high-scoring gene-CpG pairs (score ≥0.9, n = 2,824). (C) Relative enrichment of chromatin factors around the high-scoring methylation sites (n = 1,911), excluding sites in the promoters of the associated genes. Data were normalized to a 0-to-1 scale. (D) Fold enrichment of methylation sites (n = 2,824) in actual gene intervals (real), versus the null expectation based on random permutations of gene expression data (random), of chromatin states defined by the ChromHMM algorithm. (E) Left: Number of transcription factors binding to the high-scoring sites, compared with random expectations. Right: Number of transcription factors binding to unmethylated or methylated enhancers, compared with random expectations. Averages across the four cell types for which methylation and binding data were available (GM12878, HepG2, HeLaS3, K562) are shown. Sites in the promoters of the associated genes were excluded. (F) Evolutionary sequence conservation around the top-scoring sites. The analyses shown in D and E excluded all sites within ±5 kb of TSSs. TF, transcription factor.