Tissue of origin determines cancer-associated CpG island promoter hypermethylation patterns

Background Aberrant CpG island promoter DNA hypermethylation is frequently observed in cancer and is believed to contribute to tumor progression by silencing the expression of tumor suppressor genes. Previously, we observed that promoter hypermethylation in breast cancer reflects cell lineage rather than tumor progression and occurs at genes that are already repressed in a lineage-specific manner. To investigate the generality of our observation, we analyzed the methylation profiles of 1,154 cancers from 7 different tissue types.

Results We find that 1,009 genes are prone to hypermethylation in these 7 types of cancer. Nearly half of these genes varied in their susceptibility to hypermethylation between different cancer types. We show that the expression status of hypermethylation-prone genes in the originator tissue determines their propensity to become hypermethylated in cancer; specifically, genes that are normally repressed in a tissue are prone to hypermethylation in cancers derived from that tissue. We also show that the promoter regions of hypermethylation-prone genes are depleted of repetitive elements and that the DNA sequence around the same promoters is evolutionarily conserved. We propose that these two characteristics reflect tissue-specific gene promoter architecture regulating the expression of these hypermethylation-prone genes in normal tissues.

Conclusions As aberrantly hypermethylated genes are already repressed in pre-cancerous tissue, we suggest that their hypermethylation does not directly contribute to cancer development via silencing. Instead, aberrant hypermethylation reflects developmental history and the perturbation of epigenetic mechanisms maintaining these repressed promoters in a hypomethylated state in normal cells.

Table S1: Promoter hypermethylation frequencies at known tumour suppressor genes
Figure S1: Methylation levels at hypermethylation-prone genes vary between cancer types

Table S1 Promoter hypermethylation frequencies at known tumour suppressor genes
We created a list of known tumour suppressor genes that, when mutated in the germline, are associated with predisposition to cancer. This list was then limited to genes that had probes within CGIs and within 200 bp of TSSs. Reported are these probe IDs and the percentage of tumours in each tissue that were methylated (mean beta > 0.3). Br = Breast, Col = Colorectal, Pro = Prostate, Glio = Glioblastoma, Lung = Lung, Ovr = Ovarian.
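The frequency reported in Table S1 can be illustrated with a minimal sketch. It assumes that "mean beta" is the mean of a tumour's beta values across the qualifying probes of a gene; the legend does not spell this out, and the function name `percent_methylated` and the toy values are hypothetical, not taken from the study.

```python
from statistics import mean

# Threshold taken from the Table S1 legend (mean beta > 0.3).
BETA_THRESHOLD = 0.3


def percent_methylated(tumour_betas, threshold=BETA_THRESHOLD):
    """Return the percentage of tumours called methylated at one gene.

    tumour_betas: list of per-tumour lists of probe beta values.
    Assumption: a tumour is methylated when the mean of its probe
    betas is strictly greater than the threshold.
    """
    if not tumour_betas:
        raise ValueError("need at least one tumour")
    n_methylated = sum(1 for betas in tumour_betas if mean(betas) > threshold)
    return 100.0 * n_methylated / len(tumour_betas)


# Hypothetical toy data: tissue -> tumours -> probe betas for a single gene.
example = {
    "Br": [[0.10, 0.20], [0.50, 0.40], [0.35, 0.45], [0.05, 0.10]],
    "Col": [[0.80, 0.90], [0.70, 0.60]],
}

frequencies = {tissue: percent_methylated(t) for tissue, t in example.items()}
print(frequencies)
```

In this toy input, two of the four breast tumours and both colorectal tumours exceed the threshold.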

Figure S1 Methylation levels at hypermethylation-prone genes vary between cancer types
A. Numbers of frequently hypermethylated genes vary between tumour types. Shown is a bar graph of the number of frequently hypermethylated genes found in each of the 7 tumour types analysed.
B. Methylation levels at hypermethylation-prone genes vary between tumour types. Shown is a boxplot of the median methylation levels found at the 1,009 hypermethylation-prone genes in the 7 tumour types analysed.

Genes frequently hypermethylated in multiple cancer types have regulated expression patterns in normal tissues
Histograms showing the distribution of tissue-specificity scores observed for different gene sets. Specificity scores for different gene sets were compared using a Wilcoxon rank sum test as indicated (*** p < 0.001). All panels are as Figure 2A but use alternative gene sets.
A. Genes prone to hypermethylation in multiple cancer types. Gene sets are as Figure 2A but specificity scores were calculated from microarray expression data rather than RNA-seq.
B. Genes prone to hypermethylation as defined using alternative parameters. On the left, the threshold used to define genes as hypermethylated was varied (see Methods for details). On the right, the frequency of hypermethylation required for a gene to be defined as frequently hypermethylated in a given cancer was varied.
C. Genes frequently hypermethylated in each of the individual cancer types examined in this study. Methylation resistant genes were defined as genes never methylated in that cancer type.
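The gene set comparisons in this figure rely on a Wilcoxon rank sum test. As a rough illustration of what such a test computes, the sketch below uses a two-sided normal approximation with a tie correction and no continuity correction. This is an assumption made for illustration, not necessarily the exact variant used in the study; the function names and the toy specificity scores are hypothetical.

```python
import math


def average_ranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def rank_sum_test(x, y):
    """Return (w, z, p): rank sum of x, z-score and two-sided p-value."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    combined = list(x) + list(y)
    ranks = average_ranks(combined)
    w = sum(ranks[:n1])            # rank sum of the first sample
    mu = n1 * (n + 1) / 2          # expected rank sum under the null
    counts = {}
    for v in combined:
        counts[v] = counts.get(v, 0) + 1
    tie_term = sum(t ** 3 - t for t in counts.values())
    var = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var <= 0:
        raise ValueError("variance is zero; all values are identical")
    z = (w - mu) / math.sqrt(var)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return w, z, p


# Hypothetical specificity scores for two gene sets.
prone = [0.90, 0.80, 0.85, 0.70, 0.95]
resistant = [0.20, 0.30, 0.10, 0.40, 0.25]
w, z, p = rank_sum_test(prone, resistant)
print(w, round(z, 3), p < 0.05)
```

With samples this small an exact test would normally be preferred; the normal approximation is used here only to keep the sketch short and dependency-free.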

Repeat densities and evolutionary conservation do not determine hypermethylation susceptibility in cancer
A. Transcriptional start sites are depleted of repetitive elements. Shown are graphs of the frequency of LINEs, SINEs and LTRs at 1 kb intervals around CGI or non-CGI TSSs (as Figure 3A).
B. Hypermethylation-prone promoter regions are evolutionarily conserved. Shown are graphs of the level of conservation found in 500 bp intervals around genes that are hypermethylated in colorectal tumours as defined by alternative profiling methods (as Figure 3B) [34][35][36]. The significance of observed differences between hypermethylated and non-hypermethylated genes was assessed using a Wilcoxon rank sum test for the scores ±2 kb from the TSSs (*** p < 0.001).
C. Repeat densities do not determine hypermethylation susceptibility. Shown are heatmaps indicating the presence (red) or absence (white) of repeats in the region around hypermethylation-prone and hypermethylation-resistant TSSs. Repeat presence was assessed in 1 kb intervals ±5 kb from each TSS, and TSSs are ordered by their repeat density in this region.
D. Evolutionary conservation does not determine hypermethylation susceptibility. Shown are boxplots of the degree of evolutionary conservation found ±2 kb from hypermethylation-prone and hypermethylation-resistant TSSs (measured as Figure 3B). The significance of differences between the distributions was assessed using Wilcoxon rank sum tests.
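The presence/absence matrix behind the heatmaps in panel C could be built along the following lines. This is a minimal sketch under stated assumptions: repeats are given as half-open (start, end) intervals on the same chromosome as the TSS, strand is ignored, and "repeat density" is taken to be the number of 1 kb bins containing any repeat. The function names and coordinates are hypothetical.

```python
WINDOW = 5000  # bp either side of the TSS, from the legend
BIN = 1000     # bin width in bp, from the legend


def presence_row(tss, repeats):
    """Return one 0/1 flag per 1 kb bin from tss - 5 kb to tss + 5 kb.

    repeats: list of half-open (start, end) intervals on the same
    chromosome as the TSS. Strand is ignored in this sketch.
    """
    n_bins = 2 * WINDOW // BIN
    row = [0] * n_bins
    for b in range(n_bins):
        bin_start = tss - WINDOW + b * BIN
        bin_end = bin_start + BIN
        # A bin is marked present if any repeat overlaps it.
        if any(start < bin_end and end > bin_start for start, end in repeats):
            row[b] = 1
    return row


def ordered_matrix(tss_positions, repeats):
    """Return (name, row) pairs ordered by repeat density, densest first."""
    rows = {name: presence_row(pos, repeats) for name, pos in tss_positions.items()}
    return sorted(rows.items(), key=lambda item: sum(item[1]), reverse=True)


# Hypothetical coordinates for two TSSs and three repeat elements.
tss_positions = {"geneB": 50000, "geneA": 10000}
repeats = [(5200, 5500), (11900, 12100), (48000, 48300)]

matrix = ordered_matrix(tss_positions, repeats)
for name, row in matrix:
    print(name, row)
```

Each row corresponds to one line of the heatmap, with 1 drawn as red and 0 as white.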