Deep C diving: mapping the low-abundance modifications of the DNA demethylation pathway
© BioMed Central Ltd 2013
Published: 29 May 2013
Two new studies imply that the reprogramming of 5-methylcytosine via TET- and TDG-family enzymes is both widespread throughout the genome and functionally significant.
Genome-wide patterns of 5mC, and now 5hmC, are becoming well characterized in a host of cell and tissue types with ever-increasing complexity, driven by recent technological advances [3, 4]. In contrast, the distributions of 5fC- and 5caC-modified sequences remain poorly understood, largely because accurate methods to detect these low-abundance modifications have been lacking; mass spectrometry indicates that 5fC is at 2% and 5caC at 0.5% of the levels of 5hmC in mouse embryonic stem cells (mESCs), which in turn is only 4% as abundant as 5mC.
Two recent studies report novel techniques for mapping both 5fC and 5caC modifications, and address the functionality of the TET/TDG 5mC oxidation events that occur throughout the genome [6, 7]. Using highly specific antibodies raised against 5fC and 5caC, researchers led by Yi Zhang at Harvard University map the genome-wide distributions of both derivatives of 5hmC. In an analogous set of experiments, Song and colleagues from the laboratory of Chuan He at the University of Chicago expand upon their already successful chemical capture techniques for enriching 5hmC-marked DNA. In short, they first modify all endogenous 5hmC by glucosylation, then specifically reduce 5fC-marked cytosines to 5hmC by adding sodium borohydride (NaBH4), and finally glucosylate these newly generated sites with a modified glucose group (6-azide-glucose) to which a disulfide biotin linker is attached for subsequent enrichment. In addition, the group also adapt techniques to visualize the 5fC modification at single-base resolution (fCAB-seq), overcoming the problem that traditional bisulfite-based mapping cannot discriminate between the modified forms of cytosine. By employing these novel techniques, both studies report the genome-wide patterns of 5fC, and in the Zhang study also 5caC, in wild-type (WT) mESCs [6, 7].
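The logic of the chemical capture scheme described above can be illustrated with a toy model. The following is only a sketch under strong assumptions: every reaction is treated as complete and perfectly specific, and the names `Base`, `convert` and `biotin_tagged_positions` are hypothetical labels introduced here, not part of either study's software.

```python
from enum import Enum


class Base(Enum):
    """Cytosine states tracked in this toy model (labels are illustrative)."""
    C = "C"
    MC = "5mC"
    HMC = "5hmC"
    FC = "5fC"
    CAC = "5caC"
    GLC_HMC = "glucose-5hmC"            # endogenous 5hmC after glucosylation
    AZIDE_HMC = "azide-glucose-5hmC"    # 5hmC carrying 6-azide-glucose
    BIOTIN_HMC = "biotin-azide-glucose-5hmC"


def convert(bases, source, target):
    """Idealized reaction: every `source` base becomes `target` (assumed 100% efficient)."""
    return [target if b is source else b for b in bases]


def biotin_tagged_positions(bases):
    """Return positions that end up biotin-tagged after the four steps."""
    step1 = convert(bases, Base.HMC, Base.GLC_HMC)           # modify endogenous 5hmC
    step2 = convert(step1, Base.FC, Base.HMC)                # NaBH4 reduces 5fC to 5hmC
    step3 = convert(step2, Base.HMC, Base.AZIDE_HMC)         # glucosylate with 6-azide-glucose
    step4 = convert(step3, Base.AZIDE_HMC, Base.BIOTIN_HMC)  # attach the biotin linker
    return [i for i, b in enumerate(step4) if b is Base.BIOTIN_HMC]


fragment = [Base.C, Base.MC, Base.HMC, Base.FC, Base.CAC, Base.FC]
print(biotin_tagged_positions(fragment))  # [3, 5]
```

In this idealized model only the positions that started as 5fC carry biotin at the end, because endogenous 5hmC was already occupied by ordinary glucose before the reduction step.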
Typically, sequence reads for both modifications are few in WT mESCs, consistent with their low abundance, but there is a suggestion of moderate levels of 5fC at repeat regions. The overall genomic distribution of 5fC and 5caC appears to be distinct from that of 5hmC in WT cells, but this view should be interpreted with caution because of the smaller number of reads for 5fC and 5caC compared with 5hmC. Both studies recognized that raising 5fC and 5caC levels in cells would improve data interpretation, so they devised similar biological strategies to improve the signal-to-noise ratio of their respective assay systems.
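The small read numbers follow from the mass spectrometry ratios quoted earlier. As a quick worked calculation (the variable names are illustrative, and the ratios are simply multiplied through without accounting for measurement error), the levels of 5fC and 5caC can be expressed relative to 5mC:

```python
# Ratios quoted in the text for mouse embryonic stem cells.
FC_PER_HMC = 0.02    # 5fC is at 2% of the level of 5hmC
CAC_PER_HMC = 0.005  # 5caC is at 0.5% of the level of 5hmC
HMC_PER_MC = 0.04    # 5hmC is 4% as abundant as 5mC

# Chain the ratios to express each mark relative to 5mC.
fc_per_mc = FC_PER_HMC * HMC_PER_MC
cac_per_mc = CAC_PER_HMC * HMC_PER_MC

print(f"5fC relative to 5mC:  {fc_per_mc:.2%}")   # 0.08%
print(f"5caC relative to 5mC: {cac_per_mc:.2%}")  # 0.02%
print(f"About one 5fC per {round(1 / fc_per_mc)} 5mC")    # 1250
print(f"About one 5caC per {round(1 / cac_per_mc)} 5mC")  # 5000
```

On these figures, 5fC and 5caC are present at well under a tenth of a percent of the 5mC level.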
As the 5fC and 5caC derivatives are believed to be committed to rapid removal by base excision repair-mediated mechanisms involving the protein TDG, the patterns of these two marks at steady state may not accurately reflect where demethylation is dynamically occurring in WT cells. To solve this problem, TDG protein was reduced to low levels in mESCs, either by short hairpin RNA interference or through genetic manipulation, to allow the accumulation of both demethylation intermediates following TET-mediated oxidation of 5mC and 5hmC (Figure 1b). This increased the absolute levels of each modification and improved data quality and interpretation. Upon loss of TDG activity, many ectopic regions of 5fC and 5caC become apparent over genic and promoter-proximal regions; this contrasts with an earlier study that found 5fC enrichment in CpG islands (CGIs) of promoters and exons using a different assay technique. The earlier study suggested that CGI promoters in which 5fC was relatively more enriched than 5mC or 5hmC corresponded to transcriptionally active genes. In the present studies, upon relating the TDG-mediated changes of 5fC and 5caC to the transcriptional activities of associated genes, both groups suggest that TDG-mediated 5fC/5caC excision occurs preferentially at transcriptionally inactive promoters, implying a potential inhibitory role for the oxidative products at promoter-proximal regions. No doubt these differing views will be amicably resolved in the future.
Many of the ectopic 5fC and 5caC peaks were found to correspond to regions bound by transcription factors such as Oct4 and Nanog, which themselves play key roles in the maintenance of pluripotency, as well as to sites of Polycomb-group protein binding. These results imply that TET/TDG-mediated 5mC oxidation may be a key event in the targeting of chromatin-modifying proteins and transcription factors to specific loci. Interestingly, both studies report that upon TDG reduction/removal, the majority of ectopic 5fC and 5caC is found at non-repetitive regions of the genome outside of promoters and exons, particularly over enhancer elements. After inhibition of TDG activity, the genomic distribution patterns of 5caC and 5fC are comparable with those of 5hmC, which was not so obvious for the WT cells. Closer analysis reveals a strong enrichment for both 5fC and 5caC at poised (H3K4me1 but not H3K27ac marked) enhancer elements, implying that 5mC oxidation may be crucial for the priming of such regulatory regions. Comparison with transcription factor binding site data indicated that TDG-dependent regulation of 5fC occurs preferentially at Tet1-, Tet2-, p300- and CTCF-binding regions in mESCs.
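The definition of a poised enhancer used above (H3K4me1 present, H3K27ac absent) can be written as a simple rule. This is a minimal sketch: the function name, the "active" and "other" labels, and the example regions are assumptions made for illustration. Only the poised definition comes from the text; treating regions with both marks as active follows common usage rather than anything stated here.

```python
def enhancer_state(h3k4me1: bool, h3k27ac: bool) -> str:
    """Classify a region from the presence or absence of two histone marks."""
    if h3k4me1 and not h3k27ac:
        return "poised"   # definition given in the text
    if h3k4me1 and h3k27ac:
        return "active"   # assumption: common usage, not stated in the text
    return "other"        # lacks H3K4me1, so not classed as an enhancer here


# Hypothetical regions: (H3K4me1 present, H3K27ac present)
regions = {
    "region_1": (True, False),
    "region_2": (True, True),
    "region_3": (False, False),
}

states = {name: enhancer_state(*marks) for name, marks in regions.items()}
print(states)
```

In practice the presence or absence of each mark would come from peak calls on ChIP-seq data, which involves thresholds that this sketch does not model.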
As TET/TDG-mediated changes to cytosine modification states have now been shown to occur over a large number of genes and regulatory elements, this work reveals the potential for active DNA demethylation throughout the genome. Functionally, it is difficult to interpret how such modifications affect the overall epigenomic and transcriptomic landscape of the cells. The relationship between transcriptional state and DNA demethylation appears to be a complex affair. Upon depletion of TDG, only a small proportion of genes actually change in their expression state (99 genes with P-values <0.01 and a fold change >1.5-fold; or 1,192 genes with P-values <0.01 alone). In contrast, relative global changes in the levels of both 5fC and 5caC are extensive. Mass spectrometry analysis indicates that global levels of 5fC and 5caC increased by 5.6-fold and 8.4-fold, respectively, in response to TDG knockdown; 5mC and 5hmC levels were not altered. Furthermore, ectopic peaks of 5fC and 5caC accumulate outside of promoters and enhancers, such as those at the 3′ ends of genes, at sites that do not align to annotated regions of TDG binding. As such, other proteins may be able to facilitate the base excision repair of the oxidative products of 5mC/5hmC in the absence of TDG.
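The two gene counts quoted above come from applying either one or two filters to the same expression comparison. The sketch below shows how such filters combine. The gene names and values are invented for illustration, and the fold-change cutoff is assumed to apply in both directions (up or down), which the text does not specify.

```python
import math
from typing import NamedTuple


class GeneResult(NamedTuple):
    name: str
    p_value: float
    fold_change: float  # expression ratio, TDG-depleted / control (hypothetical)


def passes_p(gene: GeneResult, p_cutoff: float = 0.01) -> bool:
    """Significance filter alone."""
    return gene.p_value < p_cutoff


def passes_p_and_fold(gene: GeneResult, p_cutoff: float = 0.01,
                      fc_cutoff: float = 1.5) -> bool:
    """Significance plus magnitude; assumes a symmetric cutoff on the log2 scale."""
    big_change = abs(math.log2(gene.fold_change)) > math.log2(fc_cutoff)
    return passes_p(gene, p_cutoff) and big_change


# Invented example data, not taken from either study.
results = [
    GeneResult("geneA", 0.001, 2.0),  # significant, large increase
    GeneResult("geneB", 0.005, 1.2),  # significant, small change
    GeneResult("geneC", 0.200, 3.0),  # large change, not significant
    GeneResult("geneD", 0.002, 0.5),  # significant, large decrease
]

p_only = [g.name for g in results if passes_p(g)]
p_and_fold = [g.name for g in results if passes_p_and_fold(g)]
print(p_only)      # ['geneA', 'geneB', 'geneD']
print(p_and_fold)  # ['geneA', 'geneD']
```

Adding the fold-change requirement can only shrink the list, which is why the count drops from 1,192 to 99 genes in the reported data.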
In view of the low levels of these marks, it is impressive how comparable many of the conclusions are between the two studies, particularly as antibody-based enrichment of low-abundance proteins and DNA modifications is challenging when compared with chemical capture-based techniques (Figure 1b). Although semiquantitative, the relative enrichments of the modifications (particularly in TDG-depleted/knock-out cell lines) suggest that the marks may be either snapshots of active demethylation at key regulatory regions or 'memories' of recent transcription events. The impression is of a poised environment that is permissive for rapid transcription upon the binding of relevant factors, a feature that would be highly relevant to pluripotent cells undergoing developmentally induced reprogramming changes in response to signaling cascades. It will be interesting to determine the genome-wide patterns of both 5fC and 5caC in somatic samples containing globally higher levels of 5hmC modifications. However, the data suggest that it will be a challenge to detect these low-abundance modifications in WT cells without first blocking endogenous base excision repair, but perhaps there are more surprises to come.
fCAB-seq: formyl-chemically assisted bisulfite sequencing
mESC: mouse embryonic stem cell
TDG: thymine DNA glycosylase
Thanks to Dr Colm Nestor (Linköping University Hospital, Sweden) for insightful comments. Many thanks to Keith Szulwach (School of Medicine, University of Chicago, IL, USA) and Hao Wu (Harvard University, Cambridge, MA, USA) for providing 5fC and 5caC datasets. JT is a recipient of IMI-MARCAR funded career development fellowships at the MRC HGU. RM and JH are supported by the Medical Research Council. Work in RM's laboratory is supported by the MRC, IMI-MARCAR and the BBSRC.