Circadian A/B chromatin compartments switch between open and closed configurations throughout the day
To study global genome architecture during a circadian cycle, we performed in-nucleus Hi-C (see “Methods”) on mouse adult liver at four different timepoints of the circadian cycle (ZT0, 6, 12 and 18), with ZT0 and ZT12 being the start of the light and dark phase, respectively, in three biological replicates. Samples of individual livers were processed in parallel for RNA-seq. We produced high-quality Hi-C data sets with a high percentage of valid pairs (~ 80%), low PCR duplicates (less than 2%), and high cis:trans interaction ratios obtained (~ 80:20%) (Table S1). In total, we obtained ~ 2 billion valid Hi-C read pairs from mouse adult liver across a circadian cycle (Table S1).
To detect “open,” transcriptionally active and “closed,” silent genomic compartments (A and B compartments, respectively), we performed PCA analysis on Hi-C data at different timepoints throughout the circadian cycle, at 100-kb bin resolution. PCA analysis is used to analyze high dimensional data and we redefine them, with as few dimensions as possible that explain most of the variance in the data. As such, the 1st principal component (PC1) will explain the majority of the variance, followed by the second component (PC2) and so forth. When applying PCA to the normalized contact Hi-C matrix from individual chromosomes, important features can be identified. For most chromosomes, the PC1 value reflects two distinct interaction compartments that correspond to open and closed chromatin [10]. Changes in chromatin compartments have been associated with changes in transcription and chromatin states during cell differentiation and mouse early development [11, 12]. As expected, PC1 values partitioned the liver genome into chromatin compartments (Fig. 1a,b, Additional file 1: Figure S1A,B). We then compared the eigenvectors of the different timepoints and identified changes in the sign of regional PC1 values, indicative of compartment switching between all timepoint pairs (Fig. 1a,b, individual replicates and merged replicates, respectively, Additional file 1: Figure S1C one-way ANOVA p value < 2e − 16). These genomic regions, termed oscillatory chromatin compartments (OCCs) spanned 440.4 Mb of the mouse genome. The rest of the genome (82.7%) retained the same compartment identity during the 24-h cycle (Additional file 1: Figure S1A,B individual replicates and merged replicates, respectively, S1D). We found OCCs with compartment assignments ZT0 = A, ZT6 = A, ZT12 = B, ZT18 = A (AABA) being the most abundant type in the genome covering 194.7 Mb (Additional file 1: Figure S1E).
To relate chromatin compartments with transcription, we performed stranded total RNA-seq. We obtained ~ 500,000,000 of 150-bp reads per timepoint (4 biological replicates each) for a total of ~ 2,000,000,000 reads (Table S2). Spearman correlation analysis showed good correlation between the 4 biological replicates (Additional file 2: Figure S2A, ZT0 and 12 shown). Genes with differential expression between at least one pair of timepoints were identified (q-value < 0.01) and classified as circadian. In total, we detected 1257 circadian gene transcripts (Additional file 2: Figure S2B, Table S3). Inspection of individual gene expression profiles from our RNA-seq showed the expected oscillatory expression pattern for examples of both core-clock and output circadian genes in the liver (Additional file 2: Figure S2C,F). Gene Ontology and KEGG pathway analysis identified circadian rhythm and metabolism as significantly enriched categories in our identified circadian gene set (Additional file 2: Figure S2D,E) confirming efficient detection of circadian oscillating genes. RT-qPCR of primary and mature mRNAs confirmed oscillation of circadian gene expression (Additional file 2: Figure S2G). Analysis of the RNA content in A and B compartments at all timepoints revealed that A compartments are RNA-rich and B are RNA-poor (Fig. 1c, p < 0.0001, one-way ANOVA test). The RNA content in OCCs falling into A or B at different times of the day was also significantly different (Fig. 1d, p < 0.0001, Mann-Whitney test). In addition to RNA abundance, we analyzed time resolved enrichment of histone modifications characteristic of open chromatin H3K4me3 and H3K4me1 [13] in OCCs around the clock. Both histone marks were significantly enriched in regions at times of A compartment assignment compared to B compartment across all timepoints (Fig. 1e, AABA [ZT0 = A, ZT6 = A, ZT12 = B, ZT18 = A] and BABB [ZT0 = B, ZT6 = A, ZT12 = B, ZT18 = B], Additional file 1: Figure S1F, AABB, ABBB, BABA. All p values < 0.001, one-way ANOVA test, Tukey post hoc test). Histone deacetylase 3 (HDAC3) binding to chromatin is enriched in deacetylated “closed” chromatin and it has been measured in mouse liver at ZT22 and ZT10 [14]. We quantified HDAC3 occupancy at OCCs at ZT0 and 12 (our closest timepoints to ZT22 and 10). HDAC3 binding was higher at OCCs that fell into the B compartment at ZT12 compared to the A compartment at ZT0 (Fig. 1f, Wilcoxon p < 0.001). Examples of OCCs overlapping the circadian Arntl gene TAD and the circadian Npas2 gene TAD are shown in Fig. 1g and h respectively. For all circadian genes presented as examples throughout the paper, the RNA-seq signal tracks for the four timepoints examined can be found in Additional file 3: Figure S3. Together, these results show that both transcription and chromatin state fluctuate in accordance with compartment switching during a circadian cycle.
Topologically associated domains spatially partition temporal gene expression control but remain structurally invariant during a circadian cycle
To identify TADs, we assigned TAD insulation scores to the Hi-C contact matrices [15] and examined them across timepoints (see “Methods”). TADs displayed little variation across timepoints as has been previously observed [9]. Of the total of 4358 TADs that we identified, 2936 were preserved throughout the day and 952 at least between two timepoints and presented a size distribution between 150 kb and 1.5 Mb (Fig. 2a and Additional file 4: Figure S4A,B). To confirm insulation of TADs across timepoints, we selected a random set of 1000 TADs detected at ZT0 and plotted their median observed/expected contacts using Hi-C data from ZT0 and ZT12 (see “Methods”). We recovered higher contact frequencies within than outside TADs at both timepoints confirming preservation of domain structures across timepoints (Fig. 2b left panel). The same result held true for TADs detected at ZT6, 12 and 18 h (Additional file 4: Figure S4C).
CTCF functions as an architectural protein that establishes chromatin domains together with cohesin in mammalian chromatin [16,17,18]. We performed ChIP-seq against CTCF at ZT0 and ZT12 on the same liver samples used for Hi-C. In total, 33,262 CTCF binding sites (75.3%), out of a total of 44163 sites, were shared between ZT0 and 12 and CTCF showed similar enrichment between the two timepoints (Additional file 4: Figure S4D). When plotting the Hi-C contacts in an aggregate peak analysis centered on the CTCF binding sites, CTCF-bound regions exhibited robust contact insulation properties, independent of the timepoint examined (Fig. 2b right panel and Additional file 4: Figure S4E top panels). Additionally, we assessed preservation of CTCF insulation between mouse ES cells and liver cells by overlaying regions occupied by CTCF in mESCs [19], onto our liver Hi-C data sets. Robust insulation was observed when using either ZT0 or ZT12 Hi-C data suggesting large agreement of CTCF chromatin occupancy and insulation properties between mESCs and adult liver tissue (Additional file 4: Figure S4 E bottom panels). The conservation of CTCF binding between ESCs and adult liver tissue shows that the insulation of circadian genes by CTCF precedes the onset of their rhythmic transcription, as ESCs do not harbor functional circadian oscillators [20].
Next, we assigned genes to TADs. We found that on average TADs harbored 7.2 genes. TADs containing one or more circadian genes (cTADs) were larger than non cTADs and on average contained more genes (14.2) (Additional file 4: Figure S4F, all p values < 2.2e−16, Wilcoxon rank sum test). We observed that 70% of cTADs contained only one circadian gene (Fig. 2c). The remaining cTADs contained more than one circadian gene and remarkably, the circadian genes sharing TADs exhibited peak transcriptional expression at shared times during the day (40% for TADs with 2 circadian genes compared to the expected 28%, 18% of TADs with 3 circadian genes compared to the expected 8.8%, p < 0.0001, chi square test) (Fig. 2d and e, only data from cTADs with 2 circadian genes shown). We next analyzed whether TADs encoding circadian genes switched chromatin compartment over the circadian cycle. Indeed, most cTADs (73%, 623) overlapped with OCCs more than expected when compared to a random set of the same number of non-cTADs (Wilcoxon test, p < 0.0001) (Fig. 2f). Examples of cTADs containing one circadian gene (cTADs with Mical2, MicalcI, Arntl and Copb1, respectively), or more than one circadian gene (cTAD with Wee1 and Swap70 genes, both with acrophase at ZT12) are shown in Fig. 2g (left and right panels respectively). Examples of cTADs overlapping OCCs are shown for Arntl (Fig. 1g) and Npas2 (Fig. 1h) as previously described. These results show that cTADs often set transcriptional phase coherence between multiple circadian genes in the same TAD. While most TADs maintain their structural boundaries over time, they overlap with compartments that switch between active and silent states throughout a circadian cycle.
Gene promoter-promoter interactions in the liver and its circadian component
To gain insights into chromatin contacts at the level of individual circadian genes, we measured genome-wide promoter-promoter and promoter-regulatory element contacts at four timepoints during a circadian cycle using Promoter-CHi-C (Fig. 3a) [21,22,23]. Promoter-containing ligation products from Hi-C libraries were efficiently captured (~ 70%) using 39,021 RNA probes, which hybridize to 22,225 genomic restriction fragments covering all annotated gene promoters in the mouse genome. We produced ~ 1,560,000,000 total valid read pairs from the three biological replicates for the four timepoints, thus obtaining ~ 390,000,000 valid ligation products per timepoint (Table S4). Capture of gene promoters increased the number of valid ligation products per promoter to ~ 10–15 fold compared to Hi-C. An example of this enrichment is shown for the Arntl gene locus comparing a virtual 4C from the Arntl gene promoter performed on the Hi-C data versus the Promoter-CHi-C chromatin contact data sequenced at equivalent depth (Additional file 5: Figure S5A and S5B comparison of raw paired reads per restriction fragment in CHi-C vs Hi-C from the virtual 4C from the Arntl gene promoter p < 0.0001, Wilcoxon rank test). We estimated the statistical significance of interactions between pairs of promoters and between promoter and other, potentially gene regulatory genomic regions, using the CHiCAGO pipeline for each timepoint (“Methods”) [24]. We obtained ~ 150,000 statistically significant interactions per timepoint resulting in ~ 600,000 statistically significant promoter interactions in total (Table S4).
First, we focused on gene promoter-promoter contacts. We built promoter-promoter networks at all timepoints and found large disconnected networks as expected for promoters scattered across the different chromosomes (Additional file 5: Figure S5C, ZT0 shown). We then evaluated the larger clusters of the promoter-promoter network containing hundreds of connected gene promoters (Additional file 5: Figure S5D, the 4 larger clusters shown including inter- and intrachromosomal promoter-promoter interactions) and performed gene enrichment analysis on these using gProfiler [25] (Additional file 5: Figure S5E). We found the Major Histocompatibility Complex forming one cluster with genes encoded on chromosomes 7 and 17. MHC genes are lowly expressed in the healthy liver and constitute dense, highly connected chromatin [26] (Additional file 5: Figure S5D and S5E, cluster 1). Similarly, a transcriptionally repressed cluster is formed between olfactory receptor genes on chromosome 7 (Additional file 5: Figure S5D and S5E, cluster 3). In addition to repressed genes clustering in spatial proximity, we identified actively transcribed genes involved in glutathione synthesis and amino acid metabolism essential for liver detoxification function arranged in a promoter network (Additional file 5: Figure S5D and S5E, cluster 2). Another cluster encompassed constitutive histone genes located in chromosomes 3, 13, 11, 18, 12, and 7, among others (Additional file 5: Figure S5D and S5E, cluster 4). Thus, prominent constitutive and liver-specific promoter-promoter networks both transcriptionally active and inactive were identified in the adult liver.
Next we examined the circadian component in promoter-promoter networks. Circadian promoters are dispersed across chromosomes (Additional file 6: Figure S6A, blue dots represent circadian gene promoters). Nevertheless, circadian promoters establish significantly more contacts among them compared to non-circadian gene promoters (Additional file 6: Figure S6B. Above, comparison of the number of edges formed between circadian promoters compared to a random set of non-circadian gene promoters. Below, Z-scores compared to the random sampling of non-circadian promoters). Next we looked at the reads supporting significant interactions between circadian gene promoters in contrasts with interactions between non circadian gene promoters. The result shows that circadian promoter-promoter contacts are more robust compared to non-circadian promoter-promoter contacts (Additional file 6: Figure S6C, p values < 0.001, Mann-Whitney test). Finally, we compared the time of maximal mRNA abundance of circadian genes whose promoters were contacting each other. To do this, we separated the circadian gene promoters into diurnal and nocturnal based on their transcriptional acrophase and then analyzed the time of maximal RNA expression of their contacted circadian gene promoters. We found that diurnal promoters preferentially contacted other diurnal genes. Likewise, nocturnal circadian genes preferentially contacted other genes with nocturnal expression maxima (Additional file 6: Figure S6D, our intronic circadian gene set, p values < 0.0001 Wilcoxon signed rank test). Similar results were obtained for circadian genes detected by GROseq [27] reflecting primary transcription (Additional file 6: Figure S6E, p values < 0.0001 Wilcoxon signed rank test). This is exemplified by the interaction between the Tef and the Aco circadian gene promoters with shared pre-messenger expression peak at ZT12 (Additional file 6: Figure S6F) as well as the interaction between Rorc and Cgn circadian genes with shared acrophase at ZT18 (Additional file 7: Figure S7A). Thus, in summary the results show that circadian promoter-promoter contacts are more robust than non-circadian promoter-promoter contacts. Furthermore, pairs of interacting circadian gene promoters tend to share their transcriptional acrophase.
Regulatory elements form dynamic and stable chromatin contacts with circadian gene promoters
In order to characterize the full range of promoter interactions, we next examined contacts between promoters and non-promoter genomic regions. Genomic regions that interact with gene promoters including circadian gene promoters in mouse liver, were enriched for histone modifications characteristic of regulatory elements such as H3K27ac, H3K4me1, H3K4me3, and H3K27me3, as well as DNase I hypersensitive sites and the structural protein CTCF [13, 28, 29], compared to distance matched non-interacting regions (Fig. 3b left panel, all p values < 8e−166, t test, CTCF, p value <8e−18; t test). A set of enhancers from which enhancer RNAs (eRNAs) are produced have been described in the liver [27]. We found that these enhancers were enriched at the contacted regions (Fig. 3b, left panel p value < 9e−165; t test). Overall, the chromatin features at promoter-contacting regions suggest that our P-CHi-C captured potential structural and/or regulatory elements.
Besides enhancers and open chromatin marks, we found that promoter-interacting regions were enriched for the occupancy of core clock TFs [13] (Fig. 3c, left panel, pval<8e−100, t test). When measuring the same enrichments at only circadian gene promoters interacting regions, both for the full set of our circadian genes and a subset corresponding to genes oscillating at the intronic level (see “Methods”), we found a significant preference for circadian gene promoters to contact with enhancers both detected by eRNA transcription or histone modifications, as well as regions occupied by core clock TF (Fig. 3b, c, right panels; all p values <8e−216, t test). We next assessed the rhythmicity of contacts between regulatory elements and circadian promoters during a circadian cycle. We identified dynamic genomic contacts involving circadian promoters using two distance regimes as described (see “Methods”). In total, we found 13,782 stable and 6047 dynamic contacts for 1195 circadian promoters and found enhancers preferentially engage in dynamic contacts (Fig. 3d). We next analyzed the number of interactions made by circadian promoters at different timepoints. We found that the number of contacts of circadian gene promoters peak during or around the phase of maximal mRNA abundance (chi square, p < 0.001) (Fig. 3e) (see different examples of circadian gene promoter expression profiles Additional file 3: Figure S3 and virtual 4C interaction landscapes across the paper Fig. 4e,h,i, Additional file 6: Figure S6F, Additional file 7: Figure S7A-F). This suggests that more regulatory elements are contacted at the time of peak transcription contributing to circadian gene regulation.
Next we identified transcription factor binding motifs at genomic regions involved in constant or rhythmic interactions with circadian promoters in an unbiased manner using the MEME suite (“Methods”) [30]. We found TF binding motifs both in regions forming constant and dynamic contacts with circadian promoters. Other DNA binding motifs were found enriched in regions engaged in either dynamic or constant contacts with circadian gene promoters (Fig. 3f, Table S5). The binding motif of the nuclear receptor Nr5a2/LRH-1 was highly enriched at interacting regions of rhythmic gene promoters. Nr5a2 is a key metabolic sensor that modulates bile acid synthesis and cholesterol homeostasis by regulating the expression of Cyp7a1 and Cyp8a1 circadian genes among others, and controls triglyceride synthesis and lipid composition and metabolism [31,32,33,34]. Interestingly, a pro-inflammatory function of Nr5a2 deficiency is linked to non-alcoholic fatty liver disease [35]. In addition, loss of Nr5a2 in the adult liver leads to disruption of hepatic lipid homeostasis and composition [36]. To confirm Nr5a2 occupancy at circadian gene promoters and interacting regions, we performed Chromatin IP (ChIP-qPCR) against Nr5a2 at the four timepoints during the day for three loci in which the DNA binding motif was found enriched. We found Nr5a2 dynamically occupying the Arntl circadian gene promoter with peak occupancy at ZT6 when Arntl gene expression starts to decrease (Fig. 3g, Additional file 3: Figure S3). A similar binding pattern over time was observed on the Arntl interacting enhancer (Fig. 3g). We then analyzed Nr5a2 binding to a chromatin region stably contacting the circadian gene promoter of Ppp1r3c. Interestingly, we found that Nr5a2 binding was dynamic with maximum occupancy of Nr5a2 at this region is also dynamic and peaks at ZT6 when Ppp1r3c gene expression starts to decrease (Fig. 3g, Additional file 3: Figure S3). In addition, we assessed Nr5a2 occupancy in a chromatin region dynamically interacting with the circadian gene promoter of Gsk3a but did not find the receptor bound to this region (Fig. 3g). Thus, our results suggest that Nr5a2 is dynamically occupying regulatory elements of circadian genes irrespective of whether the contacts to circadian gene promoters are constant or rhythmic over time.
Other binding sites for nuclear receptors enriched in regions contacting circadian promoters included VDR (vitamin D receptor) and ESR2 (estrogen receptor 2).
The Tcfap2c/AP-2 gamma binding motif was found to be highly enriched at dynamic interacting regions of circadian gene promoters. In the liver, Tcfap2c has been associated with repression of fatty acid synthesis pathways [37] and was identified as a key TF involved in lipid droplets biogenesis [38]. Fos:Jun/AP1 binding motifs were found in genomic regions forming stable contacts with circadian promoters. AP1 factors are a well-characterized immediate-early transcription factors induced in response to signals in the serum and that regulate the expression of circadian genes in liver and cultured cells [39] as well as the suprachiasmatic nucleus [40,41,42]. Recently, AP-1 was shown to bring together key genes and enhancers through stable and dynamic loops during macrophage development bringing together key macrophage genes and enhancers [43].
In summary, a set of DNA binding motifs for distinct liver nuclear receptors and immediate early genes are enriched in regions contacting circadian promoters and could function in the wiring of the circadian promoter 3D interactome in the liver.
Circadian gene promoters interact with diurnal and nocturnal enhancers in the nuclear space
A set of enhancers are transcribed in a circadian fashion in the mouse liver [27]. We found that these enhancers preferentially contact circadian gene promoters, suggesting that rhythmically transcribed genomic regions, protein-coding and non-coding, interact with each other in the nuclear space (Fig. 4a). This is also true for our subset of circadian genes oscillating at the intronic level (see “Methods”) and for circadian genes detected through GRO-seq [27] reflecting primary transcriptional oscillation (Fig. 4b,c).
We then compared the transcriptional phases between promoters of circadian genes and their corresponding contacted enhancer elements with rhythmic transcription. To do so, we separated the circadian gene promoters into diurnal and nocturnal depending on their transcriptional acrophase and then analyzed the time of maximal RNA expression of their contacted enhancer elements. We found a significant contact preference between diurnal promoters and diurnal enhancers as well as nocturnal promoters and nocturnal enhancers (Fig. 4d, left). These preferences were more pronounced when analyzing our circadian intronic gene set reflecting primary transcription (see “Methods”) (Fig. 4d, center) or detected by GRO-seq (Fig. 4d, right, all p values < 0.0001 Wilcoxon signed rank test). For example, the Rnf125 circadian gene promoter with peak transcription at ZT0 contacts 12 rhythmically expressed enhancers with acrophases between 19 and 1 h during the circadian cycle. Furthermore, as it can be observed, the number of contacts with the enhancers, increase during the acrophase (Fig. 4e).
The core clock gene promoter contacts
Finally, we focused on the genomic interactions formed by circadian core clock gene promoters including Npas2, Clock, Arntl, Cry1, Cry2, Per1, Per2, Rorc, Nr1d1, and Nr1d2 as defined by [44]. Notably, all core clock genes displayed fewer overall contacts compared to a random set of the same number of other circadian genes in the liver (12 vs 19.6 mean number of contacts for core-clock vs other circadian genes, Fig. 4f, p < 0.0001, t test). However, the contacts formed by the core clock gene promoters were more dynamic than a random set of the same number of contacts for other circadian genes in the liver (42.3% vs 26.8% mean proportion of dynamic contacts for core clock vs other circadian genes, Fig. 4g, p < 0.0001, t test). For instance, the Arntl circadian gene promoter engages in contacts with two enhancer elements at ZT18, the time when Arntl expression increases, in three contacts at ZT0, at time of maximal transcriptional output, and does not engage in contacts at either ZT6 or ZT12, when Arntl1 transcription decreases (Fig. 4h, see Additional file 3: Figure S3 for the expression profile). The Nr1d1 gene promoter engages in more contacts at the gene’s maximal time of expression, around ZT6 (Fig. 4i, see Additional file 3: Figure S3 for the expression profile). In contrast to core clock contact profiles (additional examples are shown for Rorc, Nr1d2, Npas2, and Per2 (Additional file 7: Figure S7A-D, see Additional file 3: Figure S3 for expression profiles), promoters of circadian output genes engage in numerous contacts that are constant during the day as exemplified by Dhr3 and Ppp1r3c gene promoters (Additional file 7: Figure S7E and F, see Additional file 3: Figure S3 for expression profiles). In conclusion, core-clock gene promoters engage in more dynamic genomic contacts compared to other circadian genes in the liver.