Modeling Spatio-temporal Dynamics of Chromatin Marks

To model spatial changes of chromatin mark peaks over time we developed and applied ChromTime, a computational method that predicts regions for which peaks either expand or contract significantly or hold steady between time points. Predicted expanding and contracting peaks can mark regulatory regions associated with transcription factor binding and gene expression changes. Spatial dynamics of peaks provided information about gene expression changes beyond localized signal density changes. ChromTime detected asymmetric expansions and contractions, which for some marks associated with the direction of transcription. ChromTime facilitates the analysis of time course chromatin data in a range of biological systems.


73
Computational methods that do not explicitly reason about the spatial changes of chromatin marks have 74 significant limitations for studying the dynamics of these properties because they are unable to detect 75 territorial changes that might be associated with redistribution of signal or identify asymmetric directional 76 peak boundary movements.

78
In this work, we present ChromTime, a novel computational method for detection of expanding, 79 contracting and steady peaks, which can detect patterns of changes in the genomic territory occupied by 80 chromatin mark peaks from time course sequencing data (Fig 1A). We applied ChromTime to a diverse  Fig 1B). ChromTime takes as input a set of genomic coordinates 98 of aligned sequencing reads from foreground and, optionally, control experiments over the time course.

99
The foreground experiments are data from a chromatin sequencing assay such as ChIP-seq, ATAC-seq 100 or DNase-seq performed at a series of time points. The method consists of two stagesblock finding and 101 dynamics prediction. During the block finding stage, ChromTime determines continuous genomic regions 102 (blocks) that may contain peaks of foreground signal enrichment during the time course (Fig S1A-B). To 103 achieve this, ChromTime partitions the genome into fixed length bins and counts the number of 104 foreground and control reads that map to each bin at each time point. Nearby bins that show significant 105 enrichment are joined into continuous intervals, which subsequently are grouped into blocks if they 106 overlap across time points. As a result, large portions of the genome that are likely to contain background 107 noise at all time points are filtered out, so that peak boundary dynamics are determined within a subset of 108 the genome potentially enriched for the chromatin mark.
During the dynamics prediction stage, for each block, ChromTime determines the most likely positions of 111 the peak boundaries at each time point and whether the peak expands, contracts or holds steady at each 112 boundary between consecutive time points. The method uses a probabilistic mixture model to partition the 113 signal within each block at each time point into background and peak components (Fig 1C, S1C systems. The estimated parameters are used to make a prediction for each block for the most likely 132 positions of the peak boundaries and the corresponding boundary dynamics that had generated the 133 signal within the block. The final output contains predicted peak boundaries annotated and colored by 134 their assigned dynamics, which can be used for downstream analysis with existing tools and visualized in 135 genome browsers (Fig 2, S2, https://github.com/ernstlab/ChromTime).

140
To investigate the reproducibility of ChromTime predictions, we applied ChromTime separately to two 141 biological replicate datasets for the H3K4me2 and the H3K(9/14)ac marks in T cell development in 142 mouse [17] and confirmed on average strong enrichment for the same ChromTime annotations co-143 localizing across replicates (Fig S3). We then applied the method to data from pooled replicates for the 144 H3K4me2 mark from the mouse T cell development study [17] associated with increase and decrease, respectively, in gene expression (Fig 3, S4). Additionally for all   (Fig 4, S5A). These results also held when using peaks from two different peak callers,

185
We associated the signal density changes with gene expression changes at the nearest TSS within 50kb 186 of each block and computed the average gene expression change as a function of the signal density 187 change within blocks (Fig 5). We found that locations with the same signal density change can associate with a steady call on both sides of a peak, when controlling for the signal density change. We observed a 195 similar relationship for contractions and decrease of gene expression. These results were replicated also 196 after substituting ChIP-seq signal density changes with differential peak scores from two differential peak 197 calling methods, SICER[40] and MACS2[38] (Fig S6A-B). Therefore, ChromTime predictions can provide 198 additional information about gene expression changes beyond what is contained in the corresponding 199 signal density changes as measured directly or by utilizing differential peak calling procedures.  [18], where multiple chromatin marks were 209 mapped (Fig 6, S7). In all three datasets, we observed that predicted expansions co-localized 210 preferentially for H3K4me2, H3K4me3 and H3K27ac and to a lesser extent for H3K4me1 and similarly for 211 predicted contractions and steady peaks. In contrast, different predicted spatial dynamics for H3K36me3 212 and H3K27me3 tended to occupy distinct locations. In addition, in mouse reprogramming[27], ChromTime 213 predicted dynamics of ATAC-seq, H3K4me2, H3K4me3, H3K27ac, H3K9ac and to a lesser extent of 214 H3K4me1 and H3K79me2 peaks co-localized preferentially (Fig S7). These results suggest that spatial 215 dynamics of chromatin marks are coordinated at least at a subset of genomic locations.

219
ChromTime can predict unidirectional expansions and contractions, which enables analysis of 220 directionality of spatial dynamics of peaks, an aspect of chromatin regulation that has not been previously 9 systematically explored. To investigate this, we applied ChromTime on data from 13 previously published 222 studies from a variety of developmental, differentiation and reprogramming processes ( Table 1) for nine 223 different HMs including narrow and broad marks, for Pol2, ATAC-seq and DNase-seq. We observed that 224 unidirectional expansions and contractions are predicted in most cases on average to be the majority of 225 all expansions and contractions, respectively, at a given pair of consecutive time points (Fig S8). One 226 hypothesis for the prevalence of asymmetric boundary movements for the promoter associated chromatin 227 marks is that the direction of boundary movements is associated with the asymmetry of transcription 228 initiation in promoter regions. To test this hypothesis, for each dataset we compared the prevalence of 229 each class of unidirectional dynamics as a function of its distance to the nearest annotated TSS and the 230 orientation of the corresponding gene (Fig 7). Consistent with our hypothesis, for H3K4me3, H3K4me2, 231 H3K(9/14)ac, H3K79me2, and for Pol2, we found that unidirectional expansions that expand into the gene The aim of this stage is to determine approximately the genomic coordinates of regions with potential 336 peaks of signal enrichment at any time point in the time course (Fig S1A-B). The signal within these 337 blocks will be used as input to build the mixture model in the next stage of ChromTime. ChromTime