Figure 3 | Genome Biology


From: A statistical framework for modeling gene expression using chromatin features and application to modENCODE datasets


Hierarchical clustering using either chromatin feature profiles (a-c) or bin profiles (d-f) discriminates highly and lowly expressed genes. (a) Hierarchical clustering of 16 chromatin features in bin 1 (0 to 100 nucleotides upstream of a TSS). The resulting tree is split at the top branch, which divides genes into two clusters, labeled cluster H and cluster L. (b) Distributions of expression levels of genes in cluster H (red) and cluster L (green). Expression levels are significantly different between the two clusters according to a t-test (P = 3E-202). Expression levels were measured by RNA-seq (see Materials and methods). (c) T-scores for the differential expression of the top two gene clusters based on hierarchical clustering of chromatin features in each of the 160 bins. For each bin, hierarchical clustering was performed to separate genes into two clusters. Expression levels were then compared between the two clusters, and a t-score was calculated to measure how well the bin discriminates between genes with high and low expression levels. (d) Hierarchical clustering of the genes based on the signal profiles of H3K79me2 across the 160 bins. The resulting tree is also split at the top branch, leading to two gene clusters. (e) Distributions of expression levels of genes in the two clusters in (d). The expression levels are significantly different according to a t-test (P = 4E-93). (f) T-scores for the differential expression of the two gene clusters based on hierarchical clustering of bin profiles for each individual chromatin feature. Cyan and blue indicate a significant positive and negative correlation, respectively, between a chromatin feature and gene expression levels. Black indicates that a chromatin feature could not significantly discriminate between genes with high and low expression levels. To visualize the clustering, 2,000 randomly selected genes are shown.
The data for gene expression levels and chromatin features are from the EEMB stage.
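
The procedure behind panels (c) and (f) can be sketched in a few lines of Python: cluster the genes, split the tree at the top branch, and compare expression between the two resulting clusters with a t-test. This is only an illustrative sketch, not the authors' implementation. The function name `top_split_t_score`, the average-linkage Euclidean clustering, the use of Welch's t-test, the rule that orients the sign of the t-score by mean signal, and the synthetic data are all assumptions made for this example; the caption does not specify these details.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.stats import ttest_ind


def top_split_t_score(features, expression, method="average"):
    """Cluster genes, split the tree at the top branch, and compare expression.

    features:   (n_genes, n_columns) array of chromatin signals, e.g. 16
                features in one bin, or one feature across 160 bins.
    expression: (n_genes,) array of expression levels.
    Returns a boolean mask marking the higher-signal cluster, the t-score,
    and the P value.
    """
    features = np.asarray(features, dtype=float)
    expression = np.asarray(expression, dtype=float)

    # Assumption: average linkage on Euclidean distances.
    tree = linkage(features, method=method, metric="euclidean")

    # Cutting the tree into two flat clusters corresponds to the top split.
    labels = fcluster(tree, t=2, criterion="maxclust")
    if np.unique(labels).size != 2:
        raise ValueError("expected two clusters at the top split")

    # Assumption: call the cluster with the higher mean signal "high", so a
    # positive t-score means more signal goes with higher expression.
    in_high = labels == 1
    if features[in_high].mean() < features[~in_high].mean():
        in_high = ~in_high

    # Assumption: Welch's t-test (unequal variances).
    result = ttest_ind(expression[in_high], expression[~in_high], equal_var=False)
    return in_high, result.statistic, result.pvalue


# Synthetic stand-in data: 200 genes, 16 features, two well-separated groups.
rng = np.random.default_rng(0)
features = np.vstack([
    rng.normal(3.0, 0.5, size=(100, 16)),  # high-signal genes
    rng.normal(0.0, 0.5, size=(100, 16)),  # low-signal genes
])
expression = np.concatenate([
    rng.normal(5.0, 1.0, size=100),  # expression of high-signal genes
    rng.normal(1.0, 1.0, size=100),  # expression of low-signal genes
])

in_high, t_score, p_value = top_split_t_score(features, expression)
```

Under these assumptions, calling the function once per bin on a genes-by-features matrix would give a profile like panel (c), and calling it once per chromatin feature on a genes-by-bins matrix would give one like panel (f).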
