Table 1 Enhancer characteristics examined in this study

From: Global properties of regulatory sequences are predicted by transcription factor recognition mechanisms

Enhancer property

Dataset employed

Tests

Length

Dataset employed:

• DHS sites from the ESC-H1 and HepG2 cell lines, obtained from the Roadmap Epigenomics Project

• bigWig files of DHS, H3K27ac, and 5 different TFs from the ENCODE project

Tests:

• ML models with different-sized DHS inputs

• Metagene plots of DHS and ChIP data

• Motif enrichment in DHS data

• Examination of multimeric motif lengths and IC

Uniqueness/diversity

Dataset employed:

• N/A

Tests:

• Review and exploration of enhancer uniqueness from the billboard and enhanceosome perspectives

Frequency

Dataset employed:

• hg19 human genome assembly

Tests:

• Score each base in a 500 kbp sample using the 3 trained ML models to estimate the discriminant threshold corresponding to a 1% regulatory rate

Turnover

Dataset employed:

• DHS sites from the ESC-H1 cell line, obtained from the Roadmap Epigenomics Project

• Enhancer conservation estimates from various studies

Tests:

• Use ML models to simulate dropout by mutating sequences at neutral mutation rates between species

• Capture hits of multimeric motifs in the genome, mutate the hits at neutral rates, and measure dropout after re-scanning

Dominance of master regulators

Dataset employed:

• Motif collection

• Trained ML models

Tests:

• Train ML models using a subset of TFs

• Explore learned patterns in ls-GKM models using gkmExplain

• Examine LR feature weights using multiple feature selection methods

• Poisson estimates of the number of TFs required to specify regulatory sites with and without master regulators

• Estimation of multimeric motif hits with and without master regulators
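
The Length row lists motif IC among its tests. As a minimal sketch, assuming IC stands for information content measured in bits against a uniform background and without any small-sample correction, the quantity can be computed as below. The function names and the column layout (one list of A, C, G, T probabilities per position) are illustrative choices, not taken from the study.

```python
import math

def column_ic(probs):
    """Information content (bits) of one motif position.

    Assumes a uniform background over 4 bases, so the maximum is 2 bits.
    Zero probabilities contribute nothing (the limit of p*log2(p) is 0).
    """
    return 2.0 + sum(p * math.log2(p) for p in probs if p > 0)

def motif_ic(pfm):
    """Total information content of a motif given as a list of columns."""
    return sum(column_ic(col) for col in pfm)

# Hypothetical two-position motif: a fixed base followed by an
# uninformative position.
example = [[1.0, 0.0, 0.0, 0.0], [0.25, 0.25, 0.25, 0.25]]
print(motif_ic(example))
```

Under these assumptions a longer multimeric motif accumulates IC position by position, which is why motif length and IC are naturally examined together.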
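
The Frequency test derives a discriminant threshold from per-base model scores so that 1% of positions are called regulatory. A minimal sketch of that thresholding step follows, assuming the scores are already available as a plain list of numbers. The function name and the rounding rule are assumptions made for illustration, and the model scoring itself is not shown.

```python
def threshold_for_rate(scores, rate=0.01):
    """Return the score cutoff such that roughly `rate` of positions
    score at or above it (ties can push the fraction slightly higher)."""
    if not scores:
        raise ValueError("scores must not be empty")
    ranked = sorted(scores, reverse=True)
    # Number of positions allowed above the cutoff, at least one.
    k = max(1, round(len(ranked) * rate))
    return ranked[k - 1]

# Hypothetical scores standing in for the output of a trained model.
scores = list(range(1000))
cutoff = threshold_for_rate(scores, rate=0.01)
called = sum(1 for s in scores if s >= cutoff)
print(cutoff, called)
```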
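
The second Turnover test (capture motif hits, mutate at a neutral rate, re-scan, measure dropout) can be sketched as follows. This is a simplified illustration only: it assumes exact matching to a consensus string in place of scoring with a real motif model, uniform substitutions only, and a hypothetical mutation rate. None of the names or values come from the study.

```python
import random

BASES = "ACGT"

def find_hits(seq, motif):
    """Start positions of exact matches of `motif` in `seq`."""
    k = len(motif)
    return [i for i in range(len(seq) - k + 1) if seq[i:i + k] == motif]

def mutate(seq, rate, rng):
    """Substitute each base independently with probability `rate`,
    always replacing it with a different base."""
    out = []
    for base in seq:
        if rng.random() < rate:
            out.append(rng.choice([b for b in BASES if b != base]))
        else:
            out.append(base)
    return "".join(out)

def dropout_fraction(seq, motif, rate, seed=0):
    """Fraction of original hits that no longer match after mutation."""
    rng = random.Random(seed)
    original = find_hits(seq, motif)
    if not original:
        return 0.0
    mutated = mutate(seq, rate, rng)
    k = len(motif)
    lost = sum(1 for i in original if mutated[i:i + k] != motif)
    return lost / len(original)

# Hypothetical sequence, motif, and per-base substitution rate.
seq = "TTGACGTCAAAGACGTCATT"
print(dropout_fraction(seq, "GACGTCA", rate=0.1, seed=1))
```

Because each base mutates independently in this sketch, longer motifs are expected to drop out more often than shorter ones at the same rate.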
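
The table does not spell out how the Poisson estimates under Dominance of master regulators were set up. One way such an estimate could be framed, offered purely as an assumption, is to treat chance motif hits in a window as Poisson-distributed with mean `lam` and ask how many hits must be required before the chance rate falls to a target such as 1%. The function names, `lam`, and the target rate below are all hypothetical.

```python
import math

def poisson_tail(k, lam):
    """P(X >= k) for X ~ Poisson(lam)."""
    cdf = sum(math.exp(-lam) * lam**i / math.factorial(i) for i in range(k))
    return 1.0 - cdf

def min_hits_for_rate(lam, target_rate=0.01, max_k=1000):
    """Smallest k such that P(X >= k) <= target_rate."""
    for k in range(max_k + 1):
        if poisson_tail(k, lam) <= target_rate:
            return k
    raise ValueError("no k up to max_k reaches the target rate")

# Hypothetical mean of one chance hit per window.
print(min_hits_for_rate(lam=1.0, target_rate=0.01))
```

Comparing the result for two values of `lam`, one with and one without a given set of motifs, is one simple way to contrast scenarios under this framing.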