Enhancer property | Dataset employed | Tests |
---|---|---|
Length | • DHS sites from the ESC-H1 and HepG2 cell lines, obtained from the Roadmap Epigenomics Project • bigWig files of DHS, H3K27ac, and 5 different TFs from the ENCODE project | • ML models with different-sized DHS inputs • Metagene plots of DHS and ChIP data • Motif enrichment in DHS data • Examination of multimeric motif lengths and information content (IC) |
Uniqueness/diversity | • N/A | • Review and exploration of enhancer uniqueness from the billboard and enhanceosome perspectives |
Frequency | • hg19 human genome | • Score each base in a 500 kbp sample using the 3 trained ML models to estimate the discriminant threshold for a 1% regulatory rate |
Turnover | • DHS sites from the ESC-H1 cell line, obtained from the Roadmap Epigenomics Project • Enhancer conservation estimates from various studies | • Use ML models to simulate dropout by mutating sequences at neutral mutation rates between species • Capture hits of multimeric motifs in the genome, mutate the hits at neutral rates, and measure dropout after re-scanning |
Dominance of master regulators | • Motif collection • Trained ML models | • Train ML models using a subset of TFs • Explore learned patterns in ls-GKM models using gkmExplain • Examine LR feature weights using multiple feature selection methods • Poisson estimates of the number of TFs required to specify regulatory sites with and without master regulators • Estimation of multimeric motif hits with and without master regulators |
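
The "Frequency" test above sets a score cutoff so that about 1% of scored bases are called regulatory. A minimal sketch of that idea follows, assuming one score per base is already available. The function names `threshold_for_rate` and `fraction_called` are hypothetical, and the scores are synthetic random numbers standing in for ML model output; this is an illustration, not the procedure used in the study.

```python
import random


def threshold_for_rate(scores, rate=0.01):
    """Return the cutoff such that roughly `rate` of scores are at or above it.

    Hypothetical helper: a plain rank-based quantile, not the study's method.
    """
    if not scores:
        raise ValueError("scores must not be empty")
    if not 0 < rate < 1:
        raise ValueError("rate must be strictly between 0 and 1")
    ranked = sorted(scores, reverse=True)
    # Number of top-scoring positions to keep (at least one).
    n_top = max(1, round(len(ranked) * rate))
    return ranked[n_top - 1]


def fraction_called(scores, threshold):
    """Fraction of positions whose score meets or exceeds the threshold."""
    return sum(s >= threshold for s in scores) / len(scores)


# Synthetic stand-in for per-base model scores over a 500 kbp sample
# (assumption: real scores would come from the trained models).
random.seed(0)
sample_scores = [random.gauss(0.0, 1.0) for _ in range(500_000)]

cutoff = threshold_for_rate(sample_scores, rate=0.01)
print(f"cutoff = {cutoff:.3f}, "
      f"fraction called = {fraction_called(sample_scores, cutoff):.4f}")
```

If many positions share the same score, the fraction called at the cutoff can exceed the target rate, so tied scores may need separate handling.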