Fig. 8 | Genome Biology

Fig. 8

From: CHROMATIX: computing the functional landscape of many-body chromatin interactions in transcriptionally active loci from deconvolved single cells

Predictive model for principal loop enrichment. a Publicly available biological datasets (Additional file 3: Table S2), primarily from the ENCODE reference epigenome for GM12878 (ENCSR447YYN) [53, 54], were used as predictive inputs to a random forest [55, 56] machine learning classifier. Illustrative signals shown are from the UCSC genome browser [76, 77] for locus chr12:11,690,000–12,210,000. b Cartoon illustration of enriched versus not enriched regions. Genomic regions, each corresponding to a non-overlapping 5-kb bin, were sorted by principal loop participation; a subset of those above the elbow inflection point were labeled as enriched, and those below the inflection point were labeled as not enriched (see the “Methods” section). c Receiver operating characteristic (ROC) curve [78] showing the performance of our random forest classifier in discriminating principal loop enriched from not enriched genomic regions. The trained random forest model showed a mean area under the curve (AUC) of 0.805 on the test set and a mean out-of-bag (OOB) error, an unbiased estimate of generalization error [55], of 21.5% over 5-fold cross-validation.
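
To make the AUC figure in panel c concrete, the following is a minimal sketch of how an area under the ROC curve can be computed from classifier scores and then averaged over cross-validation folds. It is not the authors' code: the function names `roc_auc` and `mean_auc_over_folds`, the 0/1 label encoding (1 for enriched, 0 for not enriched), and the toy scores are all assumptions made for illustration. It uses the rank-sum identity, under which the AUC equals the probability that a randomly chosen enriched bin scores higher than a randomly chosen not enriched bin.

```python
from typing import List, Sequence, Tuple


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via the rank-sum (Mann-Whitney U) identity.

    labels: 1 for "enriched", 0 for "not enriched" (hypothetical encoding).
    scores: classifier scores, higher meaning more likely enriched.
    """
    if len(labels) != len(scores):
        raise ValueError("labels and scores must have the same length")
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative example")

    # Sort indices by score, then assign 1-based ranks, averaging over ties.
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1

    # U statistic for the positive class, normalized by the number of
    # positive/negative pairs, gives the AUC.
    pos_rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    u = pos_rank_sum - n_pos * (n_pos + 1) / 2.0
    return u / (n_pos * n_neg)


def mean_auc_over_folds(
    folds: Sequence[Tuple[Sequence[int], Sequence[float]]]
) -> float:
    """Average AUC over (labels, scores) pairs, one pair per held-out fold."""
    aucs: List[float] = [roc_auc(y, s) for y, s in folds]
    return sum(aucs) / len(aucs)


if __name__ == "__main__":
    # Toy data only; these are not values from the paper.
    fold_a = ([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
    fold_b = ([0, 1], [0.2, 0.9])
    print(roc_auc(*fold_a))
    print(mean_auc_over_folds([fold_a, fold_b]))
```

An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation, so the reported mean of 0.805 lies well above chance. The OOB error mentioned in the caption is a separate quantity that a random forest estimates from the training samples left out of each tree's bootstrap draw; it is not computed by this sketch.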
