Fig. 1 | Genome Biology

Fig. 1

From: FOCS: a novel method for analyzing enhancer and gene activity patterns infers an extensive enhancer–promoter map

FOCS statistical procedure for inference of E–P links. In a dataset with samples from N different cell types, FOCS starts by performing N cycles of leave-cell-type-out cross-validation (LCTO CV). In cycle j, the samples from cell type C_j are left out as a test set, and a regression model is trained on the remaining samples to estimate the activity level of the promoter P (the dependent variable) from the levels of its k closest enhancers (the independent variables). The model is then used to predict promoter activity in the test set samples. After the N cycles, FOCS tests the agreement between the predicted (P_model) and observed (P_obs) promoter activities using two non-parametric tests. In the binary test, samples are divided into positive (P_obs ≥ 1 RPKM) and negative (P_obs < 1 RPKM) sets, and the ability of the inferred models to separate the two sets is examined using the Wilcoxon rank-sum test. In the activity level test, the consistency between predicted and observed activities in the positive set of samples is tested using Spearman correlation. P values are corrected using the Benjamini–Yekutieli (BY) FDR procedure. Promoters that pass the validation tests (FDR ≤ 0.1) are considered validated, and full regression models, this time based on all samples, are calculated for them. In the last step, FOCS shrinks each promoter model using elastic net to select its most important enhancers.
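The cross-validation and multiple-testing steps described in the caption can be sketched in a few lines of Python. This is a minimal illustration, not the FOCS implementation: the function names `lcto_predictions` and `by_adjust` are hypothetical, ordinary least squares is assumed as the regression model (the caption says only "a regression model"), and the Wilcoxon, Spearman, and elastic-net steps are left out.

```python
import numpy as np


def lcto_predictions(enhancers, promoter, cell_types):
    """Leave-cell-type-out CV: predict each sample's promoter activity from a
    model trained only on samples of the other cell types."""
    enhancers = np.asarray(enhancers, dtype=float)  # shape (samples, k)
    promoter = np.asarray(promoter, dtype=float)    # shape (samples,)
    cell_types = np.asarray(cell_types)
    # Intercept column, so the model is P = b0 + sum_i b_i * E_i.
    design = np.column_stack([np.ones(len(promoter)), enhancers])
    predicted = np.empty_like(promoter)
    for cell_type in np.unique(cell_types):
        test = cell_types == cell_type  # all samples of the held-out cell type
        # Assumption: ordinary least squares on the remaining cell types.
        coef, *_ = np.linalg.lstsq(design[~test], promoter[~test], rcond=None)
        predicted[test] = design[test] @ coef
    return predicted


def by_adjust(p_values):
    """Benjamini-Yekutieli adjusted p-values (valid under arbitrary dependence)."""
    p = np.asarray(p_values, dtype=float)
    m = len(p)
    order = np.argsort(p)
    ranks = np.arange(1, m + 1)
    c_m = np.sum(1.0 / ranks)  # harmonic-sum correction factor
    scaled = p[order] * m * c_m / ranks
    # Enforce monotonicity from the largest p-value downwards, then cap at 1.
    adjusted_sorted = np.minimum(np.minimum.accumulate(scaled[::-1])[::-1], 1.0)
    adjusted = np.empty(m)
    adjusted[order] = adjusted_sorted
    return adjusted
```

In this sketch, the output of `lcto_predictions` plays the role of P_model. The binary and activity level tests would then compare it with P_obs for each promoter, and the resulting P values would be passed through `by_adjust` before applying the FDR ≤ 0.1 cutoff.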
