Table 1 Comparison of reference-based and latent variable cell-type corrections in two empirical DNA methylation studies

From: Correcting for cell-type effects in DNA methylation studies: reference-based method outperforms latent variable approaches in empirical studies

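| Model | Depression MWAS: Enrich. ratio | Depression MWAS: Enrich. P value | Depression MWAS: Number of SVs | Depression MWAS: Increase r² | Schizophrenia MWAS: Enrich. ratio | Schizophrenia MWAS: Enrich. P value | Schizophrenia MWAS: Number of SVs | Schizophrenia MWAS: Increase r² |
|---|---|---|---|---|---|---|---|---|
| No cell-type correction | 6.04 | <0.001 | | | 6.13 | <0.001 | | |
| Reference-based correction | 1.08 | 0.084 | | 0.0% | 1.02 | 0.029 | | 0.0% |
| SVA subset 1 | 1.11 | 0.001 | 84 | 8.1% | 6.54 | 0.001 | 5 | 0.8% |
| SVA subset 2 | 1.26 | 0.001 | 83 | 8.7% | 7.24 | <0.001 | 10 | 3.3% |
| SVA subset 3 | 1.85 | 0.004 | 19 | 2.0% | 6.45 | 0.001 | 5 | 0.9% |
| SVA subset 4 | 1.28 | <0.001 | 83 | 9.3% | 7.01 | 0.001 | 12 | 2.8% |
| SVA subset 5 | 3.05 | 0.001 | 14 | 1.9% | 6.79 | 0.002 | 6 | 1.2% |
| SVA subset 6 | 1.30 | 0.001 | 81 | 7.8% | 6.59 | <0.001 | 6 | 0.9% |
| SVA subset 7 | 1.07 | <0.001 | 78 | 9.2% | 6.42 | <0.001 | 4 | 0.7% |
| SVA subset 8 | 1.28 | 0.004 | 79 | 9.2% | 6.48 | <0.001 | 4 | 0.7% |
| SVA subset 9 | 1.13 | 0.003 | 84 | 9.2% | 7.46 | 0.001 | 10 | 2.7% |
| SVA subset 10 | 1.07 | 0.003 | 80 | 8.4% | 6.71 | 0.003 | 8 | 1.7% |
| SVA subset 11 | 1.06 | 0.006 | 21 | 2.7% | 6.48 | <0.001 | 8 | 1.3% |
| SVA subset 12 | 1.17 | 0.004 | 84 | 7.4% | 6.49 | 0.001 | 5 | 0.8% |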
  1. We used a permutation test to examine whether our top methylome-wide association study (MWAS) results were enriched for sites showing significant methylation differences among cell types. The test preserved the correlation structure of the data by shifting the CpG coordinates of the case-control and cell-type MWAS results relative to each other by a single random number in each permutation. We examined multiple cut-offs (1, 5, and 10%) to define “top results” in the case-control and cell-type data and selected the most significant combination. We accounted for this “multiple testing” by also selecting the most significant finding in each permutation. The “No cell-type correction” model includes laboratory technical covariates, age, and sex. The other models include these same covariates; in addition, the “Reference-based correction” model adds estimates of cell-type proportions, and the SVA models add latent variables. “Enrich. ratio” is the number of CpGs showing methylation differences between cell types among the top MWAS findings, divided by the number expected under the null hypothesis of no enrichment. “Enrich. P value” is the probability of the observed enrichment under this null hypothesis, as determined through permutations. “Number of SVs” is the number of latent variables selected by SVA. “Increase r²” is the additional variance in case-control status explained by the SVs, compared with a multiple regression model that included technical covariates, age, sex, and cell-type estimates. SV: surrogate variable; SVA: surrogate variable analysis.
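The shift-based permutation test described in the footnote can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation. The function names, the circular (wrap-around) shift, the default of 1,000 permutations, and the add-one P value estimate are all assumptions. For brevity it uses a single cut-off rather than selecting the most significant of the 1, 5, and 10% combinations.

```python
import random

def top_set(pvalues, fraction):
    """Indices of the smallest `fraction` of p-values (at least one site)."""
    k = max(1, int(len(pvalues) * fraction))
    order = sorted(range(len(pvalues)), key=lambda i: pvalues[i])
    return set(order[:k])

def enrichment_ratio(case_top, cell_top, n_sites):
    """Observed overlap divided by the overlap expected with no enrichment."""
    expected = len(case_top) * len(cell_top) / n_sites
    return len(case_top & cell_top) / expected

def circular_shift_test(case_p, cell_p, fraction=0.05, n_perm=1000, seed=0):
    """Enrichment ratio and permutation P value (hypothetical helper).

    case_p and cell_p hold per-CpG p-values in genomic order. Each
    permutation shifts the cell-type top set by one random offset, so the
    local correlation between neighbouring CpGs is kept intact.
    """
    rng = random.Random(seed)
    n = len(case_p)
    case_top = top_set(case_p, fraction)
    cell_top = top_set(cell_p, fraction)
    observed = enrichment_ratio(case_top, cell_top, n)

    hits = 0
    for _ in range(n_perm):
        offset = rng.randrange(1, n)  # one random shift per permutation
        shifted = {(i + offset) % n for i in cell_top}
        if enrichment_ratio(case_top, shifted, n) >= observed:
            hits += 1

    # Add-one estimate (an assumption) so the P value is never exactly zero.
    p_value = (hits + 1) / (n_perm + 1)
    return observed, p_value
```

Calling `circular_shift_test(case_p, cell_p)` on two equally long, genomically ordered lists of p-values returns the enrichment ratio and its permutation P value. Covering the multiple cut-offs would mean repeating the test for each combination and, within every permutation, keeping the most significant result.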