An evaluation of methods correcting for celltype heterogeneity in DNA methylation studies
 Kevin McGregor^{1, 2},
 Sasha Bernatsky^{2},
 Ines Colmegna^{7},
 Marie Hudson^{2, 3, 6},
 Tomi Pastinen^{4, 5},
 Aurélie Labbe^{1, 8, 9} and
 Celia M.T. Greenwood^{2}Email author
Received: 1 March 2016
Accepted: 5 April 2016
Published: 3 May 2016
Abstract
Background
Many different methods exist to adjust for variability in celltype mixture proportions when analyzing DNA methylation studies. Here we present the result of an extensive simulation study, built on cellseparated DNA methylation profiles from Illumina Infinium 450K methylation data, to compare the performance of eight methods including the most commonly used approaches.
Results
We designed a rich multilayered simulation containing a set of probes with true associations with either binary or continuous phenotypes, confounding by cell type, variability in means and standard deviations for population parameters, additional variability at the level of an individual celltypespecific sample, and variability in the mixture proportions across samples. Performance varied quite substantially across methods and simulations. In particular, the number of false positives was sometimes unrealistically high, indicating limited ability to discriminate the true signals from those appearing significant through confounding. Methods that filtered probes had consequently poor power. QQ plots of p values across all tested probes showed that adjustments did not always improve the distribution. The same methods were used to examine associations between smoking and methylation data from a case–control study of colorectal cancer, and we also explored the effect of celltype adjustments on associations between rheumatoid arthritis cases and controls.
Conclusions
We recommend surrogate variable analysis for celltype mixture adjustment since performance was stable under all our simulated scenarios.
Keywords
DNA methylation Celltype mixture Deconvolution Matrix decompositionBackground
DNA methylation is an important epigenetic factor that modulates gene expression through the inhibition of transcriptional proteins binding to DNA [1]. Examining the associations between methylation and phenotypes, either at a few loci or epigenomewide (i.e. the EpigenomeWide Association Study (EWAS) [2]) is an increasingly popular study design, since such studies can improve understanding of how the genome influences phenotypes and diseases. However, unlike genetic association studies, where the randomness of Mendelian transmission patterns from parents to children enables some inference of causality for associated variants, results from EWAS studies can be more difficult to interpret.
The choices of tissue for analysis and time of sampling are crucial, since methylation levels vary substantially across tissues and time. Methylation plays a large role in cellular differentiation, especially in regulatory regions [3, 4], and methylation patterns are largely responsible for determining celltypespecific functioning, despite the fact that all cells contain the same genetic code [5].
Ideally, methylation would be measured in tissues and cells of most relevance to the phenotype of interest, but in practice such tissues may be impossible to obtain in human studies. Many accessible tissues for DNA methylation studies, such as saliva, whole blood, placenta, adipose, tumors, or many others, will contain mixtures of different cell types, albeit to varying degrees. Hence, the measured methylation levels represent weighted averages of celltypespecific methylation levels, with weights corresponding to the proportion of the different cell types in a sample. However, celltype proportions can vary across individuals, and can be associated with diseases or phenotypes [6]. For example, individuals with autoimmune disease are likely to have very different proportions of autoimmune cells in their blood than nondiseased individuals [7–11], synovial and cartilage cell proportions differ between rheumatoid arthritis patients and controls [12], and associations with age have been consistently reported [13]. Hence, variable celltypemixture proportions can confound relationships between locusspecific methylation levels and phenotypes, since these proportions are associated both with phenotype and with methylation levels [14].
In situations potentially subject to confounding, although less biased estimates of association can be obtained by incorporating the confounding variable as a covariate, this is not a perfect solution, since it may not be possible to distinguish lineage differences [14] or to estimate accurately the proportions of each cell type in a tissue sample [15, 16]. Initial studies of associations between DNA methylation and phenotypes largely ignored this potential confounding factor, which may have led to biased estimates of association and failure to replicate findings [17, 18].
However, in parallel with the increasing prevalence of highdimensional methylation studies, a number of methods that can account for this potential confounding of methylation–phenotype associations have been developed or adapted from other contexts. Among those developed specifically for methylation data (Refbased [19], Reffree [20], CellCDec [21], and EWASher [22]), the first two were proposed by the same author (Houseman), but the first of these requires an external reference data set. Other methods were proposed in more general contexts where confounding does not necessarily result from celltype mixtures yet is still of concern; many of these rely on some implementation of matrix decompositions (SVA [23], ISVA [24], Deconfounding (Deconf) [25], and RUV [26, 27]).
Although there are numerous similarities between the approaches, there remain some fundamental differences in terms of limitations and performance. An unbiased comparison of methods has been difficult since true celltype mixture proportions are unknown, replications using alternative technologies such as targeted pyrosequencing do not lead to genomewide data where celltype proportions can be estimated, and new methods have tended to be compared with only a few other approaches. Since the problem of confounding plagues all researchers in this field, a careful comparison of existing methods correcting for celltype heterogeneity is essential, and this is the objective of our paper.
In an ongoing study of incident treatmentnaive patients with one of four systemic autoimmune rheumatic diseases (SARDs), wholeblood samples were taken at presentation, and immune cell populations (purity >95 %) were sorted from the peripheral mononuclear cells (PBMCs) of these patients. Analysis of DNA methylation was then performed with the Illumina Infinium HumanMethylation450 BeadChip (450K) on the cellseparated data. These data provide a unique and valuable opportunity to compare the performance of methods for celltype mixture adjustments. We present here the results of an extensive simulation study where we remixed the cellseparated methylation profiles to incorporate variable mixture proportions and confounding of associations, and we then compare performance of eight different methods of adjustment. We also compare the ease of use of each method, and provide an R script allowing for easy implementation of several of the bestperforming methods. As far as we are aware, this is the first study to compare such an extensive set of methods in a simulation based on cellseparated data.
Results
Patients and original methylation profiles
Wholeblood samples were obtained from patients with incident treatmentnaive rheumatoid arthritis (n=11), systemic lupus erythematosus (n=9), systemic sclerosis (n=14), and idiopathic inflammatory myositis (n=3). Several control samples were available as well (n=9). CD4, CD19, and CD14 subpopulations were sorted from PBMCs by magnetic cell isolation and cell separation (MACS) sorting (see “Methods”). The purity of the isolated populations was confirmed by flow cytometry. Only in samples with a purity >95 % were methylation profiles assessed using the Illumina Infinium HumanMethylation 450 BeadChip on the separate cell populations. Our simulation and results are based primarily on 46 patients for whom cellsorted methylation profiles were available for both CD4 ^{+} T lymphocytes and CD14 ^{+} monocytes.
Multilayer simulation design
We implemented a rich simulation design, based on the SARD methylation data. This simulation contains random sources of variability at multiple levels, including both variability of population mean parameters as well as variability at individuallevel parameters. Starting with the observed cellseparated methylation profiles for T cells and monocytes from the SARD data, we simulated a number of probes to have associations directly with the phenotype, and we induced confounding by combining the two cell types in proportions that vary across individuals. Although this simulation design is complex and depends on a large number of parameters, it allows substantial flexibility in specifying consistency or variability between cell types, individuals, or probes, and easily allows us to create realistic and pathological situations in the same framework.
 1.
Select a set of S CpG sites where “true” associations with a phenotype will be generated by our simulation; we refer to these CpGs as differentially methylated sites (DMSs).
 2.
Generate a phenotype, either binary (disease or no disease) or continuous.
 3.
For any probe not in the DMS set, the celltypespecific methylation values are the observed values from the real data.
 4.
For a probe in the DMS set, a randomly generated quantity is added to the observed celltypespecific level of methylation, in a way that depends on the phenotype.
 5.
The celltypespecific methylation values are mixed together in proportions that vary depending on the phenotypes.
Over all DMSs, one would expect to see a range of positive and negative associations with the phenotype. In step 4, we allow these associations to differ between cell types in order to specify an association between change in methylation and cell type. After having specified each of the site and celltypespecific associations, we then add betweensubject variability to each site. The final step, step 5, leads to methylation proportions as they would appear had the mixed tissue been analyzed directly.
Fixed parameters in the simulation design
Parameter  Description 

S  Number of CpGs chosen to be associated with phenotype in simulation 
μ _{ k }  Mean of the celltypespecific DMS effects for cell type k, over all S DMSs 
σ _{ k }  Standard deviation of celltypespecific DMS effects for cell type k, over all S DMSs 
σ _{ jk }  Variability of individual deviations at probe j in cell type k 
\(\alpha ^{(0)} = \left (\alpha _{1}^{(0)}, \alpha _{2}^{(0)}\right)^{\top }\)  Expected proportion of the mixture from cell types 1 and 2 when the phenotype z _{ i } is zero. 
\(\alpha ^{(Z)} = \left (\alpha _{1}^{(Z)}, \alpha _{2}^{(Z)}\right)^{\top }\)  Average celltype mixture proportions for cell types 1 and 2 for subjects with phenotype level Z (continuous or binary) 
ρ  Precision of simulated cell mixture distributions. A greater value corresponds to more clearly defined differences in celltype proportions with respect to the phenotype 
Parameter choices for the simulation scenarios
Simulation scenario  (μ _{1},μ _{2})  (σ _{1},σ _{2})  σ _{ jk }  ρ  α ^{(0)}  α ^{(1)}−α ^{(0)} ^{a}  

1  Distinct differences  (−0.05,0.5)  (0.05, 0.75)  Unif(0.1, 2)  100  \(\left [\begin {array}{c} 0.57 \\[5pt] 0.43 \end {array}\right ]\)  \(\left [\begin {array}{r}0.08 \\[5pt] 0.08\end {array}\right ] \) 
2  No confounding  (0.25, 0.25)  (0.5, 0.5)  0.1  100  \(\left [\begin {array}{c} 0.57 \\[5pt] 0.43 \end {array}\right ]\)  \(\left [\begin {array}{r}0.08 \\[5pt] 0.08 \end {array}\right ]\) 
3  Opposite effects  (−0.75,0.75)  (0.1, 0.1)  0.1  100  \(\left [\begin {array}{c}0.57 \\[5pt] 0.43 \end {array}\right ]\)  \(\left [\begin {array}{r}0.08 \\[5pt] 0.08 \end {array}\right ]\) 
4  High precision  (0.3, 0.1)  (0.1, 0.1)  0.1  200  \(\left [\begin {array}{c}0.57 \\[5pt] 0.43 \end {array}\right ]\)  \(\left [\begin {array}{r}0.08 \\[5pt] 0.08 \end {array}\right ]\) 
5  Low precision  (0.3, 0.1)  (0.1, 0.1)  0.1  10  \(\left [\begin {array}{c}0.57 \\[5pt] 0.43 \end {array}\right ]\)  \(\left [\begin {array}{r}0.08 \\[5pt] 0.08 \end {array}\right ]\) 
6  Continuous phenotype  (−0.05,0.25)  (0.05, 0.15)  0.1  100  \(\left [\begin {array}{c}0.57 \\[5pt] 0.43 \end {array}\right ]\)  \(\left [\begin {array}{r}0.03 \\[5pt] 0.03 \end {array}\right ]\) 
7  Few assoc. sites  (1, 0.95)  (0.05, 0.05)  Unif(0.1, 2)  100  \(\left [\begin {array}{c}0.57 \\[5pt] 0.43 \end {array}\right ]\)  \(\left [\begin {array}{r}0.08 \\[5pt] 0.08 \end {array}\right ]\) 
8  Many assoc. sites block 1 ^{b}  (0.1,0.4)  (0.01, 0.01)  Unif(0.1, 2)  100  \(\left [\begin {array}{c}0.57 \\[5pt] 0.43 \end {array}\right ]\)  \(\left [\begin {array}{r}0.08 \\[5pt] 0.08 \end {array}\right ]\) 
Many assoc. sites block 2 ^{c}  (0.2,0.7)  (0.01, 0.01)  Unif(0.1, 2)  100  \(\left [\begin {array}{c}0.57 \\[5pt] 0.43 \end {array}\right ]\)  \(\left [\begin {array}{r}0.08 \\[5pt] 0.08 \end {array}\right ]\) 
Scenario 1: DMSs have differences in both means and variances between cell types
In our first simulation scenario (Table 2), we chose to specify distinct differences in the strength and distribution of the methylation–phenotype associations (DMS) for the two cell types, with a binary phenotype. Differences in the DMS distributions include both direction of the effects as well as the amount of variability across sites and individuals; Additional file 1: Figure S1 displays histograms of the 500 simulated values of the DMS means μ _{ jk } for the two cell types, showing the substantial differences between these two distributions.
Performance metrics under simulation scenario 1 (distinct associations between cell types)
Method  Number of false positives  Power  KS  GIF  \(\hat {K}\) ^{a} 

Unadjusted  248  0.208  0.168  1.60  – 
Refbased  14  0.146  0.026  0.92  – 
Reffree  375  0.248  0.097  1.33  13 
SVA  32  0.124  0.021  1.05  10 
ISVA  14  0.134  0.004  0.97  12 
EWASher  3  0.036  0.044  0.92  – 
CellCDec  7  0.094  0.008  0.98  – 
Deconf  13  0.178  0.023  0.93  – 
RUV  36  0.170  0.018  1.04  43 
Since we know which 500 sites were generated to be truly DMSs, Table 3 reports both the power and the number of false positives (NFP). We declare a CpG site to be significant if the p value falls below the fixed value 10^{−4}. This table also shows a measure of performance based on the Kolmogorov–Smirnov (KS) test for whether the p value distribution matches the expected uniform distribution. Of course, the KS test assumes independence of all the individual tests, and therefore, we are not using this test for inference, but simply as a measure of deviation where smaller values imply less deviation.
For this simulation (scenario 1 with distinct association distributions in the two cell types), all methods except the referencefree method achieved a greater reduction in the NFP than the unadjusted analysis for the nonassociated probes. Although the power (sensitivity) for all the methods appears very low, many of the simulated effect sizes at the chosen DMSs were very small, and nevertheless the rankings of the different methods are still informative. Additional file 1: Figure S1 shows the simulated means of the celltype distributions for the 500 probes; subsequently, additional random errors were introduced at the level of each individual leading to substantial variability in the realized methylation differences. The power of most methods was slightly less than that of the unadjusted analysis, except for Reffree and EWASher. The power for EWASher is extremely poor; this method removes probes with very high or very low levels of methylation prior to constructing the components, and hence, many DMS probes are not even included in its analyses. Results for the Reffree power must be interpreted cautiously since the type 1 error is so substantially elevated for this method. The KS statistic confirms the conclusions obtained from other metrics, showing small values for most methods except for the unadjusted data and the Reffree method.
Scenario 2: no confounding
Performance metrics under simulation scenario 2 (no confounding)
Method  Number of false positives  Power  KS  GIF  \(\hat {K}\) 

Unadjusted  38  0.642  0.059  1.09  – 
Refbased  13  0.594  0.030  0.90  – 
Reffree  366  0.704  0.058  1.25  14 
SVA  79  0.646  0.015  1.06  10 
ISVA  408  0.654  0.061  1.27  15 
EWASher  0  0.126  0.066  0.83  – 
CellCDec  14  0.650  0.021  0.93  – 
Deconf  50  0.652  0.031  0.92  – 
RUV  134  0.640  0.007  1.00  38 
Scenario 3: opposite effects in different cell types
Performance metrics under simulation scenario 3 (opposite effects)
Method  Number of false positives  Power  KS  GIF  \(\hat {K}\) 

Unadjusted  5  0.490  0.041  0.97  – 
Refbased  1  0.506  0.036  0.87  – 
Reffree  176  0.608  0.053  1.19  14 
SVA  37  0.562  0.003  1.00  10 
ISVA  49  0.558  0.006  1.01  14 
EWASher  0  0.128  0.092  0.74  – 
CellCDec  3  0.502  0.033  0.88  – 
Deconf  2  0.522  0.035  0.89  – 
RUV  23  0.532  0.012  1.00  37 
Scenarios 4 and 5: altered precision simulations
Performance metrics under simulation scenario 4 (high precision)
Method  Number of false positives  Power  KS  GIF  \(\hat {K}\) 

Unadjusted  368  0.600  0.142  1.32  – 
Refbased  64  0.460  0.043  1.04  – 
Reffree  437  0.694  0.101  1.36  13 
SVA  79  0.534  0.032  1.09  11 
ISVA  198  0.522  0.062  1.20  14 
EWASher  16  0.072  0.038  0.94  – 
CellCDec  54  0.546  0.031  1.05  – 
Deconf  58  0.520  0.014  1.02  – 
RUV  155  0.568  0.051  1.15  32 
Performance metrics under simulation scenario 5 (low precision)
Method  Number of false positives  Power  KS  GIF  \(\hat {K}\) 

Unadjusted  127  0.622  0.241  1.47  – 
Refbased  354  0.654  0.198  1.49  – 
Reffree  596  0.670  0.145  1.48  12 
SVA  123  0.644  0.091  1.23  6 
ISVA  72  0.482  0.040  1.12  11 
EWASher  1  0.052  0.030  0.92  – 
CellCDec  169  0.598  0.008  1.27  – 
Deconf  184  0.644  0.204  1.41  – 
RUV  2943  0.654  0.228  1.93  33 
In the highprecision scenario, the NFP is extremely high when no adjustment is used. Most methods, however, perform quite well in reducing the GIF and KS statistics, reducing the NFP and retaining decent power (with the exceptions of Reffree and EWASher as seen previously). In contrast, for the lowprecision scenario, where there is much more variability from one individual to the next in the mixture proportions, as well as substantial differences between cases and controls, performance is generally poor. The QQ plots display substantial inflation, and most methods have very high NFP. Even the Refbased method has very high NFP, and notably the QQ plot for RUV has enormous inflation and with the NFP at 2943. In fact, the unadjusted analysis appears to be one of the better choices here, with lower NFP and good power; ISVA also seems to perform better than the others.
Scenario 6: continuous phenotype simulation results
Performance metrics under simulation scenario 6 (continuous)
Method  Number of false positives  Power  KS  GIF  \(\hat {K}\) 

Unadjusted  135  0.532  0.081  1.37  – 
Refbased  13  0.498  0.031  0.99  – 
Reffree  125  0.618  0.061  1.17  14 
SVA  29  0.502  0.006  0.99  10 
ISVA  114  0.520  0.076  1.21  14 
CellCDec  422  0.420  0.079  1.44  – 
Deconf ^{a}  –  –  –  –  – 
RUV  16  0.550  0.006  0.966  39 
Scenario 7: simulation with a small number of true DMSs
Unsurprisingly, the performance metrics are quite variable when the analysis does not adjust for celltype composition. The NFP of both the referencefree method and RUV vary substantially. In contrast, both the referencebased and SVA methods show a very good reduction in the number of nonDMSs below this p value threshold across all replications. Most methods do not achieve a power as high as the unadjusted method; however, the lowered NFP is a worthy tradeoff for the loss in power, especially for the referencebased and SVA methods. Most methods improve the KS statistic and GIF, except for the referencefree method, which has values almost as high as those in the unadjusted analysis.
Scenario 8: widespread, subtle, correlated DMS effects
In this scenario, NFP is less variable among the different adjustment methods, with the exception of the referencefree method. Once again, the referencebased and SVA methods do quite well, but Deconf and CellCDec also perform well in this scenario. Power is lower in all methods compared to the other simulation scenarios, with the exception of the referencefree method, though the high NFP for that method undermines this result. Most methods perform fairly similarly when considering the KS statistic and GIF.
Estimated latent dimension
Our simulation is based on complex mixtures of methylation profiles from two separated cell types. It is, therefore, interesting to note that for all the methods that provide estimates for the latent dimension (the last column of Tables 3, 4, 5, 6, 7 and 8), these estimates are consistently much larger than 2. Estimates are obtained for the Reffree, SVA, ISVA, and RUV methods. Both SVA and ISVA assume the number of surrogate variables is less than or equal to the number of true confounders whose linear space they span, and for RUV, the authors themselves commented that the estimated values for K do not necessarily reflect the true dimension [27]. All estimates are generally greater than ten, and RUV’s estimates tend to be over 30. In fact, there may be some additional sources of variation present in the original cellseparated methylation data, and these factors are likely being captured by these numerous latent variables. In fact, analyses of the original celltypeseparated data using patient age as the predictor resulted in estimated latent dimensions that were themselves large. For example, random matrix theory [31] (which is used for dimension estimation in the referencefree method and ISVA) estimated a latent dimension of ten for both T cells and monocytes when analyzed separately. Furthermore, SVA estimated the number of surrogate variables to be seven and nine for T cell and monocytes, respectively.
Results from analysis of the ARCTIC data set
We tested the performance of these eight adjustment methods on 450K measurements from the Assessment of Risk in Colorectal Tumors in Canada (ARCTIC) study [32], and the methylation data are deposited in dbGAP under accession number [phs000779.v1.p1]. We analyzed only 977 control subjects from this study, restricting to those where DNA methylation was measured on lymphocyte pellets, and examined the association between smoking (ever smoked) and methylation levels at all autosomal probes who passed quality control (473,864 probes). We excluded the colorectal cancer patients from this analysis due to concerns that their methylation profiles may have been affected by treatment. Patient age was included as a covariate in all analyses.
Performance metrics for the ARCTIC data with the most significant probes removed (top 5 %)
Method  KS statistic  GIF  \(\hat {K}\) 

Unadjusted  0.0758  0.9142  – 
Refbased  0.0508  0.9907  – 
Reffree  0.0296  1.1499  32 
SVA  0.0362  1.0962  15 
ISVA  0.1208  1.6820  39 
EWASher  0.7291  1.0164  – 
RUV with three components  0.0906  1.2423  – 
P values for sites previously found to be associated with smoking [33]
Site  Unadj  RefBased  Reffree  SVA  ISVA  EWASher ^{a}  RUV 

cg06644428  2.83E20  2.04E21  1.96E27  9.19E27  4.41E20  –  1.64E22 
cg05951221  2.72E43  1.21E45  5.41E56  3.96E58  3.20E45  5.77E01  1.05E53 
cg21566642  1.72E49  6.93E50  4.20E54  6.59E58  8.09E54  6.45E01  5.94E58 
cg01940273  2.09E35  3.30E38  4.07E43  7.07E46  6.93E42  3.07E01  1.45E44 
cg19859270  4.91E13  2.52E22  4.76E35  2.21E35  5.15E28  –  2.16E35 
cg05575921  1.58E35  7.79E39  7.89E52  5.65E41  1.34E40  –  3.43E41 
cg21161138  2.43E07  2.99E10  1.24E25  1.68E25  2.84E21  9.05E01  6.07E21 
cg06126421  1.78E23  1.06E33  7.38E34  7.37E33  9.69E34  7.05E01  2.55E36 
cg03636183  9.27E21  6.74E24  6.10E37  1.51E34  4.77E30  3.09E01  1.60E31 
Results from analysis of the rheumatoid arthritis data set
We also performed analysis of data from a rheumatoid arthritis study published in 2013 [34]. The data are available from GEO (http://www.ncbi.nlm.nih.gov/geo/). Methylation was measured with the Illumina 450K array in wholeblood samples from 354 anticitrullinated protein antibodyassociated rheumatoid arthritis cases and 337 controls. The manuscript reported 51,478 CpGs as demonstrating evidence of significant association with disease status. Akin to the ARCTIC analysis, we compared the results of running the different celltype adjustment methods on these data.
We attempted to replicate the original analysis as closely as possible by using the Illumina control probe scaling procedure, including the same covariates in our linear model (age, sex, and smoking status), and adjusting for celltype composition using the referencebased method. However, when extracting the significant CpGs mentioned in their Additional file 1, the distribution of raw p values in our analysis did not match those in the original paper. This could suggest there is a step in the original analysis not explicitly stated in the paper’s “Methods” section. However, since for our purposes, we wish to compare the results of the adjustment methods relative to each other, we do not believe this to be a significant issue. For our comparison, we have used functional normalization [35]. Comparisons of the distribution of the reported p values and the ones we obtained can be seen in Additional file 1: Figure S9.
Computational performance
To compare computational time across the different adjustment methods, we selected a random sample of 10,000 CpGs from the ARCTIC methylation matrix to create a benchmark data set. As we are not making any statistical inference here, all samples were included, regardless of whether we had matching celltype sets or quality control status. Some of the methods calculate p values and parameter estimates internally, and others require the use of an external function to perform a linear fit. Therefore, to make the computational times comparable, we define the start time as when the adjustment method is first called, and the end time when all estimates and p values have been obtained.
Discussion
We have presented an extensive comparison of eight different methods for adjusting for celltypemixture confounding, by designing a rich simulation based on celltypeseparated methylation data in SARD patients. Our simulation contained multiple levels of variability, between cell types, at the level of the probe means, and at the level of the individual. We found that there was no adjustment method whose performance was uniformly the best, and in fact in some of our scenarios, the unadjusted results were quite comparable to the best adjusted results. These general conclusions were similar whether we had a few DMSs of large effect (scenario 7) or many correlated DMSs of smaller effect size (scenario 8).
Ten replications were run in both simulation scenarios 7 and 8. However, for scenarios 1 through 6, the reported results were obtained from one run of the simulation for each scenario. There are two reasons for this: firstly, the computational time for some methods was very long, and hence, multiple simulations would be extremely timeconsuming. More fundamentally, however, since our metrics of performance are calculated from the distributions of behavior across all good quality probes in the 450K array, we obtained very similar results from repeated runs of our simulations, during the phase when we were designing the simulation setup and choosing parameter values.
In many of our simulated scenarios, as might be expected, the referencebased method performed well. This method is very easy to implement, and as seen in the computing performance section, it runs very quickly, even on larger sample sizes. It usually achieved good statistical power, and, with one exception, reduced NFP relative to the unadjusted model. It also has the advantage of being able to estimate directly the celltype composition of each sample. Therefore, the Refbased method is an obvious choice when a complete set of the required cellseparated methylation profiles are available; however, this is not always the case. For some tissues, cell types that are of particular interest are very difficult or impossible to extract; one example here would be the syncytiotrophoblast cells in placenta [36, 37].
In every case we examined, EWASher did a very good job in reducing p value inflation, and GIF values were substantially reduced from the unadjusted analyses. However, that this method so strictly forces the GIF factor downwards may raise concerns about overcorrection. If there were, for example, global hypermethylation associated with a disease, adjustment using EWASher would be overly conservative. Additionally, part of the algorithm involves filtering out loci that are unilaterally high or low among all subjects. The assumption behind this filter is that these loci are, for all intents and purposes, completely methylated or unmethylated and any associations between these probes and the phenotype are not interesting. This may be an overly strong assumption. In our simulations, this filter results in the dramatically worse power of this method, since we did not restrict the randomly selected DMS loci to any particular mean level of methylation. Furthermore, the EWASher method is quite difficult to implement. Although most methods can be run in R (https://www.rproject.org), EWASher requires the user to create three separate input files for a standalone executable, and then to perform postprocessing in R.
We cannot explain the poor performance in our simulations of the Reffree method. The NFP were almost always more inflated than in the raw data, and this inflation is clearly visible in the QQ plots. Furthermore, the implementation was somewhat more complex since the approach involved one step to estimate the latent dimension, a second to get parameter estimates, and then finally bootstrap calculations to obtain standard errors. The performance of the Reffree method was good for scenario 6 with a continuous phenotype, so we hypothesize that there are some linearity assumptions in the correction that are being violated in our binary phenotype simulations.
The performance of CellCDec and Deconf was generally quite good for binary phenotypes. The CellCDec method exists as a C++ program, and was quite easy to implement. The number of latent cell types must be specified in advance, which is a limitation. The run time was longer for this algorithm than the others, and increased quickly with the assumed number of cell types; in fact, we were unable to obtain results with the ARCTIC data. CellCDec does not use phenotype information; it would be interesting to see how this program performs if it took the phenotype and other covariates into account. For Deconf, the most important limitation was the running time. In all cases, it took longer to run than the other adjustment methods, and we were unable to obtain results for the ARCTIC data. The run time was sensitive to both increases in sample size and number of cell types. Akin to CellCDec, that it does not internally estimate the number of cell types is an issue.
The results for ISVA and RUV were often among the better ones with a couple of notable exceptions: NFP were extremely high for RUV in the lowprecision scenario, and for ISVA in the no confounding scenario. The computational time for the ISVA method also increased quite rapidly with sample size. RUV is very easy to run and is available as an R function. It contains a function to estimate the latent dimension (K), although, akin to the other methods that estimate K, the estimated dimension tends to be much higher than the simulated reality. We performed some investigations into how RUV performs at a range of values for K, and the best performance was observed, in most simulation scenarios, at smaller values such as K=3. Recently, Houseman also found that estimated latent dimensions obtained through random matrix theory may not be the best choices [38]. RUV is also extremely fast, slower only than the Refbased method, and as shown in Fig. 7, the computational time is essentially invariant as the latent dimension is varied, making this an attractive option. Nevertheless, the SVA method, although rarely the best, did not have any notable failures across our scenarios, and was easy to implement.
There are other methods for deconvolution that we did not examine, especially in the computer science and engineering literature [39]. However, it is not clear whether these methods would be easily adapted for use on methylation data. Also, new methods for DNA methylation analysis continue to be published, such as [40]. However, the spectrum of methods that we have examined includes the mostcommonly used approaches. All methods that we have examined assume approximately linear relationships between the phenotype and the methylation levels or covariates; however, this should not be an important limitation since approximate linearity should hold [38].
The latent dimension, when estimated, was rarely similar to the dimension of K=2 implemented in our simulation. However, these estimates of K capture aspects of heterogeneity in the data that are only partially attributable to the mixture of data from two cell types. This heterogeneity may also be partially due to technological artefacts from batch effects or experimental conditions, and in particular to the fact that subtler cell lineage differences will still be present even after cell sorting [38].
In summary, our simulation study comparing methods found a wide range of performance across our scenarios with notable failures of some methods in some situations. We recommend SVA as a safe approach for adjustment for a celltype mixture since it performed adequately in all simulations with reasonable computation time. In all situations, EWAS results are extremely sensitive to the normalization and celltype adjustment methods used, and hence, this issue should receive more attention when interpreting findings.
A set of scripts enabling implementation of all these methods can be found at https://github.com/GreenwoodLab/CellTypeAdjustment.
Conclusions
We have compared eight different methods for adjusting methylation data for celltypemixture confounding in a rich and multilayered simulation study, and in a large set of samples where methylation was measured in whole blood. No method performs best in all simulated scenarios, nevertheless we recommend SVA as a method that performed adequately without notable failures.
Methods
Patient data and quality checks
Ethical approval was obtained at the Jewish General Hospital and at McGill University, Montreal, QC, to obtain wholeblood samples from the patients with SARDs, at the time of initial diagnosis prior to any treatment. Cell purification and phenotyping protocols for cell subset isolation, analysis, purity evaluation, fractionation, and storage were standardized and optimized. Then 40 ml of peripheral blood were obtained from the above subjects and processed within 4 hours. PBMCs were separated with lymphocyte separation medium (Mediatech, Inc.). Isolated PBMCs were sequentially incubated with antiCD19, antiCD14, and antiCD4 microbeads (Miltenyi Biotec). Automated cell separation of specific cell subpopulations was performed with autoMACS using positive selection programs. An aliquot of the specific isolated cell subtypes was used for purity assessment with flow cytometric analysis. A minimum of 2 million cells from each subpopulation with a purity higher than 95 % were frozen in liquid nitrogen for the epigenomic studies. The optimized protocols required the isolation of sufficient numbers of CD4 ^{+} lymphocytes (9.04±4.03×10^{6}) and CD14 ^{+} monocytes (7.89±2.96×10^{6}), and CD19 ^{+} B lymphocytes (2.02±1.42×10^{6}), of sufficient purity to perform the epigenetic analyses.
The required number of cells with the right purity was not always available, especially for the CD19 ^{+} B lymphocytes, so we did not have all three cell types for all patients; for this reason, the simulation used only two cell types and 46 patients. Illumina Infinium HumanMethylation450 BeadChip data were normalized with funnorm [35]. Also, a number of probes were removed, specifically those on the sex chromosomes as well as probes close to singlenucleotide polymorphisms [41]. There were 375,639 probes remaining after filtering.
Details of the simulation method
 1.
Selection of DMS probes: S=500 probes were randomly selected to be associated with the phenotype.
 2.
Phenotype (z _{ i }, i=1,…,n): A random sample of size 46 was drawn from either a Bernoulli distribution (p=0.5) for a binary phenotype, or from a standard normal distribution for a continuous phenotype.
 3.
Celltypespecific methylation values for nonDMS probes: Let β _{ ijk } represent the true methylation value for individual i, probe j, and cell type k. For a probe that is not a DMS probe, the simulated value \(\beta ^{\prime }_{ijk} = \beta _{ijk}\).
 4.Celltypespecific methylation values for DMS probes:

Celltypespecific means are sampled from normal distributions with given parameters. That is, for chosen values μ _{ k } and σ _{ k } for k=1,2, celltypespecific means that for each DMS probe, μ _{ jk } are generated from \(\mu _{jk} \sim N\left (\mu _{k}, {\sigma ^{2}_{k}}\right)\). In scenario 8, we split the DMSs into two blocks and simulate effect sizes from a multivariable normal distribution with a fixed background correlation in each block. That is, probes in the same block are correlated, but probes across blocks are independent from one another. In this scenario, the parameters μ _{ k } for k=1,2 differ between the blocks.

The simulated celltypespecific methylation effect, ε _{ ijk }, at a DMS, for an individual sample i and an individual probe j, is another random quantity, so that$$ e_{ijk} \sim N\left(\mu_{jk}, \sigma^{2}_{jk}\right) $$
where σ _{ jk } is a parameter provided to the simulation.

For either a binary or continuous phenotype z _{ i }, the simulated methylation value β _{ ijk } is then$$ \beta^{\prime}_{ijk} = \text{logit}^{1} \left(\text{logit} \left(\beta_{ijk} \right) + z_{i} e_{ijk} \right). $$
Although all the random effects were simulated on a linear scale, the results are reconverted to the (0,1) scale since several of the celltype adjustment methods require this range.

 5.Combining across cell types:

Each individual is assumed to have a unique mixture of the two cell types in a way that depends on the phenotype, z. Let \(\alpha ^{(0)} = \left (\alpha _{1}^{(0)}, \alpha _{2}^{(0)}\right)^{\top }\) represent the average proportions of the two cell types when z _{ i }=0, and then let \(\alpha ^{(Z)} = \left (\alpha _{1}^{(Z)}, \alpha _{2}^{(Z)}\right)^{\top }\) be these proportions when z _{ i }=Z. We then say,$$ \begin{aligned} \alpha^{(Z)} &\!= \alpha^{(0)}+ Z \\ &\!\quad\times\!\! \left[\begin{array}{l} \text{\!average change in proportion in monocytes\!\!} \\ \text{\!average change in proportion in T cells}\! \end{array}\right]\!. \end{aligned} $$(1)

The celltype proportions p _{ ik } for individual i were then generated from \(\text {Dirichlet}\left (\rho \alpha _{k}^{(Z)}\right)\), where ρ>0 is a precision parameter, such that larger precision corresponds to less variation in the observed values.

The final simulated beta value for person i at CpG site j becomes$$ \beta^{f}_{ij} = p_{i1} \beta^{\prime}_{ij1} + p_{i2} \beta^{\prime}_{ij2}. $$(2)

Key notation definitions are summarized in Table 1, and parameter choices for the simulations are in Table 2.
Description of adjustment methods
Comparison of some features of the methods for celltype mixture adjustment
Method  Phen. allowed ^{a}  Input values ^{b}  K ^{c}  Link 

Refbased  Any  Beta  N/A  http://people.oregonstate.edu/~housemae/software/TutorialLondon2014 
http://bioconductor.org/packages/release/bioc/html/minfi.html  
Reffree  Any  Beta  Estimated  http://cran.rproject.org/web/packages/RefFreeEWAS/index.html 
SVA  Any  Beta or logit(beta)  Estimated  
ISVA  Continuous  Beta or logit(beta)  Estimated  
EWASher  Binary  Beta  Estimated  http://research.microsoft.com/enus/downloads/472fe6377cb947d4a0df37118760ccd1 
CellCDec  Not used  Beta  Input  
Deconf  Not used  Beta or logit(beta)  Input  
RUV  Any  Beta or logit(beta)  Estimated 
Referencebased
This method was published in 2012 by Houseman et al. [19]. It relies on the existence of a separate data set containing methylation measurements on separated cell types. The method uses methylation profiles for the individual cell types to estimate directly the celltype composition of each sample. However, cellseparated data are not always available for all constituent cell types.
Referencefree
The second method from Houseman et al. does not depend on a reference data set, and therefore, can be used in methylation studies on any tissue type [20]. Rather than directly estimating celltype composition, the referencefree method performs a singular value decomposition on the concatenation of the estimated coefficient and residual matrices from an initial, unadjusted model. A set of latent vectors is then obtained that accounts for cell type in further analyses.
Surrogate variable analysis
Surrogate variable analysis (SVA) is a popular method that was introduced by Leek and Storey in 2007 [23]. It was not specifically intended for use in methylation studies, but is nonetheless well suited for such analyses. SVA seeks a set of surrogate variables that span the same linear space as the unmeasured confounders (i.e. celltype proportions). It is based on a singular value decomposition on the residual matrix from a regression model not accounting for celltype composition. The total number of surrogate variables included in the model is based on a permutation test.
Independent surrogate variable analysis
Independent surrogate variable analysis (ISVA) from Teschendorff et al. [24] is very similar in principle to SVA. The main difference is that instead of applying singular value decomposition, it uses independent component analysis (ICA), which attempts to find a set of latent variables that are as statistically independent as possible.
FaSTLMMEWASher (EWASher)
This method from Zou et al. [22] extends the Factored Spectrally Transformed Linear Mixed Model algorithm (FaSTLMM) [43] for use in the context of EWAS. A similarity matrix is calculated based on the methylation profiles, and principal components are subsequently included in the linear mixed model until GIF is controlled. The maximum number of principal components allowed was fixed as ten.
Removing unwanted variation
The method called Removing Unwanted Variation (RUV) was published in 2012 [26] by GagnonBartsch and Speed. It performs a factor analysis on negative control probes to separate out variation due to unmeasured confounders, while leaving the variation due to the factors of interest intact. Here we use RUV4, an extension to the original published version, which uses elements from RUV as well as SVA [27]. Control probes were chosen from a list of 500 probes on the 450K platform known to be differentially methylated with blood cell type and age [13]. We selected probes that were not strongly correlated with the simulated phenotype.
Deconfounding
The Deconf method from Repsilber et al. [25] was developed for gene expression studies on heterogeneous tissue samples, but is applicable for use in EWAS. The algorithm performs a nonnegative matrix factorization on the methylation matrix, but does not consider the phenotype in correcting for the heterogeneity and does not estimate the number of cell types present.
CellCDec
CellCDec was developed by Wagner [21], and is similar to Deconf in that it does not consider the phenotype in performing its decomposition and does not internally estimate the number of cell types present. The method assumes a specific regression parameterization, and makes random perturbations to the model parameters, which are accepted if there is a decrease in the sum squared residuals.
Additional statistical details
For each of the simulation scenarios 1–6, the simulation was run once, whereas in scenarios 7 and 8, we performed ten replications. DMSs were chosen randomly in each simulation scenario and within each replication. After celltype adjustment, a linear model was performed including the latent variables as covariates (except in EWASher where the model was run within the function call). Standard errors were obtained after performing the empirical Bayes method eBayes from the limma package in R. Probes were declared significant if the p value fell below the fixed value 10^{−4}.
In the referencefree method, standard errors were obtained via the bootstrap procedure included in the method. To estimate standard errors, 100 bootstrap samples were generated. We performed test runs with higher numbers of bootstrap samples (500 and 1000), but did not find significant differences in the resulting p values.
For SVA, we used the “iteratively reweighted least squares” option; however, no control probes were specified. For RUV, the control probes were chosen from sites previously shown to be associated with blood cell type and age, but were not significantly associated with the phenotype of interest, a necessary condition for a probe to be used as a control in this method.
For each of the methods requiring a prespecified value of K, the latent dimension, we ran the methods multiple times with different values of K. We consistently found that performance did not change with increasing values of K and so when running the methods CellCDec and Deconf, the value of K was fixed at 3 in all scenarios to keep running times down. For the RUV method, we also noticed better performance when K was fixed at 3, despite that the method consistently estimated a much higher value for that parameter. All results shown for RUV were run with K fixed at 3.
None of the other adjustment methods had specific options or tuning parameters that needed to be provided by the user.
Compliance with ethical standards
Ethics committee approval for this study was obtained at McGill University and all subjects provided informed written consent to participate in the study. The Institutional Research Board (IRB) number is [A12 M83 12A]. This study complies with the Helsinki Declaration.
Availability of supporting data
 1.
ARCTIC data are in dbGAP under accession number [phs000779.v1.p1], http://www.ncbi.nlm.nih.gov/projects/gap/cgibin/study.cgi?study_id=phs000779.v1.p1.
 2.
The rheumatoid arthritis data are available under accession number [GSE42861], https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE42861.
 3.
Data for simulation scenarios for celltype mixtures are available at Zenodo [10.5281/zenodo.46746], https://zenodo.org/record/46746%23.VtW8H2SAOko.
Declarations
Acknowledgments
Computations were made on the supercomputer Mammouth parallèle 2 from Université de Sherbrooke, managed by Calcul Québec and Compute Canada. The operation of this supercomputer is funded by the Canada Foundation for Innovation (CFI), NanoQuébec, RMGA, and the Fonds de recherche du Québec, Nature et technologies (FRQNT).
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver(http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
Authors’ Affiliations
References
 Choy MK, Movassagh M, Goh HG, Bennett MR, Down TA, Foo RS. Genomewide conserved consensus transcription factor binding motifs are hypermethylated. BMC Genom. 2010; 11(1):519.View ArticleGoogle Scholar
 Rakyan V, Down T, Balding D, Beck S. Epigenomewide association studies for common human diseases. Nat Rev Genet. 2011; 12(8):529–41.View ArticlePubMedPubMed CentralGoogle Scholar
 Khavari DA, Sen GL, Rinn JL. DNA methylation and epigenetic control of cellular differentiation. Cell Cycle. 2010; 9(19):3880–3.View ArticlePubMedGoogle Scholar
 Meissner A, Mikkelsen TS, Gu H, Wernig M, Hanna J, Sivachenko A, et al. Genomescale DNA methylation maps of pluripotent and differentiated cells. Nature. 2008; 454(7205):766–70.PubMedPubMed CentralGoogle Scholar
 Bird A. DNA methylation patterns and epigenetic memory. Genes Dev. 2002; 16:6–21.View ArticlePubMedGoogle Scholar
 Laird P. Principles and challenges of genomewide DNA methylation analysis. Nat Rev Genet. 2010; 11:191–203.View ArticlePubMedGoogle Scholar
 Farid N. The immunogenetics of autoimmune diseases. Boca Raton, FL: CRC Press; 1991.Google Scholar
 Papp G, Horvath I, Barath S, Gyimesi E, Spika S, Szodoray P, et al. Altered Tcell and regulatory cell repertoire in patients with diffuse cutaneous systems sclerosis. Scand J Rheumatol. 2011; 40:205–10.View ArticlePubMedGoogle Scholar
 Gambichler T, Tigges C, Burkert B, Hoxtermann S, Altmeyer P, Kreuter A. Absolute count of T and B lymphocyte subsets is decreased in systemic sclerosis. Eur J Med Res. 2010; 15:44–6.View ArticlePubMedPubMed CentralGoogle Scholar
 Wagner D, Kaltenhauser S, Pierer M, Wilke B, Arnold S, Hantzschel H. B lymphocytopenia in rheumatoid arthritis is associated with the DRB1 shared epitope and increased acute phase response. Arthritis Res. 2002; 4(4):R1.View ArticlePubMedPubMed CentralGoogle Scholar
 Manda G, Neagu M, Livescu A, Constantin C, Codreanu C, Radulescu A. Imbalance of peripheral B lymphocytes and NK cells in rheumatoid arthritis. J Cell Mol Med. 2003; 7(1):79–88.View ArticlePubMedGoogle Scholar
 Scott D, Wolfe F, Huizinga T. Rheumatoid arthritis. Lancet. 2010; 376(9746):1094–108.View ArticlePubMedGoogle Scholar
 Jaffe A, Irizarry R. Accounting for cellular heterogeneity is critical in epigenomewide association studies. Genome Biol. 2014; 15:R31.View ArticlePubMedPubMed CentralGoogle Scholar
 Reinius L, Acavedo N, Joerink M, Pershagen G, Dahlen SE, Greco D, et al. Differential DNA methylation in purified human blood cells: implications for cell lineage and studies on disease susceptibility. PLoS One. 2012; 7(7):e41361.View ArticlePubMedPubMed CentralGoogle Scholar
 Gu H, Bock C, Mikkelsen T, Jager N, Smoth Z, Tomazou E, et al. Genomescale DNA methylation mapping of clinical samples at singlenucleotide resolution. Nat Methods. 2010; 7:133–6.View ArticlePubMedPubMed CentralGoogle Scholar
 Liang L, Cookson W. Grasping nettles: cellular heterogeneity and other confounders in epigenomewide association studies. Hum Mol Genet. 2014; 21(R1):83–8.View ArticleGoogle Scholar
 Liu Y, Aryee M, Padyukov L, Fallin M, Hesselberg E, Runarsson A, et al. Epigenomewide association data implicate DNA methylation as an intermediary of genetic risk in rheumatoid arthritis. Nat Biotechnol. 2013; 31(2):142–8.View ArticlePubMedPubMed CentralGoogle Scholar
 Michels K, Binder A, Dedeurwaerder S, Epstein C, Greally J, Gut I, et al. Recommendations for the design and analysis of epigenomewide association studies. Nat Methods. 2013; 10(10):940–55.View ArticleGoogle Scholar
 Houseman EA, Accomando WP, Koestler DC, Christensen BC, Marsit CJ, Nelson HH, et al. DNA methylation arrays as surrogate measures of cell mixture distribution. BMC Bioinform. 2012; 13(1):86.View ArticleGoogle Scholar
 Houseman EA, Molitor J, Marsit CJ. Referencefree cell mixture adjustments in analysis of DNA methylation data. Bioinformatics. 2014; 30(10):1431–9.View ArticlePubMedPubMed CentralGoogle Scholar
 Wagner J. Computational approaches for the study of gene expression, genetic and epigenetic variation in human. Montreal, QC: McGill University School of Computer Science; 2015.Google Scholar
 Zou J, Lippert C, Heckerman D, Aryee M, Listgarten J. Epigenomewide association studies without the need for celltype composition. Nat Methods. 2014; 11(3):309–11.View ArticlePubMedGoogle Scholar
 Leek JT, Storey JD. Capturing heterogeneity in gene expression studies by surrogate variable analysis. PLoS Genet. 2007; 3(9):e161.View ArticlePubMed CentralGoogle Scholar
 Teschendorff AE, Zhuang J, Widschwendter M. Independent surrogate variable analysis to deconvolve confounding factors in largescale microarray profiling studies. Bioinformatics. 2011; 27(11):1496–505.View ArticlePubMedGoogle Scholar
 Repsilber D, Kern S, Telaar A, Walzl G, Black GF, Selbig J, et al. Biomarker discovery in heterogeneous tissue samplestaking the insilico deconfounding approach. BMC Bioinform. 2010; 11(1):27.View ArticleGoogle Scholar
 GagnonBartsch JA, Speed TP. Using control genes to correct for unwanted variation in microarray data. Biostatistics. 2012; 13(3):539–52.View ArticlePubMedPubMed CentralGoogle Scholar
 GagnonBartsch JA, Jacob L, Speed TP. Removing unwanted variation from high dimensional data with negative controls. Berkeley: Department of Statistics. University of California; 2013.Google Scholar
 Devlin B, Roeder K. Genomic control for association studies. Biometrics. 1999; 55:997–1004.View ArticlePubMedGoogle Scholar
 Harris RA, NagySzakal D, Kellermayer R. Human metastable epiallele candidates link to common disorders. Epigenetics. 2013; 8(2):157–63.View ArticlePubMedPubMed CentralGoogle Scholar
 Silver MJ, Kessler NJ, Hennig BJ, DominguezSalas P, Laritsky E, Baker MS, et al. Independent genomewide screens identify the tumor suppressor VTRNA21 as a human epiallele responsive to periconceptional environment. Genome Biol. 2015; 16(1):118.View ArticlePubMedPubMed CentralGoogle Scholar
 Plerou V, Gopikrishnan P, Rosenow B, Amaral LAN, Guhr T, Stanley HE. Random matrix approach to cross correlations in financial data. Phys Rev E. 2002; 65(6):066126.View ArticleGoogle Scholar
 Zanke BW, Greenwood CM, Rangrej J, Kustra R, Tenesa A, Farrington SM, et al.Genomewide association scan identifies a colorectal cancer susceptibility locus on chromosome 8q24. Nat Genet. 2007; 39(8):989–94.View ArticlePubMedGoogle Scholar
 Tsaprouni LG, Yang TP, Bell J, Dick KJ, Kanoni S, Nisbet J, et al. Cigarette smoking reduces DNA methylation levels at multiple genomic loci but the effect is partially reversible upon cessation. Epigenetics. 2014; 9(10):1382–96.View ArticlePubMedPubMed CentralGoogle Scholar
 Liu Y, Aryee MJ, Padyukov L, Fallin MD, Hesselberg E, Runarsson A, et al. Epigenomewide association data implicate DNA methylation as an intermediary of genetic risk in rheumatoid arthritis. Nat Biotechnol. 2013; 31(2):142–7.View ArticlePubMedPubMed CentralGoogle Scholar
 Fortin JP, Labbe A, Lemire M, Zanke BW, Hudson TJ, Fertig EJ, et al. Functional normalization of 450k methylation array data improves replication in large cancer studies. Genome Biol. 2014; 15(11):503.View ArticlePubMedPubMed CentralGoogle Scholar
 Le Bellego F, Vaillancourt C, Lafond J, Vol. 550. Human Embryogensis: Methods and Protocols, Book chapter: 4. Methods in Molecular Biology.Springer; 2009, pp. 73–87.Google Scholar
 Kaspi T, Nebel L. Isolation of syncytiotrophoblasts from human term placenta. Obstet Gynecol. 1974; 43:549–57.PubMedGoogle Scholar
 Houseman EA, Kelsy KT, Wiencke KJ, Marsit CJ. Cellcomposition effects in the analysis of DNA methylation array data: a mathematical perspective. BMC Bioinform. 2015; 16:95.View ArticleGoogle Scholar
 Yadav V, De S. An assessment of computational methods for estimating purity and clonality using genomic data derived from heterogeneous tumor tissue samples. Brief Bioinform. 2014; 16(2):232Ű241.Google Scholar
 Jones MJ, Islam SA, Edgar RD, Kobor MS. Adjusting for Cell Type Composition in DNA Methylation Data Using a RegressionBased Approach. Totowa, NJ: Humana Press, pp. 1–8.Google Scholar
 Busche S, Ge B, Vidal R, Spinella JF, Saillour V, Richer C, Healy J, Chen SH, Droit A, Sinnett D, Pastinen T. Integration of HighResolution Methylome and Transcriptome Analyses to Dissect Epigenomic Changes in Childhood Acute Lymphoblastic Leukemia. Cancer Res; 73(14):4323–36.Google Scholar
 McGregor K. Methods for estimating changes in DNA methylation in the presence of cell type heterogeneity. Montreal, QC: McGill University Department of Epidemiology, Biostatistics, and Occupational Health; 2015.Google Scholar
 Lippert C, Listgarten J, Liu Y, Kadie CM, Davidson RI, Heckerman D. FaST linear mixed models for genomewide association studies. Nat Methods. 2011; 8(10):833–5.View ArticlePubMedGoogle Scholar