methylKit: a comprehensive R package for the analysis of genome-wide DNA methylation profiles
© Akalin et al.; licensee BioMed Central Ltd. 2012
Received: 30 April 2012
Accepted: 3 October 2012
Published: 3 October 2012
DNA methylation is a chemical modification of cytosine bases that is pivotal for gene regulation, cellular specification and cancer development. Here, we describe an R package, methylKit, that rapidly analyzes genome-wide cytosine epigenetic profiles from high-throughput methylation and hydroxymethylation sequencing experiments. methylKit includes functions for clustering, sample quality visualization, differential methylation analysis and annotation features, thus automating and simplifying many of the steps for discerning statistically significant bases or regions of DNA methylation. Finally, we demonstrate methylKit on breast cancer data, in which we find statistically significant regions of differential methylation and stratify tumor subtypes. methylKit is available at http://code.google.com/p/methylkit.
DNA methylation is a critical epigenetic modification that guides development, cellular differentiation and the manifestation of some cancers [1, 2]. Specifically, cytosine methylation is a widespread modification in the genome, and it most often occurs in CpG dinucleotides, although non-CpG cytosines are also methylated in certain tissues such as embryonic stem cells. DNA methylation is one of the many epigenetic control mechanisms associated with gene regulation. Specifically, cytosine methylation can directly hinder binding of transcription factors, and methylated bases can also be bound by methyl-binding-domain proteins that recruit chromatin-remodeling factors [4, 5]. In addition, aberrant DNA methylation patterns have been observed in many human malignancies and can also be used to define the severity of leukemia subtypes. In malignant tissues, DNA is either hypo-methylated or hyper-methylated compared to the normal tissue. The location of hyper- and hypo-methylated sites gives distinct signatures within many diseases. Often, hypomethylation is associated with gene activation and hypermethylation is associated with gene repression, although there are many exceptions to this trend. DNA methylation is also involved in genomic imprinting, where the methylation state of a gene is inherited from the parents, but de novo methylation can also occur in the early stages of development [8, 9].
A common technique for measuring DNA methylation is bisulfite sequencing, which has the advantage of providing single-base, quantitative cytosine methylation levels. In this technique, DNA is treated with sodium bisulfite, which deaminates cytosine residues to uracil but leaves 5-methylcytosine residues unaffected. Single-base resolution %methylation levels are then calculated as the ratio C/(C+T) of read counts at each base. There are multiple techniques that leverage high-throughput bisulfite sequencing, such as reduced representation bisulfite sequencing (RRBS) and its variants, whole-genome shotgun bisulfite sequencing (BS-seq), methylC-Seq, and target capture bisulfite sequencing. In addition, 5-hydroxymethylcytosine (5hmC) levels can be measured through a modification of bisulfite sequencing techniques.
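The C/(C+T) ratio can be illustrated with a minimal Python sketch. The function name and the read counts are hypothetical and are not part of methylKit, which is an R package; the sketch only restates the arithmetic described above.

```python
def percent_methylation(num_c: int, num_t: int) -> float:
    """Return %methylation = 100 * C / (C + T) for one cytosine position."""
    coverage = num_c + num_t
    if coverage == 0:
        raise ValueError("no C or T reads cover this position")
    return 100.0 * num_c / coverage


# Hypothetical (C reads, T reads) pairs at three cytosine positions.
counts = [(18, 2), (5, 15), (0, 12)]
levels = [percent_methylation(c, t) for c, t in counts]
print(levels)  # [90.0, 25.0, 0.0]
```

A read showing C at the position counts as methylated (protected from conversion) and a read showing T counts as unmethylated (converted).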
Flexible data integration and regional analysis
Sample text file that can be read by methylKit.
Most bisulfite experiments have a set of test and control samples or samples across multiple conditions, and methylKit can read and store (in memory) methylation data simultaneously for N experiments, limited only by the memory of the node or computer. The default setting of the processing algorithm requires that there be at least 10 reads covering a base and that each of the bases covering the genomic position have a PHRED quality score of at least 20. Also, since DNA methylation can occur in CpG, CHG and CHH contexts (H = A, T, or C), users of methylKit have the option to provide methylation information for all these contexts (CpG, CHG and CHH) from SAM files.
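The default thresholds can be sketched in Python as follows. This is an illustration under stated assumptions, not methylKit's code: the function name and the (base, PHRED) input layout are hypothetical, and the sketch assumes that low-quality base calls are discarded before coverage is counted.

```python
MIN_COVERAGE = 10  # minimum reads covering a base (default stated in the text)
MIN_PHRED = 20     # minimum PHRED score per base call (default stated in the text)


def call_position(base_calls):
    """Summarize one cytosine position.

    base_calls: list of (base, phred) tuples, one per read covering the position.
    Returns (num_c, num_t), or None when too few high-quality calls remain.
    Assumption: quality filtering happens before the coverage check.
    """
    kept = [base for base, phred in base_calls
            if phred >= MIN_PHRED and base in ("C", "T")]
    if len(kept) < MIN_COVERAGE:
        return None
    num_c = kept.count("C")
    return num_c, len(kept) - num_c


# Hypothetical pileups: 12 reads with one low-quality call, then 10 reads with two.
reads_ok = [("C", 30)] * 9 + [("C", 10)] + [("T", 35)] * 2
reads_low = [("C", 30)] * 6 + [("T", 12)] * 2 + [("T", 25)] * 2
print(call_position(reads_ok))   # (9, 2)
print(call_position(reads_low))  # None
```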
Summarizing DNA methylation information over pre-defined regions or tiling windows
Although base-pair resolution DNA methylation information is obtained through most bisulfite sequencing experiments, it might be desirable to summarize methylation information over tiling windows or over a set of predefined regions (promoters, CpG islands, introns, and so on). For example, Smith et al. investigated methylation profiles with RRBS experiments on gametes and zygote and summarized methylation information on 100bp tiles across the genome. Their analysis revealed a unique set of differentially methylated regions maintained in the early embryo. Using tiling windows or predefined regions, such as promoters or CpG islands, is desirable when there is not enough coverage, when bases in close proximity will have similar methylation profiles, or where the methylation properties of a region as a whole determine its function. In accordance with these potential analytic foci, methylKit provides functionality to do analysis either on tiling windows across the genome or on predefined regions of the genome. After reading the base-pair methylation information, users can summarize the methylation information on predefined regions they select or on tiling windows covering the genome (the tile parameters are user-provided). Then, subsequent analyses, such as clustering or differential methylation analysis, can be carried out with the same functions that are used for base-pair resolution analysis.
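Summarizing over tiles amounts to summing the per-base C and T counts that fall into each window. The Python sketch below illustrates this for non-overlapping windows; the function name, the tuple layout and the example counts are hypothetical, and methylKit's own tiling function may differ in its parameters and coordinate conventions.

```python
from collections import defaultdict


def tile_counts(bases, win_size=100):
    """Sum per-base counts into non-overlapping tiles.

    bases: iterable of (chrom, pos, num_c, num_t) with 1-based positions.
    Returns {(chrom, start, end): (num_c, num_t)} with 1-based inclusive tiles.
    """
    tiles = defaultdict(lambda: [0, 0])
    for chrom, pos, num_c, num_t in bases:
        start = (pos - 1) // win_size * win_size + 1
        key = (chrom, start, start + win_size - 1)
        tiles[key][0] += num_c
        tiles[key][1] += num_t
    return {key: tuple(value) for key, value in tiles.items()}


# Hypothetical per-base counts on one chromosome.
bases = [("chr1", 5, 8, 2), ("chr1", 60, 3, 7),
         ("chr1", 101, 10, 0), ("chr1", 250, 4, 6)]
tiles = tile_counts(bases, win_size=100)
for (chrom, start, end), (num_c, num_t) in sorted(tiles.items()):
    print(chrom, start, end, 100.0 * num_c / (num_c + num_t))
```

Because each tile again carries a C count and a T count, the same downstream calculations that work per base can be applied per tile.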
Example methylation data set: breast cancer cell lines
We demonstrated the capabilities of methylKit using an example data set from seven breast cancer cell lines from Sun et al. Four of the cell lines express estrogen receptor-alpha (MCF7, T47D, BT474, ZR75-1), and from here on are referred to as ER+. The other three cell lines (BT20, MDA-MB-231, MDA-MB-468) do not express estrogen receptor-alpha, and from here on are referred to as ER-. It has been previously shown that ER+ and ER- tumor samples have divergent gene expression profiles and that those profiles are associated with disease outcome [24, 25]. Methylation profiles of these cell lines were measured using RRBS. The R objects containing the methylation information for the breast cancer cell lines, and the functions that produce the plots and other results shown in the remainder of this manuscript, are in Additional file 4.
Whole methylome characterization: descriptive statistics, sample correlation and clustering
Descriptive statistics on DNA methylation profiles
Measuring and visualizing similarity between samples
Hierarchical clustering of samples
Principal component analysis of samples
methylKit can be used to perform Principal Component Analysis (PCA) on the samples' %-methylation profiles (see for example ). PCA can reduce the high dimensionality of a data set by transforming the large number of regions to a few principal components. The principal components are ordered so that the first few retain most of the variation present in the original data and are often used to emphasize grouping structure in the data. For example, a plot of the first two or three principal components could potentially reveal a biologically meaningful clustering of the samples. Before the PCA is performed, a new data matrix is formed, containing the samples and only those cytosines that are covered in all samples. After PCA, methylKit then returns to the user a 'prcomp' object, which can be used to extract and plot the principal components. We found that in the breast cancer data set, PCA reveals a clustering similar to the hierarchical clustering, in which MDA-MB-231 is an outlier.
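The two steps described here, restricting to cytosines covered in all samples and then projecting samples onto the leading principal component, can be sketched in plain Python. This is not methylKit's implementation (which returns an R 'prcomp' object): the function names and data are hypothetical, only the first component is computed, and power iteration on the sample-by-sample Gram matrix stands in for a full decomposition.

```python
import math


def shared_matrix(samples):
    """samples: list of {position: %methylation} dicts, one per sample.
    Returns the positions covered in all samples and a samples-by-positions matrix."""
    shared = sorted(set.intersection(*(set(s) for s in samples)))
    return shared, [[s[p] for p in shared] for s in samples]


def first_pc_scores(matrix, iterations=200):
    """Scores of each sample on the first principal component.
    Assumes the samples are not all identical (otherwise the norm below is zero)."""
    n, m = len(matrix), len(matrix[0])
    means = [sum(row[j] for row in matrix) / n for j in range(m)]
    centered = [[row[j] - means[j] for j in range(m)] for row in matrix]
    # Gram matrix of centered samples; its top eigenvector gives the PC1 scores.
    gram = [[sum(a * b for a, b in zip(r1, r2)) for r2 in centered]
            for r1 in centered]
    vec = [float(i + 1) for i in range(n)]  # arbitrary non-constant start vector
    for _ in range(iterations):
        nxt = [sum(g * v for g, v in zip(row, vec)) for row in gram]
        norm = math.sqrt(sum(x * x for x in nxt))
        vec = [x / norm for x in nxt]
    eigenvalue = sum(v * sum(g * w for g, w in zip(row, vec))
                     for v, row in zip(vec, gram))
    return [math.sqrt(eigenvalue) * v for v in vec]


# Hypothetical %methylation values; positions 4 and 5 are not covered everywhere.
samples = [
    {1: 90, 2: 80, 3: 10, 4: 50},
    {1: 85, 2: 75, 3: 15},
    {1: 20, 2: 10, 3: 80, 5: 40},
    {1: 15, 2: 20, 3: 90},
]
positions, matrix = shared_matrix(samples)
scores = first_pc_scores(matrix)
print(positions)  # [1, 2, 3]
print(scores)     # first two samples share one sign, last two the other
```

The sign of a principal component is arbitrary, so only the relative placement of samples along the axis is meaningful.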
Differential methylation calculation
Parallelized methods for detecting significant methylation changes
Differential methylation patterns have been previously described in malignancies [27–29] and can be used to differentiate cancer and normal cells. In addition, normal human tissues harbor unique DNA methylation profiles. Differential DNA methylation is usually calculated by comparing methylation levels between multiple conditions, which can reveal important locations of divergent changes between a test and a control set. We have designed methylKit to implement two main methods for determining differential methylation across all regions: logistic regression and Fisher's exact test. However, the data frames in methylKit can easily be used with other statistical tests, and an example is shown in Additional file 4 (using a moderated t-test, although we maintain that the most natural tests for this kind of data are Fisher's exact and logistic regression based tests). For our example data set we compared ER+ to ER- samples, with our 'control group' being the ER- set.
Method #1: logistic regression
In the logistic regression model, the methylation proportion $\pi_i$ of sample $i$ at a given base or region is related to the group membership of the sample through

$$\log\left(\frac{\pi_i}{1-\pi_i}\right) = \beta_0 + \beta_1 T_i$$

where $T_i$ denotes the treatment indicator for sample $i$, $T_i = 1$ if sample $i$ is in the treatment group and $T_i = 0$ if sample $i$ is in the control group. The parameter $\beta_0$ denotes the log odds of the control group and $\beta_1$ the log odds ratio between the treatment and control group. Therefore, independent tests for all the bases/regions of interest are against the null hypothesis $H_0: \beta_1 = 0$. If the null hypothesis is rejected, it implies that the log odds (and hence the methylation proportions) are different between the treatment and the control group, and the base/region would subsequently be classified as a differentially methylated cytosine (DMC) or region (DMR). However, if the null hypothesis is not rejected, it implies no statistically significant difference in methylation between the two groups. One important consideration in logistic regression is the sample size, and in many biological experiments the number of biological samples in each group can be quite small. However, it is important to keep in mind that the relevant sample sizes in logistic regression are not merely the number of biological samples but rather the total read coverages summed over all samples in each group separately. For our example dataset, we used bases with at least 10 reads coverage for each biological sample, and we advise (at least) the same for other users to improve power to detect DMCs/DMRs.
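The logistic regression framework can also accommodate additional sample-specific covariates:

$$\log\left(\frac{\pi_i}{1-\pi_i}\right) = \beta_0 + \beta_1 T_i + \alpha_1 \mathrm{Covariate}_{1,i} + \cdots + \alpha_K \mathrm{Covariate}_{K,i}$$

where $\pi_i$ is the methylation proportion of sample $i$, $\mathrm{Covariate}_{1,i}, \ldots, \mathrm{Covariate}_{K,i}$ denote $K$ measured covariates (continuous or categorical) for sample $i = 1, \ldots, n$, and $\alpha_1, \ldots, \alpha_K$ denote the corresponding parameters.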
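For the simple model with only the treatment indicator, the maximum-likelihood estimates have a closed form in terms of the pooled read counts of each group, which makes the role of total coverage explicit. The Python sketch below tests H0: β1 = 0 with a likelihood-ratio test against a chi-square distribution with one degree of freedom. The choice of test, the function names and the counts are assumptions made for illustration and are not taken from methylKit's source; the sketch also assumes pooled proportions strictly between 0 and 1.

```python
import math


def logit(p):
    return math.log(p / (1.0 - p))


def binom_loglik(num_c, coverage, p):
    """Binomial log-likelihood without the combinatorial constant."""
    loglik = 0.0
    if num_c > 0:
        loglik += num_c * math.log(p)
    if coverage - num_c > 0:
        loglik += (coverage - num_c) * math.log(1.0 - p)
    return loglik


def logistic_lrt(control, treatment):
    """control, treatment: lists of (num_c, coverage), one tuple per sample.
    Returns (beta0, beta1, p_value) for log(pi/(1-pi)) = beta0 + beta1*T."""
    c_meth, c_cov = sum(c for c, _ in control), sum(n for _, n in control)
    t_meth, t_cov = sum(c for c, _ in treatment), sum(n for _, n in treatment)
    p_c, p_t = c_meth / c_cov, t_meth / t_cov
    p_all = (c_meth + t_meth) / (c_cov + t_cov)
    beta0 = logit(p_c)               # log odds of the control group
    beta1 = logit(p_t) - logit(p_c)  # log odds ratio, treatment vs control
    ll_alt = binom_loglik(c_meth, c_cov, p_c) + binom_loglik(t_meth, t_cov, p_t)
    ll_null = binom_loglik(c_meth + t_meth, c_cov + t_cov, p_all)
    stat = max(2.0 * (ll_alt - ll_null), 0.0)
    # Survival function of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return beta0, beta1, p_value


# Hypothetical counts for one cytosine: two samples per group.
control = [(2, 20), (3, 25)]
treatment = [(15, 20), (20, 30)]
beta0, beta1, p_value = logistic_lrt(control, treatment)
print(beta0, beta1, p_value)
```

With only two biological samples per group, the test still draws on 45 and 50 reads respectively, which is the sense in which coverage rather than sample count drives power in this model.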
Method #2: Fisher's exact test
The Fisher's exact test compares the fraction of methylated Cs in test and control samples in the absence of replicates. The main advantage of logistic regression over Fisher's exact test is that it allows for the inclusion of sample-specific covariates (continuous or categorical) and the ability to adjust for confounding variables. In practice, the number of samples per group will determine which of the two methods will be used (logistic regression or Fisher's exact test). If there are multiple samples per group, methylKit will employ the logistic regression test. Otherwise, when there is one sample per group, Fisher's exact test will be used.
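For one test and one control sample, the counts at a base form a 2x2 table of methylated (C) and unmethylated (T) reads. A minimal two-sided Fisher's exact test can be sketched in Python as follows; the function name and the counts are hypothetical, and the two-sided rule used here (summing all tables no more probable than the observed one) is one common convention.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the table [[a, b], [c, d]].

    Rows: test sample, control sample. Columns: methylated C reads, T reads.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(x):
        # Hypergeometric probability of x methylated reads in the test sample.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    low, high = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # Sum tables at most as probable as the observed one (small float tolerance).
    p_value = sum(prob(x) for x in range(low, high + 1)
                  if prob(x) <= p_obs * (1 + 1e-7))
    return min(1.0, p_value)


# Hypothetical base: test sample 8 C / 2 T reads, control sample 1 C / 5 T reads.
p_value = fisher_exact_two_sided(8, 2, 1, 5)
print(round(p_value, 5))  # 0.03497
```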
Following the differential methylation test and calculation of P-values, methylKit will use the sliding linear model (SLIM) method to correct P-values to q-values, which corrects for the problem of multiple hypothesis testing [32, 33]. However, we also implemented the standard false discovery rate (FDR)-based method (Benjamini-Hochberg) as an option for P-value correction, which is faster but more conservative. Finally, methylKit can use multi-threading so that differential methylation calculations can be parallelized over multiple cores and be completed faster.
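The Benjamini-Hochberg option mentioned above can be sketched in a few lines of Python. This illustrates the standard step-up adjustment only; it is not the SLIM method, and the function name and P-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotone adjusted values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical P-values from four tested bases.
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.20])
print(adjusted)  # approximately [0.04, 0.0533, 0.0533, 0.2]
```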
Extraction and visualization of differential methylation events
We have designed methylKit to allow a user to specify the parameters that define the DMCs/DMRs based on q-value, %methylation difference, and type of differential methylation (hypo-/hyper-). By default, it will extract bases/regions with a q-value <0.01 and %methylation difference >25%. These defaults can easily be changed when calling the get.methylDiff() function. In addition, users can specify if they want hyper-methylated bases/regions (bases/regions with higher methylation compared to control samples) or hypo-methylated bases/regions (bases/regions with lower methylation compared to control samples). In the literature, hyper- or hypo-methylated DMCs/DMRs are usually defined relative to a control group. In our examples, and in methylKit in general, a control group is defined when creating the objects through the supplied treatment vector, and hyper-/hypomethylation definitions are based on that control group.
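The selection logic can be illustrated with a short Python sketch. It mirrors the defaults stated above (q-value <0.01, %methylation difference >25%) but is not the get.methylDiff() implementation; the function name, the record layout and the values are hypothetical, and the difference is assumed to be treatment minus control in percentage points.

```python
def select_differential(records, qvalue=0.01, difference=25.0, kind="all"):
    """Keep records passing the q-value and %methylation-difference cutoffs.

    records: list of dicts with keys 'qvalue' and 'meth_diff'
             (assumed: treatment minus control, in percentage points).
    kind: 'all', 'hyper' (higher than control) or 'hypo' (lower than control).
    """
    if kind not in ("all", "hyper", "hypo"):
        raise ValueError("kind must be 'all', 'hyper' or 'hypo'")
    selected = []
    for rec in records:
        diff = rec["meth_diff"]
        passes = {"all": abs(diff) > difference,
                  "hyper": diff > difference,
                  "hypo": diff < -difference}[kind]
        if rec["qvalue"] < qvalue and passes:
            selected.append(rec)
    return selected


# Hypothetical results for four bases.
records = [
    {"id": "b1", "qvalue": 0.001, "meth_diff": 40.0},
    {"id": "b2", "qvalue": 0.005, "meth_diff": -30.0},
    {"id": "b3", "qvalue": 0.200, "meth_diff": 60.0},
    {"id": "b4", "qvalue": 0.001, "meth_diff": 10.0},
]
print([r["id"] for r in select_differential(records)])                # ['b1', 'b2']
print([r["id"] for r in select_differential(records, kind="hyper")])  # ['b1']
print([r["id"] for r in select_differential(records, kind="hypo")])   # ['b2']
```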
Annotating differential methylation events
Annotation with gene models and CpG islands
Annotation with custom regions
As with most genome-wide assays, the regions of interest for DNA methylation analysis may be quite numerous. For example, several reports show that Alu elements are aberrantly methylated in cancers [35, 36] and enhancers are also differentially methylated [37, 38]. Since users may need to focus on specific genomic regions and require customized annotation for capturing differential DNA methylation events, methylKit can annotate differential methylation events using user-supplied regions. As an example, we identified differentially methylated bases of ER+ and ER- cells that overlap with ENCODE enhancer regions, and we found a large proportion of differentially methylated CpGs overlapping with the enhancer marks, and then plotted them with methylKit (Figure 6d).
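Annotation with user-supplied regions reduces to an interval-overlap query. The Python sketch below flags which bases fall inside a set of regions; the function name, the coordinates and the region set are hypothetical, and the sketch assumes 1-based inclusive regions that do not overlap one another within a chromosome.

```python
from bisect import bisect_right


def overlaps_regions(positions, regions):
    """Flag each (chrom, pos) that lies inside any (chrom, start, end) region.

    Assumption: regions on the same chromosome do not overlap each other.
    """
    by_chrom = {}
    for chrom, start, end in sorted(regions):
        by_chrom.setdefault(chrom, []).append((start, end))
    flags = []
    for chrom, pos in positions:
        intervals = by_chrom.get(chrom, [])
        # Index of the last interval whose start is <= pos.
        idx = bisect_right(intervals, (pos, float("inf"))) - 1
        flags.append(idx >= 0 and intervals[idx][1] >= pos)
    return flags


# Hypothetical custom regions and differentially methylated bases.
regions = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 50, 80)]
positions = [("chr1", 150), ("chr1", 300), ("chr1", 600), ("chr2", 10), ("chr3", 5)]
flags = overlaps_regions(positions, regions)
print(flags)                    # [True, False, True, False, False]
print(sum(flags) / len(flags))  # 0.4
```

The fraction in the last line corresponds to the kind of summary reported in the text, namely the proportion of differentially methylated bases that overlap the supplied regions.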
Analyzing 5-hydroxymethylcytosine data with methylKit
5-Hydroxymethylcytosine is a base modification associated with pluripotency, hematopoiesis and certain brain tissues (reviewed in ). It is possible to measure base-pair resolution 5hmC levels using variations of traditional bisulfite sequencing. Recently, Yu et al. and Booth et al. published similar methods for detecting 5hmC levels in base-pair resolution. Both methods require measuring 5hmC and 5mC levels simultaneously and use 5hmC levels as a substrate to deduce real 5mC levels, since traditional bisulfite sequencing cannot distinguish between the two. However, both the 5hmC and 5mC data generated by these protocols are bisulfite sequencing based, and the alignments and text files of 5hmC levels can be used directly in methylKit. Furthermore, methylKit has an adjust.methylC() function to adjust 5mC levels based on 5hmC levels as described in Booth et al.
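Since traditional bisulfite sequencing reports 5mC and 5hmC together, one simple way to deduce 5mC is to subtract the 5hmC level measured at the same base. The Python sketch below shows that subtraction with a floor at zero. It is an assumption made for illustration and is not necessarily what adjust.methylC() computes; the function name and the counts are hypothetical.

```python
def adjust_5mc(bs_c, bs_cov, hmc_c, hmc_cov):
    """Estimate the 5mC percentage at one base.

    bs_c, bs_cov:   C count and coverage from conventional bisulfite
                    sequencing, which reports 5mC + 5hmC together.
    hmc_c, hmc_cov: C count and coverage from a 5hmC-specific experiment.
    Assumption: 5mC level = bisulfite level - 5hmC level, floored at zero.
    """
    bs_level = bs_c / bs_cov
    hmc_level = hmc_c / hmc_cov
    return 100.0 * max(bs_level - hmc_level, 0.0)


# Hypothetical counts: 80% apparent methylation, of which 10% is 5hmC.
print(adjust_5mc(16, 20, 3, 30))   # close to 70
# Sampling noise can make the difference negative; it is clamped to zero.
print(adjust_5mc(2, 20, 10, 20))   # 0.0
```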
Customizing analysis with convenience functions
methylKit is dependent on Bioconductor packages such as GenomicRanges, and its objects are coercible to GenomicRanges objects and regular R data structures such as data frames via provided convenience functions. That means users can integrate methylKit objects with other Bioconductor and R packages and customize the analysis according to their needs or extend the analysis further by using other packages available in R.
Methods for detecting methylation across the genome are widely used in research laboratories, and they are also a substantial component of the National Institutes of Health's (NIH's) EpiGenome roadmap and upcoming projects such as BLUEPRINT. Thus, tools and techniques that enable researchers to process and utilize genome-wide methylation data in an easy and fast manner will be of critical utility.
Here, we show a large set of tools and cross-sample analysis algorithms built into methylKit, our open-source, multi-threaded R package that can be used for any base-level dataset of DNA methylation or base modifications, including 5hmC. We demonstrate its utility with breast cancer RRBS samples, provide test data sets, and also provide extensive documentation with the release.
DMC: differentially methylated cytosine
DMR: differentially methylated region
ER: estrogen receptor alpha
FDR: false discovery rate
PCA: principal component analysis
PCR: polymerase chain reaction
RRBS: reduced representation bisulfite sequencing
TSS: transcription start site.
We wish to acknowledge the invaluable contribution of the WCMC Epigenomics Core Facility. MEF is supported by the Leukemia & Lymphoma Society Special Fellow Award and a Doris Duke Clinical Scientist Development Award. FGB is supported by a Sass Foundation Judah Folkman Fellowship. AM is supported by an LLS SCOR grant (7132-08) and a Burroughs Wellcome Clinical Translational Scientist Award. AM and CEM are supported by a Starr Cancer Consortium grant (I4-A442). CEM is supported by the National Institutes of Health (I4-A411, I4-A442, and 1R01NS076465-01).
- Deaton AM, Bird A: CpG islands and the regulation of transcription. Genes Dev. 2011, 25: 1010-1022. 10.1101/gad.2037511.
- Suzuki MM, Bird A: DNA methylation landscapes: provocative insights from epigenomics. Nat Rev Genet. 2008, 9: 465-476.
- Lister R, Pelizzola M, Dowen RH, Hawkins RD, Hon G, Tonti-Filippini J, Nery JR, Lee L, Ye Z, Ngo Q-M, Edsall L, Antosiewicz-Bourget J, Stewart R, Ruotti V, Millar AH, Thomson JA, Ren B, Ecker JR: Human DNA methylomes at base resolution show widespread epigenomic differences. Nature. 2009, 462: 315-322. 10.1038/nature08514.
- Bird AP, Wolffe AP: Methylation-induced repression--belts, braces, and chromatin. Cell. 1999, 99: 451-454. 10.1016/S0092-8674(00)81532-9.
- Hendrich B, Bird A: Identification and characterization of a family of mammalian methyl-CpG binding proteins. Mol Cell Biol. 1998, 18: 6538-6547.
- Figueroa ME, Abdel-Wahab O, Lu C, Ward PS, Patel J, Shih A, Li Y, Bhagwat N, Vasanthakumar A, Fernandez HF, Tallman MS, Sun Z, Wolniak K, Peeters JK, Liu W, Choe SE, Fantin VR, Paietta E, Löwenberg B, Licht JD, Godley LA, Delwel R, Valk PJM, Thompson CB, Levine RL, Melnick A: Leukemic IDH1 and IDH2 mutations result in a hypermethylation phenotype, disrupt TET2 function, and impair hematopoietic differentiation. Cancer Cell. 2010, 18: 553-567. 10.1016/j.ccr.2010.11.015.
- Fernandez AF, Assenov Y, Martin-Subero JI, Balint B, Siebert R, Taniguchi H, Yamamoto H, Hidalgo M, Tan A-C, Galm O, Ferrer I, Sanchez-Cespedes M, Villanueva A, Carmona J, Sanchez-Mut JV, Berdasco M, Moreno V, Capella G, Monk D, Ballestar E, Ropero S, Martinez R, Sanchez-Carbayo M, Prosper F, Agirre X, Fraga MF, Graña O, Perez-Jurado L, Mora J, Puig S, et al: A DNA methylation fingerprint of 1628 human samples. Genome Res. 2012, 22: 407-419. 10.1101/gr.119867.110.
- Li E, Beard C: Role for DNA methylation in genomic imprinting. Nature. 1993, 366: 362-365. 10.1038/366362a0.
- Smith ZD, Chan MM, Mikkelsen TS, Gu H, Gnirke A, Regev A, Meissner A: A unique regulatory phase of DNA methylation in the early mammalian embryo. Nature. 2012, 484: 339-344. 10.1038/nature10960.
- Meissner A, Mikkelsen TS, Gu H, Wernig M, Hanna J, Sivachenko A, Zhang X, Bernstein BE, Nusbaum C, Jaffe DB, Gnirke A, Jaenisch R, Lander ES: Genome-scale DNA methylation maps of pluripotent and differentiated cells. Nature. 2008, 454: 766-770.
- Akalin A, Garrett-Bakelman FE, Kormaksson M, Busuttil J, Zhang L, Khrebtukova I, Milne TA, Huang Y, Biswas D, Hess JL, Allis D, Roeder RG, Valk PJM, Lo B, Paietta E, Tallman MS, Schroth GP, Mason CE, Melnick A, Figueroa ME: Base-pair resolution DNA methylation sequencing reveals profoundly divergent epigenetic landscapes in acute myeloid leukemia. PLoS Genet. 2012, 8: e1002781. 10.1371/journal.pgen.1002781.
- Cokus SJ, Feng S, Zhang X, Chen Z, Merriman B, Haudenschild CD, Pradhan S, Nelson SF, Pellegrini M, Jacobsen SE: Shotgun bisulphite sequencing of the Arabidopsis genome reveals DNA methylation patterning. Nature. 2008, 452: 215-219. 10.1038/nature06745.
- Lister R, O'Malley RC, Tonti-Filippini J, Gregory BD, Berry CC, Millar AH, Ecker JR: Highly integrated single-base resolution maps of the epigenome in Arabidopsis. Cell. 2008, 133: 523-536. 10.1016/j.cell.2008.03.029.
- Ball MP, Li JB, Gao Y, Lee J-H, LeProust EM, Park I-H, Xie B, Daley GQ, Church GM: Targeted and genome-scale strategies reveal gene-body methylation signatures in human cells. Nat Biotechnol. 2009, 27: 361-368. 10.1038/nbt.1533.
- Booth MJ, Branco MR, Ficz G, Oxley D, Krueger F, Reik W, Balasubramanian S: Quantitative sequencing of 5-methylcytosine and 5-hydroxymethylcytosine at single-base resolution. Science. 2012, 336: 934-937. 10.1126/science.1220671.
- methylKit. [http://code.google.com/p/methylkit]
- Flusberg BA, Webster DR, Lee JH, Travers KJ, Olivares EC, Clark TA, Korlach J, Turner SW: Direct detection of DNA methylation during single-molecule, real-time sequencing. Nat Methods. 2010, 7: 461-465. 10.1038/nmeth.1459.
- Cherf GM, Lieberman KR, Rashid H, Lam CE, Karplus K, Akeson M: Automated forward and reverse ratcheting of DNA in a nanopore at 5-Å precision. Nat Biotechnol. 2012, 30: 344-348. 10.1038/nbt.2147.
- Frith MC, Mori R, Asai K: A mostly traditional approach improves alignment of bisulfite-converted DNA. Nucleic Acids Res. 2012, 40: e100. 10.1093/nar/gks275.
- Krueger F, Kreck B, Franke A, Andrews SR: DNA methylome analysis using short bisulfite sequencing data. Nat Methods. 2012, 9: 145-151. 10.1038/nmeth.1828.
- Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R: The Sequence Alignment/Map format and SAMtools. Bioinformatics. 2009, 25: 2078-2079. 10.1093/bioinformatics/btp352.
- Krueger F, Andrews SR: Bismark: a flexible aligner and methylation caller for Bisulfite-Seq applications. Bioinformatics. 2011, 27: 1571-1572. 10.1093/bioinformatics/btr167.
- Sun Z, Asmann YW, Kalari KR, Bot B, Eckel-Passow JE, Baker TR, Carr JM, Khrebtukova I, Luo S, Zhang L, Schroth GP, Perez EA, Thompson EA: Integrated analysis of gene expression, CpG island methylation, and gene copy number in breast cancer cells by deep sequencing. PloS One. 2011, 6: e17490. 10.1371/journal.pone.0017490.
- van 't Veer LJ, Dai H, van de Vijver MJ, He YD, Hart AAM, Mao M, Peterse HL, van der Kooy K, Marton MJ, Witteveen AT, Schreiber GJ, Kerkhoven RM, Roberts C, Linsley PS, Bernards R, Friend SH: Gene expression profiling predicts clinical outcome of breast cancer. Nature. 2002, 415: 530-536. 10.1038/415530a.
- Sotiriou C, Neo S-Y, McShane LM, Korn EL, Long PM, Jazaeri A, Martiat P, Fox SB, Harris AL, Liu ET: Breast cancer classification and prognosis based on gene expression profiles from a population-based study. Proc Natl Acad Sci USA. 2003, 100: 10393-10398. 10.1073/pnas.1732912100.
- Jolliffe I: Principal Component Analysis. 2002, New York, USA, Springer, 2nd edition.
- Esteller M, Corn PG, Baylin SB, Herman JG: A gene hypermethylation profile of human cancer. Cancer Res. 2001, 61: 3225-3229.
- Baylin SB, Herman JG: DNA hypermethylation in tumorigenesis: epigenetics joins genetics. Trends Genet. 2000, 16: 168-174. 10.1016/S0168-9525(99)01971-X.
- Costello JF, Frühwald MC, Smiraglia DJ, Rush LJ, Robertson GP, Gao X, Wright FA, Feramisco JD, Peltomäki P, Lang JC, Schuller DE, Yu L, Bloomfield CD, Caligiuri MA, Yates A, Nishikawa R, Su Huang H, Petrelli NJ, Zhang X, O'Dorisio MS, Held WA, Cavenee WK, Plass C: Aberrant CpG-island methylation has non-random and tumour-type-specific patterns. Nat Genet. 2000, 24: 132-138. 10.1038/72785.
- Doi A, Park I-H, Wen B, Murakami P, Aryee MJ, Irizarry R, Herb B, Ladd-Acosta C, Rho J, Loewer S, Miller J, Schlaeger T, Daley GQ, Feinberg AP: Differential methylation of tissue- and cancer-specific CpG island shores distinguishes human induced pluripotent stem cells, embryonic stem cells and fibroblasts. Nat Genet. 2009, 41: 1350-1353. 10.1038/ng.471.
- Wang H-Q, Tuominen LK, Tsai C-J: SLIM: a sliding linear model for estimating the proportion of true null hypotheses in datasets with dependence structures. Bioinformatics. 2011, 27: 225-231. 10.1093/bioinformatics/btq650.
- Storey J: A direct approach to false discovery rates. J R Stat Soc Series B Stat Methodol. 2002, 64: 479-498. 10.1111/1467-9868.00346.
- Storey JD, Tibshirani R: Statistical significance for genomewide studies. Proc Natl Acad Sci USA. 2003, 100: 9440-9445. 10.1073/pnas.1530509100.
- Hansen KD, Timp W, Bravo HC, Sabunciyan S, Langmead B, McDonald OG, Wen B, Wu H, Liu Y, Diep D, Briem E, Zhang K, Irizarry RA, Feinberg AP: Increased methylation variation in epigenetic domains across cancer types. Nat Genet. 2011, 43: 768-775. 10.1038/ng.865.
- Ehrlich M: DNA hypomethylation in cancer cells. Epigenomics. 2009, 1: 239-259. 10.2217/epi.09.33.
- Rodriguez J, Vives L, Jordà M, Morales C, Muñoz M, Vendrell E, Peinado MA: Genome-wide tracking of unmethylated DNA Alu repeats in normal and cancer cells. Nucleic Acids Res. 2008, 36: 770-784.
- Stadler MB, Murr R, Burger L, Ivanek R, Lienert F, Schöler A, Wirbelauer C, Oakeley EJ, Gaidatzis D, Tiwari VK, Schübeler D: DNA-binding factors shape the mouse methylome at distal regulatory regions. Nature. 2011, 480: 490-495.
- Wiench M, John S, Baek S, Johnson TA, Sung M-H, Escobar T, Simmons CA, Pearce KH, Biddie SC, Sabo PJ, Thurman RE, Stamatoyannopoulos JA, Hager GL: DNA methylation status predicts cell type-specific enhancer activity. EMBO J. 2011, 30: 3028-3039. 10.1038/emboj.2011.210.
- Ernst J, Kheradpour P, Mikkelsen TS, Shoresh N, Ward LD, Epstein CB, Zhang X, Wang L, Issner R, Coyne M, Ku M, Durham T, Kellis M, Bernstein BE: Mapping and analysis of chromatin state dynamics in nine human cell types. Nature. 2011, 473: 43-49. 10.1038/nature09906.
- Branco MR, Ficz G, Reik W: Uncovering the role of 5-hydroxymethylcytosine in the epigenome. Nat Rev Genet. 2011, 13: 7-13.
- Yu M, Hon GC, Szulwach KE, Song C-X, Zhang L, Kim A, Li X, Dai Q, Shen Y, Park B, Min J-H, Jin P, Ren B, He C: Base-resolution analysis of 5-hydroxymethylcytosine in the mammalian genome. Cell. 2012, 149: 1368-1380. 10.1016/j.cell.2012.04.027.
- Huang Y, Pastor WA, Shen Y, Tahiliani M, Liu DR, Rao A: The behaviour of 5-hydroxymethylcytosine in bisulfite sequencing. PloS One. 2010, 5: e8888. 10.1371/journal.pone.0008888.
- Gentleman RC, Carey VJ, Bates DM, Bolstad B, Dettling M, Dudoit S, Ellis B, Gautier L, Ge Y, Gentry J, Hornik K, Hothorn T, Huber W, Iacus S, Irizarry R, Leisch F, Li C, Maechler M, Rossini AJ, Sawitzki G, Smith C, Smyth G, Tierney L, Yang JYH, Zhang J: Bioconductor: open software development for computational biology and bioinformatics. Genome Biol. 2004, 5: r80. 10.1186/gb-2004-5-10-r80.
- Adams D, Altucci L, Antonarakis SE, Ballesteros J, Beck S, Bird A, Bock C, Boehm B, Campo E, Caricasole A, Dahl F, Dermitzakis ET, Enver T, Esteller M, Estivill X, Ferguson-Smith A, Fitzgibbon J, Flicek P, Giehl C, Graf T, Grosveld F, Guigo R, Gut I, Helin K, Jarvius J, Küppers R, Lehrach H, Lengauer T, Lernmark Å, Leslie D, et al: BLUEPRINT to decode the epigenetic signature written in blood. Nat Biotechnol. 2012, 30: 224-226. 10.1038/nbt.2153.
This article is published under license to BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.