Fig. 1 | Genome Biology

Fig. 1

From: Comparison of whole-genome bisulfite sequencing library preparation strategies identifies sources of biases affecting DNA methylation data

Biased degradation of unmethylated C-rich DNA after bisulfite treatment. a Post-bisulfite recovery of C-rich and C-poor DNA fragments treated with different BS conversion protocols. Fragment sequences originate from the M13 phage sequence (Additional file 2: Table S1). Statistical analysis was performed with a two-way ANOVA, p = 0.0034 for cytosine content (‘Heat’ and ‘Alkaline’ only) and p < 0.0001 for the method. b Asymmetric C-rich (23–24% C) and C-poor (12–14% C) strand representation in the mouse major satellite repeat and mtDNA in amplification-free PBAT datasets. The total read count per strand is represented as a proportion of 100%. c Telomere repeat count per read in amplification-free PBAT and non-BS-converted NGS control. To assess the fragmentation rate of the C-strand with increasing tandem count and cytosine content, reads containing G-strand tandems ([TTAGGG]n) were quantified separately from unconverted and BS-converted (in PBAT) C-strand tandems ([CCCTAA]n and [TTTTAA]n, respectively). Each plot represents a single dataset. C-strand reads containing fewer than four to six tandems are genuine [TTTTAA]n repeats of non-telomere origin (results not shown; see “Methods”). d Post-bisulfite recovery of unmethylated, fully methylated and hydroxylated C-rich and C-poor DNA fragments treated with different BS conversion protocols. One-way ANOVA was performed with Tukey’s multiple comparisons test. e LC-MS measurement of total genomic 5mC levels in mouse ESC gDNA before (WT mESC) and after treatment with different BS conversion protocols (Alkaline, Heat, Am-BS). 5mC is represented as a percentage of all cytosines; the converted cytosines were measured as uracils in the BS-converted samples. Individual two-tailed paired t-tests were performed within matched sample–control pairs. Details on the number of WGBS datasets used for each analysis are presented in Additional file 1. *p < 0.05, **p < 0.01, ***p < 0.001, ****p < 0.0001; error bars represent standard deviation
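
The per-read tandem counting described for panel c can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the use of the longest uninterrupted run as the "repeat count per read", and the omission of reverse-complement handling are all assumptions made for illustration. Only the three repeat units are taken from the caption.

```python
import re

# Repeat units named in the caption (panel c); the dictionary keys are
# hypothetical labels chosen for this sketch.
UNITS = {
    "G_strand": "TTAGGG",              # G-strand tandems
    "C_strand_unconverted": "CCCTAA",  # C-strand, no bisulfite conversion
    "C_strand_converted": "TTTTAA",    # C-strand after full C-to-T conversion
}


def max_tandem_count(read: str, unit: str) -> int:
    """Length, in repeat units, of the longest uninterrupted run of `unit`.

    Assumption: a read's repeat count is its longest perfect tandem run.
    """
    runs = re.findall(f"(?:{unit})+", read.upper())
    return max((len(run) // len(unit) for run in runs), default=0)


def classify_read(read: str) -> dict:
    """Longest tandem run for each repeat unit in a single read."""
    return {name: max_tandem_count(read, unit) for name, unit in UNITS.items()}
```

Tallying `classify_read` results over all reads of a dataset would give a distribution of tandem counts per strand type, which is the kind of quantity panel c plots. A real analysis would also need to decide how to treat reads reported in the opposite orientation and imperfect repeats.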
