Genome-wide mapping of 5-hydroxymethyluracil in the eukaryote parasite Leishmania

Background: 5-Hydroxymethyluracil (5hmU) is a thymine base modification found in the genomes of a diverse range of organisms. To explore the functional importance of 5hmU, we develop a method for the genome-wide mapping of 5hmU-modified loci based on a chemical tagging strategy for the hydroxymethyl group.

Results: We apply the method to generate genome-wide maps of 5hmU in the parasitic protozoan Leishmania sp. In this genus, another thymine modification, 5-(β-glucopyranosyl) hydroxymethyluracil (base J), plays a key role during transcription. To elucidate the relationship between 5hmU and base J, we also map base J loci by introducing a chemical tagging strategy for the glucopyranoside residue. Observed 5hmU peaks are highly consistent among technical replicates, confirming the robustness of the method. 5hmU is enriched in strand switch regions, telomeric regions, and intergenic regions. Over 90% of 5hmU-enriched loci overlapped with base J-enriched loci, which occurs mostly within strand switch regions. We also identify loci comprising 5hmU but not base J, which are enriched with motifs consisting of a stretch of thymine bases.

Conclusions: By chemically detecting 5hmU we present a method to provide a genome-wide map of this modification, which will help address the emerging interest in the role of 5hmU. This method will also be applicable to other organisms bearing 5hmU.

Electronic supplementary material: The online version of this article (doi:10.1186/s13059-017-1150-1) contains supplementary material, which is available to authorized users.


Fig. S5 | Assessment of experimental biases.
To assess potential artifacts and biases in the chemical 5hmU enrichment process, we carried out a model study using chemically synthesized randomized ODNs. We considered that artifacts could arise from oxidative damage to thymine (to either 5hmU or 5fU) in any process before enrichment (genomic DNA extraction, DNA fragmentation, adapter ligation, oxidation, and biotinylation), and that sequence bias could arise from PCR or from background binding of DNA to the streptavidin beads. The model ODNs contain neither 5hmU nor 5fU and are therefore suitable for studying potential oxidative damage during DNA treatment and background artifact signals. (a) The chemically synthesized model ODN sequence, where N denotes a randomly incorporated A, T, G or C. The average GC content of the randomized region was adjusted to 64% during template DNA synthesis (equal to the GC content of the L. major genome). (b) The experimental workflow. The double-stranded input DNA was subjected to either standard library preparation or "mock" DNA extraction, in which the input DNA was purified with a genomic DNA extraction kit (DNeasy Blood & Tissue Kit, Qiagen) following the protocol provided for cultured cell samples. The extracted DNA was then sonicated using the same protocol used to fragment L. major DNA ("mock" fragmentation). Although the input DNA was naked and too short to be fragmented further by sonication, the two "mock" processes were necessary to assess oxidative damage arising during the actual workflow. The DNA was then adapter ligated, oxidized, and biotinylated, followed by enrichment. To exclude the possibility that enrichment was unsuccessful, two spike-in controls were added to the library after adapter ligation and used to estimate the 5hmU enrichment. After sequencing, duplicated reads arising from PCR bias were removed, and sequence profiles in the randomized region were analyzed with FastQC.
(c) ATGC contents of the randomized regions in input (left) and 5hmU pull down samples (middle and right). We did not observe any increase in AT content in the 5hmU pull down samples, which would be expected had oxidative damage occurred during DNA treatment. We also did not observe any specific sequence motif (in a five-base context) enriched in the 5hmU pull down samples, suggesting that our 5hmU enrichment procedure does not introduce any noticeable sequence bias.
a) The motif p-value times the number of candidate motifs tested.
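The comparison in panel (c) can be illustrated with a short Python sketch. It is not the pipeline used in the study (which relied on FastQC); it assumes exact-sequence duplicate removal as a simplification, and the reads and function names are hypothetical.

```python
from collections import Counter


def deduplicate(reads):
    """Collapse identical reads, keeping first-seen order (simplified duplicate removal)."""
    return list(dict.fromkeys(reads))


def base_composition(reads):
    """Return the fraction of A, T, G and C pooled over all reads."""
    counts = Counter()
    for read in reads:
        counts.update(base for base in read.upper() if base in "ATGC")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no A/T/G/C bases found")
    return {base: counts[base] / total for base in "ATGC"}


def at_content(reads):
    """Return the pooled A+T fraction of the reads."""
    composition = base_composition(reads)
    return composition["A"] + composition["T"]


# Hypothetical toy reads standing in for the randomized regions.
input_reads = ["GCGCAT", "GCGCAT", "GGCCTA", "CCGGAT"]
pulldown_reads = ["GCGCAT", "GGCCTA", "CGCGTA"]

unique_input = deduplicate(input_reads)
# A positive shift would indicate AT enrichment in the pull down sample.
at_shift = at_content(pulldown_reads) - at_content(unique_input)
print(f"AT content shift (pull down minus input): {at_shift:+.3f}")
```

With real data, the randomized region would first have to be extracted from each read before the composition is computed.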

General reagents and equipment
Ultrapure water produced by a Synergy® UV Remote Water Purification System (Merck Millipore) or PCR grade water (Roche Life Science) was used to prepare all aqueous solutions unless otherwise specified. DNA LoBind Tubes (Eppendorf) were used throughout the procedure. Incubation and PCR were carried out using a T100 Thermal Cycler (BioRad). qPCR experiments were carried out with a CFX96 Real-Time System (BioRad). Sonication of DNA was carried out with an M220 Focused-ultrasonicator™ (Covaris). All sequencing experiments were carried out on a MiSeq using MiSeq Reagent Kit v3 (Illumina).

Buffer compositions used in the study
Binding buffer Table 2
Biotinylated ODNs bearing 5hmU, U, 5fU, 5mC or 5hmC, as well as control ODNs without any nucleobase modification, were synthesized by PCR. Each ODN template (10 nM) was amplified in the presence of the indicated primers (1 µM each) and dNTPs (0.2 mM each) using DreamTaq DNA polymerase (1.25 units, Life Technologies). 5hmUTP, 5fUTP, or dUTP was used instead of TTP for the 5hmU-, 5fU-, and U-modified ODNs. 5hmdCTP or 5mdCTP was used instead of dCTP for the 5hmC- and 5mC-modified ODNs. Incorporation of each nucleobase modification into the ODNs was confirmed by HPLC.
After the incubation, the mixture was passed through mini quick spin oligo columns (Roche Life Science). The oxidised DNA solution (ca. 25 µL), 100 mM phosphate buffer (pH 6, 10 µL), and 17 mM (+)-biotinamidohexanoic acid hydrazide (15 µL, Sigma Aldrich) were mixed and incubated at 40 °C for 2 h. The reaction mixture was passed through mini quick spin oligo columns (Roche Life Science) or Micro Bio-Spin columns with Bio-Gel P-6 in SSC Buffer (P6 column, Bio Rad), which were pre-washed with water three times prior to use. The biotinylated DNA solution was mixed with an equal volume of the elution buffers (2×) indicated in Fig. 1c and incubated at 40 °C for 1 h. The reaction mixture was passed through a mini quick spin oligo column or P6 column, which was pre-washed with water three times prior to use.
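As a worked check on the biotinylation mixture described above, the following Python sketch computes the final reagent concentrations from the stated stock concentrations and volumes. It assumes the DNA solution contributes exactly 25 µL (the text gives this only as an approximate volume) and that volumes are additive; the function and variable names are illustrative.

```python
def final_concentration(stock_conc, stock_vol, total_vol):
    """Dilution rule C1 * V1 = C2 * V2, solved for C2."""
    if total_vol <= 0:
        raise ValueError("total volume must be positive")
    return stock_conc * stock_vol / total_vol


dna_vol_ul = 25.0        # oxidised DNA solution (approximate in the text)
buffer_vol_ul = 10.0     # 100 mM phosphate buffer, pH 6
hydrazide_vol_ul = 15.0  # 17 mM biotin hydrazide reagent

total_vol_ul = dna_vol_ul + buffer_vol_ul + hydrazide_vol_ul

phosphate_mM = final_concentration(100.0, buffer_vol_ul, total_vol_ul)
hydrazide_mM = final_concentration(17.0, hydrazide_vol_ul, total_vol_ul)

print(f"Total volume: {total_vol_ul:.0f} µL")
print(f"Phosphate: {phosphate_mM:.1f} mM, hydrazide: {hydrazide_mM:.1f} mM")
```

Under these assumptions the reaction contains roughly 20 mM phosphate and 5.1 mM hydrazide in 50 µL.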

LC-MS analysis of oligo DNA models
LC-ESI MS spectra were recorded on an AmaZon X ESI-MS (Bruker) connected to an Ultimate 3000 HPLC (Dionex). Analyses of ODNs were carried out by LC-MS using an XBridge BEH C18 XP column (30 Å, 2.5 µm, 3 mm x 50 mm, Waters) or an XTerra MS C18 column (125 Å, 2.5 µm, 2.1 mm x 50 mm, Waters). The elution solvents were 10 mM triethylamine and 100 mM hexafluoroisopropanol in water (solvent A) and methanol (solvent B). The flow rate was 0.2 mL/min. After equilibration at 5% B, the elution condition was 2 min at 5% B, a linear gradient of 5-30% B over 20 min, and 3 min at 30% B.
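The elution programme above can be written as a piecewise-linear function of time. The Python sketch below is illustrative only: it assumes the three segments run back to back starting at t = 0 after equilibration, and the names `GRADIENT` and `percent_b` are hypothetical.

```python
# Each segment: (duration in min, %B at segment start, %B at segment end).
GRADIENT = [
    (2.0, 5.0, 5.0),    # isocratic hold at 5% B
    (20.0, 5.0, 30.0),  # linear ramp from 5% to 30% B
    (3.0, 30.0, 30.0),  # isocratic hold at 30% B
]

FLOW_RATE_ML_PER_MIN = 0.2


def percent_b(t_min, segments=GRADIENT):
    """Return the solvent B percentage at time t_min by linear interpolation."""
    if t_min < 0:
        raise ValueError("time must be non-negative")
    elapsed = 0.0
    for duration, start, end in segments:
        if t_min <= elapsed + duration:
            fraction = (t_min - elapsed) / duration
            return start + fraction * (end - start)
        elapsed += duration
    raise ValueError("time is beyond the end of the programme")


run_time_min = sum(duration for duration, _, _ in GRADIENT)
solvent_volume_ml = run_time_min * FLOW_RATE_ML_PER_MIN

print(f"%B at 12 min: {percent_b(12.0):.1f}")
print(f"Run time: {run_time_min:.0f} min, solvent used: {solvent_volume_ml:.1f} mL")
```

Halfway through the ramp (12 min) the mobile phase is 17.5% B, and the 25 min programme consumes about 5 mL of solvent at the stated flow rate.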

2-4. qPCR analysis to assess enrichment efficacy of chemical pull down (relevant to Fig. S1)
A solution (25 µL) of ODN2-5fU, ODN3-T (100 pg of each ODN) and salmon sperm DNA (1 µg, Life Technologies) was subjected to biotinylation as described above, followed by enrichment as described for the chemical 5hmU enrichment sequencing. The recovered ODN2-5fU and ODN3-T were quantified with Brilliant III Ultra-Fast SYBR Green qPCR Master Mix (5 µL) (Life Technologies). Enrichment was calculated as (recovery of ODN2-5fU)/(recovery of ODN3-T), as assessed by qPCR quantification.
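The enrichment ratio defined above can be sketched in Python as follows. The text does not state how recovery was derived from the qPCR readout, so this example assumes recovery is estimated from the Cq difference between input and pull down with perfect doubling per cycle; that model, the Cq values, and the function names are all hypothetical.

```python
def recovery(cq_input, cq_pulldown, efficiency=2.0):
    """Fraction recovered, assuming amplification by `efficiency` per cycle (assumed model)."""
    return efficiency ** (cq_input - cq_pulldown)


def enrichment(recovery_target, recovery_control):
    """Enrichment as defined in the text: target recovery over control recovery."""
    if recovery_control <= 0:
        raise ValueError("control recovery must be positive")
    return recovery_target / recovery_control


# Hypothetical Cq values chosen only to illustrate the arithmetic.
recovery_5fu = recovery(cq_input=20.0, cq_pulldown=21.0)  # ODN2-5fU
recovery_t = recovery(cq_input=20.0, cq_pulldown=30.0)    # ODN3-T

fold_enrichment = enrichment(recovery_5fu, recovery_t)
print(f"Enrichment of ODN2-5fU over ODN3-T: {fold_enrichment:.0f}-fold")
```

If primer efficiencies differ from 2.0, the `efficiency` argument would need to be set per amplicon from a standard curve.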

2-5. Validation of 5hmU specificity of ab19735 by ELISA (relevant to Fig. S3)
Streptavidin coated high sensitivity plates (Life Technologies) were washed with ELISA buffer (3 x 200 µL, see section 2-1 for the buffer composition) and incubated with 5'-biotinylated ODN1 (10 ng/µL, 50 µL per well) at 4 °C overnight, followed by washing 3 times with ELISA buffer (200 µL). The plate was incubated with urea (5 M in water) at 40 °C for 30 min, followed by quick washing 3 times with ELISA buffer (200 µL). Experiments using double-stranded DNAs were carried out without this urea treatment step. The plate was pre-blocked with ELISA buffer (200 µL) at room temperature for 3 h, followed by washing 3 times with ELISA buffer (200 µL). The plate was incubated with antibody ab19735 (Abcam, RRID:AB_722498) at rt for 1 h in dilutions indicated in Fig. S3ab, followed by washing 3 times with ELISA buffer (200 µL). The plate was then incubated with protein G-HRP (1:5,000 dilution, Life Technologies) at rt for 1 h in dilutions indicated in Fig. S4ab, followed by washing 3 times with ELISA buffer (200 µL). The bound protein G-HRP was detected by addition of the substrate BM Blue POD Substrate (Roche Life Science) followed by the addition of 1 M H₂SO₄ (50 µL). The absorbance at 450 nm was measured on a SPECTROstar Nano microplate reader (BMG LABTECH).

Fig. S2)
After incubation at 37 °C for 4 h, the solution was filtered with Amicon Ultra-0.5 mL 10 K centrifugal filters (Merck Millipore) and subjected to LC-MS² analysis on a Q Exactive™ Hybrid Quadrupole-Orbitrap Mass Spectrometer (Thermo Scientific) equipped with a nano spray ionization source, coupled to an Ultimate RSLCnano LC system (Dionex). A capillary column (Hypercarb, 5 cm x 75 µm ID, 3 µm particle size, self-packed) was used, with water-acetonitrile containing 0.1% formic acid as the solvent system at a flow rate of 1.5 µL/min and a gradient of 2% (0-5 minutes), 2-40% (5-12 minutes), and 40-95% (12-15 minutes) acetonitrile, followed by a 3 minute equilibration step. The quadrupole was pre-set to isolate the precursor ions of all analytes ([M+H]⁺ ± 2 m/z). Settings were as follows: spray voltage: 2300 V, capillary temperature: 250 °C, resolution: 35,000, NCE: 10.0, AGC target: 1e6, maximum injection time: 100 ms. For the quantitation, calibration curves were generated by recording data for a dilution series of the analytes (C 10000-30 nM, 5hmU 20000-0.63 nM, 5fU 1000-0.63 nM and base J 2000-2.5 nM) using stable isotope labeled ¹⁵N₃-C, ¹³CD₂-5hmU and ¹³CD-5fU as internal standards at a concentration of 20 nM. Base J was quantitated using an external calibration line. The limits of quantitation were 0.37 nM, 0.67 nM and 1.25 nM for 5hmU, 5fU and base J, respectively. All data were processed using the Thermo Xcalibur Quan Browser software.
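The internal-standard quantitation described above can be sketched in Python. This is not the Xcalibur Quan Browser procedure: it assumes an unweighted linear fit of the analyte-to-internal-standard peak-area ratio against analyte concentration, and the calibration points, peak areas, and function names are hypothetical, idealised values.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two paired calibration points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("calibration concentrations must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def quantify(area_analyte, area_standard, slope, intercept):
    """Convert a peak-area ratio into a concentration using the calibration line."""
    ratio = area_analyte / area_standard
    return (ratio - intercept) / slope


# Hypothetical dilution series (nM) with the internal standard fixed at 20 nM.
internal_standard_nM = 20.0
cal_conc_nM = [0.63, 2.5, 10.0, 40.0, 160.0]
# Idealised response: equal signal per mole for analyte and labeled standard.
cal_ratio = [conc / internal_standard_nM for conc in cal_conc_nM]

slope, intercept = fit_line(cal_conc_nM, cal_ratio)
sample_nM = quantify(area_analyte=5.0e5, area_standard=1.0e6,
                     slope=slope, intercept=intercept)
print(f"Estimated analyte concentration: {sample_nM:.2f} nM")
```

For an analyte without a labeled standard, such as base J here, the same fit would be applied to raw peak areas rather than area ratios. Over a wide concentration range a weighted fit may be preferable.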

2-7. 5hmU-DIP (relevant to Fig. S3)
Fragmented DNA samples (200 ng) and a solution of spike-in control ODNs (100 pg each, see Table S2) were mixed and used for adapter ligation using the standard protocol and reagents provided in the NEBNext® Ultra™ DNA Library Prep Kit for Illumina® (New England BioLabs) and the sequencing adapters provided in the TruSeq DNA Sample Preparation Kit v2 (Illumina). The adapter ligated DNA samples were purified with AMPure® XP Beads (Beckman Coulter) and eluted with 25 µL of water. To generate the 5hmU-enriched library by DIP, the adapter ligated DNA (20 µL) was denatured by heating to 95 °C for 10 min and then immediately cooling on ice. To the solution, binding buffer 2 (2x, 35 µL, see section 2-1 for the buffer composition), anti-5-hydroxymethyluridine antibody ab19735 (5 µL, Abcam, RRID:AB_722498), and rabbit anti-goat IgG H&L ab6697 (10 µL, Abcam, RRID:AB_955988) were added. The mixture was incubated with rotation at 4 °C overnight. The solution was added to Dynabeads® Protein G (Life Technologies) which had been pre-washed with citrate-phosphate buffer (pH 5.5) and binding buffer (x1,