DNA binding specificities of the long zinc-finger recombination protein PRDM9
- Timothy Billings1,
- Emil D Parvanov2,
- Christopher L Baker1,
- Michael Walker1,
- Kenneth Paigen†1Email author and
- Petko M Petkov†1Email author
© Billings et al.; licensee BioMed Central Ltd. 2013
Received: 25 February 2013
Accepted: 24 April 2013
Published: 24 April 2013
Meiotic recombination ensures proper segregation of homologous chromosomes and creates genetic variation. In many organisms, recombination occurs at limited sites, termed 'hotspots', whose positions in mammals are determined by PR domain member 9 (PRDM9), a long-array zinc-finger and chromatin-modifier protein. Determining the rules governing the DNA binding of PRDM9 is a major issue in understanding how it functions.
Mouse PRDM9 protein variants bind to hotspot DNA sequences in a manner that is specific for both PRDM9 and DNA haplotypes, and that in vitro binding parallels its in vivo biological activity. Examining four hotspots, three activated by Prdm9 Cst and one activated by Prdm9 Dom2 , we found that all binding sites required the full array of 11 or 12 contiguous fingers, depending on the allele, and that there was little sequence similarity between the binding sites of the three Prdm9 Cst activated hotspots. The binding specificity of each position in the Hlx1 binding site, activated by Prdm9 Cst , was tested by mutating each nucleotide to its three alternatives. The 31 positions along the binding site varied considerably in the ability of alternative bases to support binding, which also implicates a role for additional binding to the DNA phosphate backbone.
These results, which provide the first detailed mapping of PRDM9 binding to DNA and, to our knowledge, the most detailed analysis yet of DNA binding by a long zinc-finger array, make clear that the binding specificities of PRDM9, and possibly other long-array zinc-finger proteins, are unusually complex.
Keywordsrecombination hotspots PRDM9 DNA binding EMSA zinc-finger proteins
Genetic recombination is an essential feature of meiosis, assuring an appropriate segregation of chromatids at the first meiotic division, and generating an evolutionarily important source of genetic variation by providing new arrangements of alleles between genes linked on the same chromosome. In many organisms, notably yeast , higher plants , and mammals including humans and mice [3–5], recombination is concentrated along chromosomes at limited sites known as 'hotspots'. Typically a kilobase in extent, hotspots are surrounded by long stretches of DNA, tens to hundreds of kilobases in extent, that are essentially devoid of recombination in humans and mice.
Recently, several groups have shown that PR domain member 9 (PRDM9), a zinc-finger (ZF) protein with histone 3 lysine 4 (H3K4) methyltransferase activity, plays a key role in determining the locations of hotspots in both mice and humans [6–8]. It is presently proposed that PRDM9 binds to appropriate DNA sequences in meiotic chromatids, generates activated chromatin by virtue of its H3K4 methyltransferase activity, and somehow guides the generation of double-strand breaks (DSBs) at those sites by the topoisomerase-like protein SPO11 . Analyses of individual human hotspots and a number of genome-wide studies have implicated PRDM9 as the predominant regulator of hotspot placement [6, 7, 10, 11]. In mice, analyses of genome-wide hotspots of DSB formation  make it clear that PRDM9 determines the location of virtually all hotspots, with the clear exception of the obligate crossover at the pseudoautosomal region, at which recombination occurs equally well with different variants of PRDM9 present, or indeed no variant at all. There is also evidence that Prdm9 participates in transcriptional regulation [13, 14], which may be related to its involvement in hybrid sterility .
Although PRDM9 plays a very important role in mammalian recombination, there is considerable uncertainty as to how it physically determines hotspot locations and then directs DSB formation there rather than at other trimethylated H3K4 (H3K4-me3) sites, of which there are many. Meeting these challenges has repercussions both for expanding the current understanding of recombination biology and for the insights this could provide in understanding the functions of other regulatory proteins with ZF arrays. More than 4% of human protein-coding genes contain ZF arrays, and half of those arrays are comparable in size with those present in PRDM9 [15–17].
The identification of PRDM9 as a regulator of human recombination  relied on the finding that the DNA sequence predicted to bind to the ZF array of the most common human variant (variant A) matched a 13 bp consensus sequence that characterizes 41% of human hotspots , and the fact that although this sequence is present in chimpanzee DNA, it does not characterize chimpanzee hotspots. Baudat et al.  correlated human hotspot activity with human allelic variation at PRDM9, and predicted a mouse PRDM9 DNA binding sequence found in a mouse hotspot. Parvanov et al.  identified Prdm9 in mice by genetically mapping the gene controlling hotspot activity to a 181 Kb interval containing four genes, three of which could be excluded as candidates.
Although PRDM9 reportedly binds to hotspot DNA [6, 19], there is still considerable confusion about the nature of the DNA sequences recognized by the ZF array of PRDM9 and how this protein achieves its locational specificity. Many more copies of the human consensus sequence are found in the genome at non-hotspot sites than at the hotspots themselves , and Berg et al.  showed that human hotspots possessing or not possessing this consensus sequence are equally dependent on PRDM9. In mice, the consensus sequence originally predicted for the mouse PRDM9Cst variant  is more commonly present in non-hotspot than in hotspot regions, and the ZF prediction programs used in various studies [20–22] also predict that the mouse Dom2 variant of PRDM9 (PRDM9Dom2) should bind to hotspots that genetic studies have shown are not activated by this allele. Nevertheless, when tested experimentally, longer oligonucleotides containing buried sequences matching the predicted recognition motifs of two human PRDM9 alleles have been shown to bind PRDM9 protein expressed in cell cultures [6, 19].
To address the DNA binding problem experimentally, we expressed the mouse Prdm9 Cst allele (from the CAST/EiJ strain; referred to below as CAST) and the Prdm9 Dom2 allele (from the C57BL/6J strain; referred to below as B6) in Escherichia coli. and determined the abilities of their respective protein products to bind DNA sequences from three mouse hotspots (Hlx1, Esrrg1, and Psmb9) known from genetic evidence to be activated by Pdrm9 Cst and one (Pbx1) known to be activated by Prdm9 Dom2 . Binding between expressed PRDM9 and DNA was tested both by the variant-specific ability of PRDM9 to modify the electrophoretic mobility of target DNA sequences and conversely, by the ability of DNA binding-site sequences to physically sequester PRDM9.
Histone H3K4 trimethylating activity
As an indication that the PRDM9 expressed in E. coli is native protein, we first confirmed that it retains its histone trimethylating activity. Both the induced PRDM9Dom2 and the PRDM9Cst E. coli extracts showed enhanced H3K4 trimethylating activity compared with uninduced extracts, and induced empty vector (see Additional file 1, Figure S1).
Fine mapping of Prdm9Dom2 and PRDM9Cst binding sites
The Hlx1 and Esrrg-1 hotspots, described in our previous work [8, 23], require activation by the Prdm9 Cst allele, as does Psmb9 . We concluded that the Pbx1 hotspot is dependent on the Prdm9 Dom2 allele because it was only active in crosses involving B6 and not active in crosses lacking this genetic background (see Additional file 2, Figure S2A).
The B6 and CAST sequences at the Hlx1 binding site differ at three positions (Figure 2D; see Additional file 3, Figure S3B), of which two out of three affect binding affinity and crossover rate in parallel at this hotspot. Although both the B6 and CAST sequences bound PRDM9Cst, binding by the B6 sequence was about 2.9 times stronger than that by the CAST sequence (Figure 2C, right panel). This corresponds with our previous demonstration that initiation of meiotic recombination at Hlx1 is about 2.5 times more frequent on the B6 chromosome than on the CAST chromosome . A recent estimate of Hlx1 by another method found about a four-fold difference between the B6 and CAST sequences using 41 bp oligonucleotides .
The Psmb9 hotspot has been identified and genetically characterized previously . We tested only its central region, where the PRDM9Cst binding reportedly occurs  (Figure 3D), confirming this prediction. Testing the reduced oligos, we found a 30 bp minimal binding site showing allele specificity (Figure 3F). Its best alignment to the predicted site had eight strong matches (Figure 3G), which is likely to be significant (P = 0.00098).
Comparison of PRDM9Cst binding sites
The lengths of the three binding sites (30 to 33 bp) suggests that binding involves all of the Zn fingers in the array, with the possible exception of either the first or the last finger, noting that the first finger has an ECH2 configuration rather than C2H2.
There is very little similarity between the sequences of the three PRDM9Cst binding sites. Any pairwise matches in sequence for the three sites were distributed over the entire length of the binding sites. The best alignment of the three minimal binding sites for the PRDM9Cst ZF array identified only four conserved positions (P = 0.052) (see Additional file 5, Figure S5, red); two would have been expected by chance. These triple matches are located at positions predicted to bind the second, fourth, sixth, and eighth fingers of PRDM9Cst. Pairwise comparisons identified 15 matches between Hlx1 and Esrrg-1 (see Additional file 5, Figure S5, red and green), 12 matches between Hlx1 and Psmb9 (see Additional file 5, Figure S5, red and yellow), and only 6 matches between Psmb9 and Esrrg-1 (see Additional file 5, Figure S5, red and cyan). We examined this level of similarity using two statistical methods. Using the χ2 test, the probability that the number of nucleotide matches between the three sites (scored as 0, 2, or 3 matches at each position) exceeds chance was 0.048 (barely significant), and this significance entirely disappeared when we considered only the 17 positions where the mutation analysis (see below) indicated the strongest evidence for nucleotide specificity. Additionally, using the binomial distribution for pairwise comparisons, we found some support for similarity between Hlx1 and Esrrg-1 (P = 0.003) and between Hlx1 and Psmb9 (P = 0.048), but not between Psmb9 and Esrrg-1 (P = 0.75). Given this marginal sequence similarity between the three PRDM9Cst binding sites, we confirmed by testing their ability to compete with each other for binding that they do in fact bind to the same molecular entity. The sites do compete with each other, and the data suggest that Hlx1 and Psmb9 both bind more strongly than Esrrg-1 (Figure 2E; Figure 3B, E). There was also no consistency in the binding specificity of the same ZF located at different positions in the array. The amino acids at positions -1, 2, 3, and 6 in each finger are thought to make contact with the DNA helix and determine binding specificity [21, 22, 25]. These amino acids are identical (ASNQ), as are all the rest of the amino acids in fingers 2, 5, 7, and 9 of the PRDM9Cst Zn finger array, and all 11 fingers in this array contain serine at the -2 position, which is also thought to contribute to specificity. Nevertheless, fpr the three PRDM9Cst binding sites, there is no consistency in the four triplets that these four fingers bind (see Additional file 5, Figure S5). A similar result was seen for the nucleotides binding the pairs of identical fingers in the PRDM9Dom2 variant at the Pbx1 hotspot.
In vivo H3K4 trimethylation sites
The algorithms developed by Persikov and Singh  are commonly used to predict ZF DNA binding sequences. These include both a linear prediction program based on an elemental identity of trinucleotide binding sequences for each finger, and a polynomial form of the program that takes nucleotide interactions into account. In every case, when both programs were tested against 200 bp sequences surrounding the four identified binding sites, the programs identified multiple binding sites within the 200 bp sequence. In only two of the eight tests was the actual binding site the top scoring site; in two cases, the program failed to identify the binding site at all, and the quality of prediction declined noticeably when tested against longer DNA sequences. The performance of the linear and polynomial prediction programs differed for each hotspot. Only the polynomial program correctly identified the Hlx1 binding site, whereas only the linear program identified Pbx1. Both programs identified Esrrg-1 and Psmb9 (see Additional file 6, Table S1).
As an alternative, we used a position-weighted matrix  derived from the detailed binding requirements of the Hlx1 hotspot (see Additional file 7, Table S2) to scan for DNA binding sites that coincide with genetically determined hotspots on mouse chromosomes 1 and 11 [5, 28]. Unfortunately, we failed to find a DNA binding motif common to genetically identified hotspots on these chromosomes.
Mutational characterization of the Hlx1 binding site
The 31 positions of the Hlx1 site varied greatly in their specificity. As a simple measure of the degree of specificity, we calculated the standard deviation (SD) of the measured binding affinities at each position (see Additional file 9, Table S3). Using this indicator of specificity, the positions fell into three groups: 13 of low specificity (SD = 0.03 to 0.12), 10 of moderate specificity (SD = 0.16 to 0.36), and 8 with the highest specificity (SD = 0.45 to 0.52), located at fingers 4, 5, and 6 near the center of the binding site. The weakest specificity was at the C-terminal end, fingers 9 to 11, corresponding to positions 25 to 33 of the binding site. Given that these fingers are nevertheless required for binding, it is possible that they function in a non-sequence-specific manner, for example by stabilizing the contact between PRDM9 and DNA through charge interactions involving the phosphate backbone. In this regard, it is interesting that these fingers seem to be less subject to evolutionary selection; the last two fingers are invariant between mouse alleles, and the next has only a single amino acid substitution, N for V. Replacement of nucleotides 1 to 2 and 34 to 36, adjacent to the shortest 31 bp sequence (see Additional file 8, Figure S6C), also affected binding affinity, lending further support to the possibility of more complex interactions between DNA and ZFs.
Our data suggesting that binding might involve mechanisms other than interaction with nucleotides in the major groove of DNA, as presently assumed , prompted us to test the effect of Mg2+ ions, which are well known to interact with polyphosphates. We found that binding is strongly inhibited at concentrations above 1 mmol/l and nearly completely inhibited by 10 mmol/l (see Additional file 10, Figure S7). This inhibition was essentially identical for both the PRDM9Dom2 and PRDM9Cst variants at all four hotspots tested. The possible involvement of the phosphate backbone of DNA in binding to PRDM9 seems to be a particular characteristic of PRDM9 rather than a general characteristic of ZF proteins, as it is notably different from the effects of Mg2+ on DNA binding by the ZF proteins WT1 and EGR1 ' for those proteins, millimolar concentrations of Mg2+ activated binding, and higher concentrations had little inhibitory effect .
Taken together, our results confirm that the binding seen with PRDM9 expressed in E. coli correctly recapitulates the biological specificities of PRDM9, including its allelic specificity, the haplotype specificity of its target, and the location of the binding sites near the peaks of hotspot H3K4 trimethylation detected in vivo. Moreover, the correlation between the binding affinities of different Hlx1 haplotypes and their genetically measured rates of crossover also suggest that the strength of the physical binding of PRDM9 to hotspot sequences may determine the efficiency of recombination initiation. There was little sequence similarity between the several binding sites for the same PRDM9 variant, and the binding specificities of individual fingers appeared to be context-dependent. Mutational analysis of the Hlx1 binding site indicates considerable variation between the ZFs in their contribution to overall binding affinity and the probable involvement of the DNA phosphate backbone.
These findings have relevance to the general role of ZF arrays as a DNA-recognition motif in biology, given that ZF proteins are the most common DNA regulatory protein in vertebrate genomes, comprising over 4% of all protein-coding genes in humans, and that half of these contain 10 or more fingers. They are almost as common in invertebrates, in which they were first described . The issues surrounding the DNA binding specificities of ZFs are both biological and chemical. Biologically, we want to derive a consensus binding sequence whose parameters predict the location and relative binding affinity of genomic binding sites, and then describe how these affinities relate to biological outcomes. Chemically, we want to understand how the specificity of binding sequences is determined by the atomic and molecular interactions between protein and DNA.
PRDM9 provides a particularly useful system for addressing these issues. It is highly variable, with multiple variants available both within and between species. It is a multiply fingered protein, providing the opportunity to examine interactions between repeat fingers in the same protein. In addition, there are many thousands of binding sites in genomes, providing a considerable source of experimental material. To our knowledge, the data we now report provide the first detailed definition of the DNA binding specificities of a ZF protein with a long (> 10) array of contiguous fingers. What has emerged is a picture of considerable complexity, one that raises as many or more questions than it answers.
It was somewhat surprising to find that PRDM9 binding sites are not uniformly positioned in relation to hotspot centers. One possible explanation for this asymmetry could be the presence of crossover refractory zones created by the nature of adjoining DNA sequences that create directionality in the spreading of the Holliday junctions away from initiation sites . However, our genetic data point to a more likely alternative. The sequence of the Pbx1 hotspot is identical in B6xCAST and B6xB6. CAST-1T crosses, but the crossovers in the former cross are resolved on both sides of the binding site, whereas the eight-fold larger number of crossovers in the latter cross are almost exclusively on one side of the binding site. In the absence of possible cis effects, this suggests a role for additional trans-acting factors in affecting the directional processing of recombination intermediates, with attendant consequences for overall crossover rates at this hotspot.
Extent of ZF usage
A notable feature of PRDM9 is the required use of all, or nearly all, of its contiguous fingers for binding DNA, as evidenced by the requirement for 35 to 37 bp for PRDM9Dom2, which has 12 fingers, and 30 to 33 bp+ for any binding activity by PRDM9Cst, which has 11 fingers (the plus indicating that binding is further enhanced at the Hlx1 hotspot by an additional 3 bp, one of which shows considerable nucleotide specificity). This long binding array presents a topological issue, as it seems that at least five to six fingers make base pair-specific contacts with DNA. Moreover, detailed analysis of the Hlx1 binding site revealed that multiple nucleotide positions between positions 5 and 21 show appreciable nucleotide specificity, without any gaps longer than 2 bp. If, as in the case of other ZF proteins, these specific contacts form in the major groove of DNA, it suggests that the ZF domain of PRDM9 can bind continuously along the major groove for more than one turn of the DNA double helix. In the context of intact cellular DNA, this would require a snake-like winding of the DNA around its target, which is in marked contrast to the accepted rule that ZF proteins do not use more than three contiguous fingers to bind DNA , and thus requires further explication. Given the further observation that binding is inhibited by magnesium, which complexes with polyphosphates, the requirement for additional fingers with low sequence specificity may indicate the importance of non-specific charge interactions between negatively charged phosphate groups and the positively charged Zn2+ and histidine residues.
The three DNA binding sites identified for the PRDM9Cst variant bear little ostensible resemblance to each other, with little identity between the nucleotide positions along the entire length of these sequences. However, a more subtle relationship between the three sequences can be detected when they are compared with the mutation-analysis data for Hlx1. This suggests that both the strongest binding within an Hlx1 site and most of the sequence matches between the three binding sites occur over the nucleotides binding to the first six fingers of the PRDM9Cst ZF array. The binding rules are obviously complex, and this complexity was emphasized when we examined the Hlx1 hotspot in detail; the mutation analysis showed variable stringency along its length for base-pair identities. At every position, two, three, and often all four bases could serve, albeit not always equally well.
Context dependence of ZF specificity
In addition to the weak similarity between binding sequences, there was also a strong context dependence of DNA binding, with the binding properties of a finger being dependent on its location in the array. Although the second, fifth, seventh, and ninth ZFs of PRDM9Cst have identical amino acid sequences, they did not bind related trinucleotide sequences either within or between hotspots (see Additional file 5, Figure S5).
Previous three-dimensional studies of ZF DNA interactions suggest a model in which four amino acids in each finger (positions -1, +2, +3, and +6) bind four consecutive base pairs in DNA. The last of the four bases shares its binding with the next finger, giving a repeating pattern of 3 bp per finger (plus 1 bp for the total possible length of the binding site) . Computer analyses of multiple 3D structures indicate that all four amino acids are equally important ; however, the evolutionary data suggest otherwise for PRDM9. The amino acid composition of its ZFs undergoes rapid evolutionary selection to generate new hotspots as existing hotspots undergo continual loss by mutation . Notably, position +2 is essentially invariant in the face of extraordinarily high replacement rates at positions -1, +3, and +6 [8, 10, 32, 33], suggesting that, at least for PRDM9, this amino acid plays a lesser role in determining DNA binding specificity.
Our findings introduce considerable complexity into efforts to understand DNA binding by long-array ZF proteins such as PRDM9. The binding sites for the PRDM9Cst variant are only very subtly related; identical ZFs show little constancy in the trinucleotides they bind, and a position weight matrix approach  failed in a search for a DNA binding motif common to genetically identified hotspots on mouse chromosomes 1 and 11 [5, 28]. Perhaps most surprising is the finding that binding requires DNA sequences involving more than one helical turn, even though the terminal nucleotide positions show little sequence specificity. To our knowledge, PRDM9 is the first long-array ZF protein to have its DNA binding specificity at separate binding sites compared in detail. At issue now are the biological question of how to solve the binding rules for PRDM9 and the chemical question of whether the binding complexities of PRDM9 are shared by the hundreds of other long-array ZF proteins that carry out a great diversity of biological functions.
Materials and methods
Cloning and expression of mouse Prdm9
Full-length Prdm9 was amplified from cDNA prepared from the testes of CAST/EiJ (CAST) and C57BL/6J (B6) mouse strains using primers Prdm9F-HisB and Prdm9R-HisB (for sequences of primers used for cloning and sequencing, see Additional file 11) and cloned into pBAD/HisB (Invitrogen Corp., Carlsbad, CA, USA) using XhoI and HindIII restriction sites added to the 5'-ends of the primers. The identity of the cloned products was confirmed by sequencing (see Additional file 12). The plasmids were transformed into TOP10 E. coli cells, allowing arabinose induction of protein synthesis. For production of PRDM9 protein, the cells were grown overnight in Luria broth and then incubated an additional 4 hours with the addition of 0.02% arabinose. Total cell lysate was prepared in native binding buffer (50 mmol/l sodium phosphate pH 8 and 500 mmol/l NaCl) in accordance with the manufacturer's specifications, and stored at -80°C until use, when it was heated at 50°C for 15 minutes to inactivate bacterial DNAses. The production of full-length protein was confirmed by western blotting using an antibody to the N-terminal his tag. Crude bacterial lysates used in the in vitro binding assays contained 15 to 30 μg/ml expressed PRDM9 as estimated by ELISA.
Histone methylation assay
Crude bacterial extract containing induced/non-induced PRDM9 variants or empty pBAD/HisB vector were heated for 15 minutes at 50°C, then used for a histone methyltransferase assay. The reaction mixture contained 20 μl 2× HMT buffer (50 mmol/l Tris-HCl pH 8 and 20% glycerol), 2 μl S-adenosinmethionine (100 μg/ul; Sigma-Aldrich, St Louis, MO, USA), 10 μl bacterial extract, 0.2 μl histone substrate (0.5 μg/μl recombinant H3K4-me2; Active Motif, Carlsbad, CA, USA), and 7.8 μl water. The reaction was kept at 34°C for 30 minutes.
Several commercially available antibodies were used to detect specific H3K4me3 signals. All tested antibodies showed cross-reactivity with H3K4me2. Therefore, we estimated the level of H3K4 trimethylation by induced PRDM9 relative to controls with induced empty vector. The results presented here were obtained using the antibody that showed the most consistent discrimination between H3K4me2 and H3K4me3 signals (recombinant H3K4-me2; Active Motif).
where FH3K4me3 is the fraction of the specific signal, Sind and Sun are the signals of induced and uninduced extracts, and SH3K4me3 and SH3K4me2 are the signals of commercial H3K4me3 and H3K4me2, respectively.
Electrophoretic mobility shift assay
Using an electrophoretic mobility shift assay (EMSA) (LightShift Chemiluminescent EMSA Kit; Pierce Biotechnology ThermoScientific, Rockford, IL, USA), a biotin-labeled test oligo and PRDM9 were allowed to bind in a mixture containing 10 mmol/l Tris (pH 7.5), 50 mmol/l KCl, 1 mmol/l dithiothreitol, 50 ng/μl poly(dI/dC), 0.05% NP-40, 0.5 pmol/l of labeled double-stranded oligo, and 3 μl of bacterial extract in a total volume of 20 μl. When necessary, 10 pmol/l of cold competitor or non-competitor was added. The reaction mixture was incubated for 40 minutes and separated on 5% PAA gel in 0.5× Tris-borate-EDTA (TBE) buffer, which was pre-electrophoresed for 30 minutes. After electrophoresis, the separated products were transferred onto nylon membrane by wet transfer in 0.5× TBE at 380 mA for 1 hour. The membrane was cross-linked by UV light at 120 mJ/cm2 (UV Stratalinker 2400; Agilent Technologies, Santa Clara, CA, USA). Biotin-labeled products were detected by the chemiluminescence signal produced by the binding to streptavidin-peroxidase conjugate in accordance with the manufacturer's specification, and the chemiluminesce signal was recorded (G:BOX system; Syngene, Frederick, MD, USA).
Single-stranded oligos were biotin-labeled at the 3'-end *Biotin 3' End DNA Labeling Kit; Pierce Biotechnology ThermoScientific) in accordance with the manufacturer's specifications. Complementary oligos were then mixed, denatured at 95°C for 2 minutes, and annealed by cooling down to room temperature in a water bath. When a PCR product was labeled, the two strands were separated by heating at 95°C for 5 minutes. The tube was immediately transferred to ice, and then the product was labeled and re-annealed as above.
The bands were quantified using Gene Tools software (Syngene, Frederick, MD USA), using the 'Manual bands' and 'Absorption' settings. Equal-sized rectangles were drawn around each shifted and unshifted band present, and the absorption density was scored automatically by the software. The proportion of shifted band was calculated as S/(S+U), where S is the density of the shifted band and U is the density of the unshifted band.
The DNA binding sites of hotspots were localized by testing whether the electrophoretic mobility of biotin end-labeled DNA sequences was altered after binding to the PRDM9Cst and PRDM9Dom2 proteins expressed in E. coli. All binding experiments were performed using crude E. coli extracts, because every attempt to purify PRDM9 resulted in insolubility and loss of activity. The specificity of binding was confirmed by appropriate controls, including induced cells that had been transformed with empty vector, and the host cells alone. Additionally, the variant-specific binding of PRDM9 to hotspot sequences was confirmed by isolation of protein-biotinylated DNA complexes onto streptavidin beads and detection of PRDM9 by western blotting with antibodies against its C-terminal end (see Additional file 13, Figure S8).
To localize binding sites, a tiling approach was used, in which PCR-amplified fragments of 200 to 400 bp, overlapping by 50 bp, were tested for their ability to bind PRDM9 variants (Figures 1A, 2A, 3A, D; also see Additional file 2, Figure S2B; Additional file 2, Figure S3A; Additional file 2, Figure S4A). Positive fragments were then reduced to less than 100 bp using the same strategy. Because biotin labeling close to the end of a binding-site sequence influences binding, the limiting sequence for each binding site was then determined by comparing the binding ability of progressively shorter, unlabeled oligos against the binding ability of longer, labeled oligos.
where Bmut is the relative change in binding strength of the mutated oligo, and Sun, Smut, and Sscr are the proportions of shifted bands of the unmutated, mutated, and scrambled oligos, respectively.
Streptavidin pull-down and western blotting
Protein-biotinylated DNA complexes were isolated on streptavidin beads. PRDM9 variants were expressed in E. coli and bound to biotinylated oligos representing binding sites, as described above for the EMSA experiments. The complexes were then isolated on T1 streptavidin-coated beads (Dynabeads; Invitrogen Dynal, Oslo, Norway) and washed three times with EMSA binding buffer, then the bound proteins were released from the beads by adding 0.1% SDS. The proteins were separated on 4 to 20% SDS gels and transferred onto polyvinylidene difluoride (PVDF) membrane, and the specificity of PRDM9 binding was detected on western blots using mouse antibodies against the C-terminal part of PRDM9.
Chromatin immunoprecipitation for H3K4-me3
Crude isolation of spermatocytes from B6 and B6xCAST F1 juvenile mice was performed as reported previously  with minor modifications.
Spermatocytes were isolated from testes of 12-dpp juvenile mice. Cross-linking of spermatocytes was performed by addition of formaldehyde to a final concentration of 1% and incubation at room temperature for 10 minutes with constant rotation. Cross-linking was stopped by dropwise addition of glycine to a final concentration of 125 mmol/l, followed by incubation with rotation for 5 minutes at room temperature. Cells were washed twice, separated by centrifugation at 2000 g for 5 minutes at 4°C, and resuspended in 1 ml PBS. After the final wash, hypotonic lysis buffer (10 mmol/l Tris-HCL pH 8.0, 1 mmol/l KCl, 1.5 mmol/l MgCl2) was added at a concentration of 5 × 106 cells/ml supplemented with 1 mmol/l phenylmethanesulfonylfluoride (PMSF) and 1× protease inhibitor cocktail (PIC; Sigma-Aldrich) and incubated for 30 minutes at 4°C with rotation to shear the cellular membrane. Nuclei were pelleted by centrifugation at 10,000 g for 10 minutes and resuspended in MNase buffer (50 mmol/l Tris, 1 mmol/l CaCl, 4 mmol/l MgCl, 4% NP-40) at 26 μl/106 cells, supplemented with 1 mmol/l PMSF and 1× PIC. Chromatin was fragmented and solubilized by addition of MNase (USB, Cleveland, OH, USA) at 15 U per 5 × 106 cells, followed by incubation for 2 minutes at 37°C. Nuclease activity was stopped by addition of EDTA to a final concentration of 10 mmol/l and incubation at 4°C for 5 minutes. Soluble chromatin was clarified by centrifugation at 4°C for 10 minutes at top speed, then the supernatant was transferred to a clean tube and the centrifugation repeated. H3K4me3 antibody (Millipore Corp., Billerica, MA, USA) was prebound to 20 μl protein-G beads (Dynabeads; Invitrogen Corp.) following the manufacturer's protocol. Antibody-bound beads were washed twice with 100 μl of immunoprecipitation (IP) buffer (RIPA 50 mmol/l Tris pH 8.0, 150 mmol/l NaCl, 1.0% NP-40, 0.5% Na deoxycholate, 0.1% SDS supplemented with 50 mg/ml BSA, and 0.5 mg/ml salmon-sperm DNA) and resuspended in a final volume of 75 μl IP buffer with PMSF and 1× PIC. Undiluted chromatin (25 μl; around 106 cell equivalents) was added to the beads, and incubated with rotation at 4°C for 2 hours. The chromatin-bound beads were washed three times with 100 μl IP buffer and then twice with 100 μl TE buffer pH 8.0 before elution with 125 μl elution buffer (1% SDS, 20 mmol/l Tris-HCL pH 8.0, 200 mmol/l NaCl, 5 mmol/l EDTA) supplemented with 50 μg/ml Proteinase K (Sigma-Aldrich). Elution was carried out by incubation at 68°C for 2 hours with vigorous shaking at top speed in a thermal mixer (Thermomixer; Eppendorf, Westbury, NY, USA) to reverse cross-links and digest all proteins. An equal aliquot (25 μl of 'input' chromatin was diluted to a final volume of 125 μl with elution buffer and handled in parallel. DNA was recovered from the beads using magnetic separation, placed into a clean tube, and then purified using commercially available methods following the manufacturer's protocol for PCR clean up (Qiagen Inc., Valencia, CA, USA). Purified DNA was eluted in a total volume of 200 μl 10 mmol/l Tris pH 8.0.
Bioinformatic search for a DNA binding motif
A position weight matrix (PWM) was created (Table S2) based on the mutational analysis of the Hlx1 binding site, and a matrix matching algorithm from the software package Motif Occurrence Detection Suite  was used. The weights obtained for the matrix were optimized by maximizing the score for the detection of the Hlx1 binding site within a scan against the entire Hlx1 hotspot region. Subsequently, the PWM was used in a scan against all hotspots less than 10 Kb in size located on chromosome 1, along with control regions adjacently positioned before and after each hotspot; each control region was sized equivalently to its corresponding hotspot. To minimize the effects of multiple testing, only matches with P-values below 10-5 were considered.
C57BL/6J mouse strain
bovine serum albumin
CAST/EiJ mouse strain
Enzyme-linked immunosorbent assay
electrophoretic mobility shift assay
histone 3, lysine 4
National Center for Biotechnology Information
PR domain member 9
position weight matrix
sodium dodecyl sulfate
We thank Ruth Saxl, Pavlina Ivanova and Mary Ann Handel for critical discussions of the manuscript, Evelyn Sargent for technical help, and Joanne Currer for revising the manuscript. This work was supported by NIGMS R01 grants GM078452 to PMP and GM083408 to KP; P50 grant GM076468 to Gary Churchill; Cancer Core grant CA34196 to The Jackson Laboratory; and South Moravian Programme grant 2SGA2773 with financial contribution from the EC Seventh Framework Programme grant agreement 229603 to EDP.
- Petes TD: Meiotic recombination hot spots and cold spots. Nat Rev Genet. 2001, 2: 360-369. 10.1038/35072078.PubMedView ArticleGoogle Scholar
- Drouaud J, Camilleri C, Bourguignon PY, Canaguier A, Berard A, Vezon D, Giancola S, Brunel D, Colot V, Prum B, Quesneville H, Mézard C: Variation in crossing-over rates across chromosome 4 of Arabidopsis thaliana reveals the presence of meiotic recombination "hot spots". Genome Res. 2006, 16: 106-114.PubMedPubMed CentralView ArticleGoogle Scholar
- Jeffreys AJ, Kauppi L, Neumann R: Intensely punctate meiotic recombination in the class II region of the major histocompatibility complex. Nat Genet. 2001, 29: 217-222. 10.1038/ng1001-217.PubMedView ArticleGoogle Scholar
- de Massy B: Distribution of meiotic recombination sites. Trends Genet. 2003, 19: 514-522. 10.1016/S0168-9525(03)00201-4.PubMedView ArticleGoogle Scholar
- Paigen K, Szatkiewicz JP, Sawyer K, Leahy N, Parvanov ED, Ng SH, Graber JH, Broman KW, Petkov PM: The recombinational anatomy of a mouse chromosome. PLoS Genet. 2008, 4: e1000119-10.1371/journal.pgen.1000119.PubMedPubMed CentralView ArticleGoogle Scholar
- Baudat F, Buard J, Grey C, Fledel-Alon A, Ober C, Przeworski M, Coop G, de Massy B: PRDM9 is a major determinant of meiotic recombination hotspots in humans and mice. Science. 2010, 327: 836-840. 10.1126/science.1183439.PubMedPubMed CentralView ArticleGoogle Scholar
- Myers S, Bowden R, Tumian A, Bontrop RE, Freeman C, MacFie TS, McVean G, Donnelly P: Drive against hotspot motifs in primates implicates the PRDM9 gene in meiotic recombination. Science. 2010, 327: 876-879. 10.1126/science.1182363.PubMedView ArticleGoogle Scholar
- Parvanov ED, Petkov PM, Paigen K: Prdm9 controls activation of mammalian recombination hotspots. Science. 2010, 327: 835-10.1126/science.1181495.PubMedPubMed CentralView ArticleGoogle Scholar
- Paigen K, Petkov P: Mammalian recombination hot spots: properties, control and evolution. Nat Rev Genet. 2010, 11: 221-233.PubMedPubMed CentralView ArticleGoogle Scholar
- Berg IL, Neumann R, Lam KW, Sarbajna S, Odenthal-Hesse L, May CA, Jeffreys AJ: PRDM9 variation strongly influences recombination hot-spot activity and meiotic instability in humans. Nat Genet. 2010, 42: 859-863. 10.1038/ng.658.PubMedPubMed CentralView ArticleGoogle Scholar
- Berg IL, Neumann R, Sarbajna S, Odenthal-Hesse L, Butler NJ, Jeffreys AJ: Variants of the protein PRDM9 differentially regulate a set of human meiotic recombination hotspots highly active in African populations. Proc Natl Acad Sci USA. 2011, 108: 12378-12383. 10.1073/pnas.1109531108.PubMedPubMed CentralView ArticleGoogle Scholar
- Brick K, Smagulova F, Khil P, Camerini-Otero RD, Petukhova GV: Genetic recombination is directed away from functional genomic elements in mice. Nature. 2012, 485: 642-645. 10.1038/nature11089.PubMedPubMed CentralView ArticleGoogle Scholar
- Hayashi K, Matsui Y: Meisetz, a novel histone tri-methyltransferase, regulates meiosis-specific epigenesis. Cell Cycle. 2006, 5: 615-620. 10.4161/cc.5.6.2572.PubMedView ArticleGoogle Scholar
- Mihola O, Trachtulec Z, Vlcek C, Schimenti JC, Forejt J: A mouse speciation gene encodes a meiotic histone H3 methyltransferase. Science. 2009, 323: 373-375. 10.1126/science.1163601.PubMedView ArticleGoogle Scholar
- Emerson RO, Thomas JH: Adaptive evolution in zinc finger transcription factors. PLoS Genet. 2009, 5: e1000325-10.1371/journal.pgen.1000325.PubMedPubMed CentralView ArticleGoogle Scholar
- Vaquerizas JM, Kummerfeld SK, Teichmann SA, Luscombe NM: A census of human transcription factors: function, expression and evolution. Nat Rev Genet. 2009, 10: 252-263. 10.1038/nrg2538.PubMedView ArticleGoogle Scholar
- Klug A: The discovery of zinc fingers and their applications in gene regulation and genome manipulation. Annu Rev Biochem. 2010, 79: 213-231. 10.1146/annurev-biochem-010909-095056.PubMedView ArticleGoogle Scholar
- Myers S, Freeman C, Auton A, Donnelly P, McVean G: A common sequence motif associated with recombination hot spots and genome instability in humans. Nat Genet. 2008, 40: 1124-1129. 10.1038/ng.213.PubMedView ArticleGoogle Scholar
- Grey C, Barthes P, Chauveau-Le Friec G, Langa F, Baudat F, de Massy B: Mouse PRDM9 DNA-binding specificity determines sites of histone H3 lysine 4 trimethylation for initiation of meiotic recombination. PLoS Biol. 2011, 9: e1001176-10.1371/journal.pbio.1001176.PubMedPubMed CentralView ArticleGoogle Scholar
- Fu F, Sander JD, Maeder M, Thibodeau-Beganny S, Joung JK, Dobbs D, Miller L, Voytas DF: Zinc Finger Database (ZiFDB): a repository for information on C2H2 zinc fingers and engineered zinc-finger arrays. Nucleic Acids Res. 2009, 37: D279-283. 10.1093/nar/gkn606.PubMedPubMed CentralView ArticleGoogle Scholar
- Persikov AV, Osada R, Singh M: Predicting DNA recognition by Cys2His2 zinc finger proteins. Bioinformatics. 2009, 25: 22-29. 10.1093/bioinformatics/btn580.PubMedPubMed CentralView ArticleGoogle Scholar
- Persikov AV, Singh M: An expanded binding model for Cys2His2 zinc finger protein-DNA interfaces. Phys Biol. 2011, 8: 035010-10.1088/1478-3975/8/3/035010.PubMedPubMed CentralView ArticleGoogle Scholar
- Parvanov ED, Ng SH, Petkov PM, Paigen K: Trans-regulation of mouse meiotic recombination hotspots by Rcr1. PLoS Biology. 2009, 7: e1000036-PubMed CentralView ArticleGoogle Scholar
- Baudat F, de Massy B: Cis- and trans-acting elements regulate the mouse Psmb9 meiotic recombination hotspot. PLoS Genet. 2007, 3: e100-10.1371/journal.pgen.0030100.PubMedPubMed CentralView ArticleGoogle Scholar
- Elrod-Erickson M, Rould MA, Nekludova L, Pabo CO: Zif268 protein-DNA complex refined at 1.6 A: a model system for understanding zinc finger-DNA interactions. Structure. 1996, 4: 1171-1180. 10.1016/S0969-2126(96)00125-6.PubMedView ArticleGoogle Scholar
- Buard J, Barthes P, Grey C, de Massy B: Distinct histone modifications define initiation and repair of meiotic recombination in the mouse. Embo J. 2009, 28: 2616-2624. 10.1038/emboj.2009.207.PubMedPubMed CentralView ArticleGoogle Scholar
- Pizzi C, Rastas P, Ukkonen E: Finding significant matches of position weight matrices in linear time. IEEE/ACM Trans Comput Biol Bioinform. 2011, 8: 69-79.PubMedView ArticleGoogle Scholar
- Billings T, Sargent Evelyn, Szatkiewicz Jin, Leahy Nicole, Kwak Il-Youp, Bektassova Nazira, Walker Michael, Hassold Terry, Graber Joel, Broman Karl, Petkov PM: Patterns of recombination activity on mouse chromosome 11 revealed by high resolution mapping. PLoS One. 2010, 5: e15340-10.1371/journal.pone.0015340.PubMedPubMed CentralView ArticleGoogle Scholar
- Hamilton TB, Borel F, Romaniuk PJ: Comparison of the DNA binding characteristics of the related zinc finger proteins WT1 and EGR1. Biochemistry. 1998, 37: 2051-2058. 10.1021/bi9717993.PubMedView ArticleGoogle Scholar
- Cole F, Keeney S, Jasin M: Comprehensive, fine-scale dissection of homologous recombination outcomes at a hot spot in mouse meiosis. Mol Cell. 2010, 39: 700-710. 10.1016/j.molcel.2010.08.017.PubMedPubMed CentralView ArticleGoogle Scholar
- Boulton A, Myers RS, Redfield RJ: The hotspot conversion paradox and the evolution of meiotic recombination. Proc Natl Acad Sci USA. 1997, 94: 8058-8063. 10.1073/pnas.94.15.8058.PubMedPubMed CentralView ArticleGoogle Scholar
- Oliver PL, Goodstadt L, Bayes JJ, Birtle Z, Roach KC, Phadnis N, Beatson SA, Lunter G, Malik HS, Ponting CP: Accelerated evolution of the Prdm9 speciation gene across diverse metazoan taxa. PLoS Genet. 2009, 5: e1000753-10.1371/journal.pgen.1000753.PubMedPubMed CentralView ArticleGoogle Scholar
- Thomas JH, Emerson RO, Shendure J: Extraordinary molecular evolution in the PRDM9 fertility gene. PLoS One. 2009, 4: e8505-10.1371/journal.pone.0008505.PubMedPubMed CentralView ArticleGoogle Scholar
- La Salle S, Sun F, Zhang XD, Matunis MJ, Handel MA: Developmental control of sumoylation pathway proteins in mouse male germ cells. Dev Biol. 2008, 321: 227-237. 10.1016/j.ydbio.2008.06.020.PubMedPubMed CentralView ArticleGoogle Scholar
This article is published under license to BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.