The complex binding of PRDM9
© BioMed Central Ltd 2013
Published: 24 April 2013
A recent study investigates the in vitro DNA binding behavior of PRDM9, a zinc finger protein involved in the localization of recombination hotspots in mammals.
Please see related research article: http://genomebiology.com/2013/14/4/R35
During meiosis, homologous chromosomes need to pair and segregate properly to ensure that each gamete receives the correct number of chromosomes. The search for homology is initiated by the deliberate placement of DNA double-strand breaks along the chromosomes, followed by the rejoining of DNA segments. This process, referred to as meiotic recombination, is both a key source of genetic variation, creating new combinations of alleles in the population, and an important source of genomic instability, responsible for numerous human diseases. Understanding how the locations of double-strand breaks are determined and how they lead to the shuffling of DNA therefore carries important implications for molecular and evolutionary biology, as well as for medical genetics.
In humans, as in many mammals, recombination events tend to be clustered in small regions of the genome called hotspots. Unexpectedly, given the importance of this process, the location and intensity of hotspots have been shown to vary among humans and between mouse strains. Furthermore, hotspots are not shared between closely related species such as humans and chimpanzees, indicating a rapid turnover in their locations.
In the last few years, exciting work from a variety of disciplines has uncovered the central role of the PR domain zinc finger protein 9 (PRDM9) in specifying the location of most hotspots genome-wide. This protein binds specific DNA sequences through its zinc finger domain and tri-methylates histone H3 on lysine 4 (H3K4me3) through its SET domain, which somehow results in the recruitment of the recombination machinery. The PRDM9 zinc finger array is highly variable within and between species, in terms of both the number and the identity of its zinc fingers; in humans and mice, this variability leads to differences in the locations and intensity of hotspots. How specific DNA sequences are recognized by the zinc finger domain of PRDM9 is therefore a central question for our understanding of how recombination events are specified. The relationship between PRDM9 variants and the bound motifs appears to be rather complex, however, and much remains to be understood about exactly how DNA recognition is achieved. A recent study by Billings et al. addressed these questions experimentally by expressing PRDM9 in Escherichia coli. The protein retains its trimethylating activity in vitro, and its binding behavior seems to recapitulate in vivo activity in mice (that is, the protein variants bind in vitro only to the hotspots they are known to activate in vivo), thereby enabling a detailed dissection of the interaction between the mouse protein variants and the DNA sequence.
Billings et al. examined the minimum length of segments bound by PRDM9 at previously known hotspots using gel-shift assays, focusing on three hotspots bound by the PRDM9Cst variant (present in the CAST/EiJ mouse strain) and one bound by the PRDM9Dom2 variant (present in the C57BL/6J mouse strain). PRDM9Cst has 11 fingers and therefore can bind up to 33 base pairs (bp), whereas PRDM9Dom2 has 12 fingers and hence can bind up to 36 bp. The authors found that the four hotspots (pre-B-cell leukemia homeobox 1 (Pbx1), H2.0-like homeobox (Hlx1), estrogen-related receptor gamma (Esrrg-1) and proteasome (prosome, macropain) subunit, beta type, 9 (Psmb9)) have a minimum binding site of between 30 and 34 bp, suggesting that PRDM9 uses all its zinc fingers to bind DNA. Importantly, as the authors note, the use of the full complement of PRDM9's zinc fingers for binding suggests that PRDM9 binds continuously to DNA for more than one helical turn, a conclusion further supported by the finding that binding is inhibited by the addition of Mg2+.
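The footprint arithmetic above can be sketched in a few lines of Python. This is only an illustration: it assumes, as the figures of 33 bp and 36 bp imply, that each finger contacts three consecutive base pairs, and the function names are hypothetical.

```python
# Assumption: each zinc finger contacts 3 consecutive base pairs,
# which is what the 11 -> 33 bp and 12 -> 36 bp figures imply.
BP_PER_FINGER = 3


def max_footprint_bp(n_fingers: int) -> int:
    """Longest stretch of DNA an array of n_fingers could contact."""
    return n_fingers * BP_PER_FINGER


def min_fingers_needed(site_bp: int) -> int:
    """Smallest number of fingers whose combined footprint covers site_bp."""
    return -(-site_bp // BP_PER_FINGER)  # ceiling division


# Finger counts as stated in the text.
variants = {"PRDM9Cst": 11, "PRDM9Dom2": 12}
footprints = {name: max_footprint_bp(n) for name, n in variants.items()}

for name, bp in footprints.items():
    print(f"{name}: up to {bp} bp")
```

Under the same assumption, a minimum binding site of 30 bp would already require 10 fingers, which is why sites of 30 to 34 bp point to the use of nearly the whole array.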
For one hotspot (Hlx1), Billings et al. further mutated each of the 31 positions of the binding site. This revealed great variability in specificity among bases, with high specificity at the positions matching the first half of the zinc finger domain (especially fingers 4 to 6). On that basis, they propose that the other, less specific fingers are used to stabilize the protein-DNA complex.
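To make the design of such a mutational scan concrete, the following Python sketch enumerates single-base substitutions of a 31-bp site. It rests on assumptions: the sequence is a made-up placeholder rather than the Hlx1 site, and all three alternative bases are generated at each position, which may not match the substitution scheme actually used in the study.

```python
BASES = "ACGT"

# Hypothetical 31-bp placeholder; NOT the real Hlx1 binding site.
PLACEHOLDER_SITE = "ACGT" * 7 + "ACG"


def single_base_variants(site: str):
    """Yield (position, ref, alt, mutated_site) for every single substitution.

    Positions are 1-based, counted from the start of the site.
    """
    for i, ref in enumerate(site):
        for alt in BASES:
            if alt != ref:
                yield i + 1, ref, alt, site[:i] + alt + site[i + 1:]


variants = list(single_base_variants(PLACEHOLDER_SITE))
print(len(PLACEHOLDER_SITE), "positions,", len(variants), "variants")
```

Each variant could then be tested in a gel-shift assay against the unmutated site; positions where a substitution abolishes binding are the ones read with high specificity.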
Billings et al. then compared the sequences bound by PRDM9Cst at the three hotspots analyzed, finding that the three sequences have little in common, even though they are able to compete with each other for binding to PRDM9 in gel-shift assays. Furthermore, the few matches seem to be equally distributed among different fingers. One interpretation is that PRDM9 is able to bind to a mixture of different motifs, thus resolving the apparent paradox that PRDM9 is at the same time very permissive (in that it can bind to degenerate versions of the consensus motif) and highly sensitive (in that specific mutations can completely knock down hotspot activity). This hypothesis would also explain why, in humans, PRDM9 was shown to influence both hotspots containing and hotspots lacking an exact match to the consensus motif [6, 7], and would further suggest that there is a greater variety of target sequences in Western chimpanzees, where no consensus motif could be found, than in humans. More generally, these results raise the question of how many distinct motifs coexist for PRDM9 in humans and other species.
As the authors state, the intriguing results of this study raise as many questions as they answer. Notably, the study reports that instances of the same zinc fingers repeated along the protein do not share the same DNA specificity, and so the protein-DNA interaction seems to be highly context-dependent. The source of this dependence, however, remains unclear, highlighting that we still have a limited understanding of the behavior of long zinc fingers, or perhaps that PRDM9 presents unusual features.
To deepen our understanding of this enigmatic protein, we need to know where it binds in vivo genome-wide (for example, through PRDM9 ChIP-seq data), and to compare these locations with those of double-strand breaks (for example, as identified in mice through ChIP-seq of disrupted meiotic cDNA 1 homolog (DMC1) by Smagulova et al.). It would also be helpful to characterize how PRDM9 binding is affected by chromatin organization or by the presence of other co-factors. Such analyses would help to explain why, even though there are hundreds of thousands of motifs for any version of PRDM9, only a small subset yields double-strand breaks. In turn, an answer to this question may help us to understand what constraints restrict the state space of possible motifs to which PRDM9 could bind. In the longer term, the goal is to be able to predict, for a given variant, where double-strand breaks will occur. Such a complete understanding may be of practical use in engineering specific breaks in the genome and, regardless, will yield important insights into evolutionary biology and human genetics.
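The comparison proposed above, between binding locations and double-strand break locations, comes down to an interval-overlap calculation. The Python sketch below shows one minimal way to compute the fraction of binding sites that coincide with a break hotspot. The coordinates are invented for illustration, the function names are hypothetical, and a real analysis would use dedicated genomics tooling on actual peak calls.

```python
from typing import List, Tuple

# (chromosome, start, end) with half-open coordinates [start, end)
Interval = Tuple[str, int, int]


def overlaps(a: Interval, b: Interval) -> bool:
    """True if two intervals lie on the same chromosome and share a base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def fraction_overlapping(sites: List[Interval], hotspots: List[Interval]) -> float:
    """Fraction of binding sites that overlap at least one break hotspot."""
    if not sites:
        return 0.0
    hits = sum(any(overlaps(s, h) for h in hotspots) for s in sites)
    return hits / len(sites)


# Invented example coordinates, not real data.
binding_sites = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 100, 200)]
dsb_hotspots = [("chr1", 150, 400), ("chr2", 300, 450)]

frac = fraction_overlapping(binding_sites, dsb_hotspots)
print(f"{frac:.2f} of binding sites overlap a break hotspot")
```

The pairwise scan is quadratic in the number of intervals; for genome-scale peak sets, sorting the intervals or using an interval tree would be the usual choice.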
ChIP-seq: chromatin immunoprecipitation followed by high-throughput sequencing
PRDM9: PR domain zinc finger protein 9
I would like to thank Molly Przeworski for helpful discussions and comments on this manuscript.