- Open Access
Gene editing of the multi-copy H2A.B gene and its importance for fertility
- Nur Diana Anuar1,
- Sebastian Kurscheid†1, 6,
- Matt Field†1, 2, 6,
- Lei Zhang3,
- Edward Rebar3,
- Philip Gregory3, 4,
- Thierry Buchou5,
- Josephine Bowles†6,
- Peter Koopman†6,
- David J. Tremethick1 and
- Tatiana A. Soboleva1
© The Author(s). 2019
- Received: 8 April 2018
- Accepted: 16 January 2019
- Published: 31 January 2019
Altering the biochemical makeup of chromatin by the incorporation of histone variants during development represents a key mechanism in regulating gene expression. The histone variant H2A.B, H2A.B.3 in mice, appeared late in evolution and is most highly expressed in the testis. In the mouse, it is encoded by three different genes. H2A.B expression is spatially and temporally regulated during spermatogenesis, being most highly expressed in the haploid round spermatid stage. Active genes gain H2A.B where it directly interacts with polymerase II and RNA processing factors within splicing speckles. However, the importance of H2A.B for gene expression and fertility is unknown.
Here, we report the first mouse knockout of this histone variant and its effects on fertility, nuclear organization, and gene expression. In view of the controversy related to the generation of off-target mutations by gene editing approaches, we test the specificity of TALENs by disrupting the H2A.B multi-copy gene family using only one pair of TALENs. We show that TALENs do display a high level of specificity since no off-target mutations are detected by bioinformatics analyses of exome sequences obtained from three consecutive generations of knockout mice and by Sanger DNA sequencing. Male H2A.B.3 knockout mice are subfertile and display an increase in the proportion of abnormal sperm and clogged seminiferous tubules. Significantly, a loss of proper RNA Pol II targeting to distinct transcription–splicing territories and changes to pre-mRNA splicing are observed.
We have produced the first H2A.B knockout mouse using the TALEN approach.
- Histone variants
- RNA polymerase II
- Pre-mRNA splicing
- Splicing speckles
- Genome editing
Histones and DNA form the core structure of chromatin, the nucleosome core, in which ~145 base pairs of DNA are wrapped around a histone octamer comprising a (H3-H4)₂ tetramer flanked by two H2A-H2B dimers. Importantly, the structure and the protein interaction surface of a nucleosome can be altered by the substitution of one or more of the major core histones with their variant forms to regulate gene expression and other DNA-dependent processes.
Despite being discovered over a decade ago, the in vivo importance of the mammalian histone H2A variant H2A.B remains unknown. H2A.B appeared late in evolution in mice (H2A.B.3) and humans (H2A.B), and it is predominantly expressed in the testis with low expression levels in the brain [2, 3]. In the testis, it is expressed from the pachytene stage of the prophase of meiosis until the haploid round spermatid stage, where its expression peaks. The expression of H2A.B.3 is coincident with the highest level of transcription in the testis.
A striking in vitro feature of H2A.B/B.3 is their ability to completely unfold chromatin, thus permitting high levels of RNA Pol II transcription [2, 3]. In the testis, ChIP-Seq and RNA-Seq experiments revealed that H2A.B.3 was incorporated both at the transcription start site and the gene body of a gene, and this incorporation was positively correlated with gene expression. Consistent with these locations, ChIP-mass spec experiments revealed that H2A.B.3 directly interacts with the initiation and elongation forms of RNA Pol II and RNA processing factors. Therefore, H2A.B may enhance transcription directly by decompacting chromatin and/or indirectly by recruiting RNA Pol II to an active gene. Most interestingly, the nuclei of round spermatids revealed a novel nuclear organization where large domains of H2A.B.3-containing chromatin exist, which colocalize with splicing speckles and the initiation and elongation forms of RNA Pol II, indicating that active transcription occurs within splicing speckles. Taken together, an attractive hypothesis is that RNA Pol II might be recruited to splicing speckles by H2A.B.3 to facilitate high levels of transcription.
The availability of a H2A.B.3 knockout would therefore be an invaluable tool for investigating the importance of H2A.B in fertility and testing the above hypothesis. However, as the H2A.B.3 protein is expressed from three different genes, this is a technically challenging feat. Our aim was therefore to first devise the most efficient and specific strategy to inhibit the function of all three genes.
Previously, transcription activator-like effector nucleases (TALENs) were the choice to perform gene and genome editing [4, 5], but more recently this technology has been superseded by the CRISPR-Cas9 system. However, while improvements are continually being made [7–9], one issue has been the extent of off-target mutations generated with the use of CRISPR-Cas9, which appears to be greater when compared with TALENs [10–14]. One attractive feature of TALENs, which enables a higher level of specificity, is that the FokI nuclease domain will only cleave DNA when dimerized, which occurs when the two TALE domains bind to DNA (on opposite strands) in close proximity to each other. Here, we used TALENs to genetically impair the function of all three H2A.B.3-encoding genes; moreover, we used only one pair of TALENs to do this. To our knowledge, this has not been done before. This H2A.B.3 KO mouse displayed a subfertile phenotype and revealed a correlation between the proper RNA Pol II targeting to distinct transcription–splicing territories and pre-mRNA splicing outcomes.
The design, activity and specificity of TALENs
There are three H2A.B.3-encoding genes (H2Afb3, Gm14920, H2Afb2, which are > 92% identical) plus a pseudogene (Gm14904), all located on the X chromosome in the mouse (the pseudogene was present in Ensembl release 57 but subsequently removed) (Additional file 1: Table S1). We confirmed the expression of the three H2A.B.3-encoding genes in the testis, and as expected, the pseudogene was not expressed (Additional file 2: Figure S1). In order to test the specificity of TALENs against the H2A.B.3 gene family, two groups of H2A.B.3-targeting TALENs were designed. The first group of 12 TALENs was designed to target only one gene, H2Afb3. The second group of 7 TALENs was designed to target all three H2A.B.3 genes (Additional file 3: Table S2).
The activity and specificity of the TALENs were tested by employing a Dual-Luciferase Single-Strand Annealing Assay (DLSSA) and by the Cel 1 cleavage assay. The TALEN plasmids were co-transfected into Neuro2a cells in pairs in various combinations (Additional file 4: Figure S2). The results for the first group showed that all TALEN pairs, in different combinations, had the highest activity for the H2Afb3 gene, with some TALEN pairs indeed only cleaving the H2Afb3 gene (e.g., 101408:101412). The second group of TALENs showed nuclease activity for all genes, with two pairs (101421:101422 and 101421:101423) having the highest activity. Cel 1 assays confirmed these findings (Additional file 5: Figure S3). These in vitro results demonstrate the high specificity of TALENs; some of the targeted DNA sequences of the first group, designed against the H2Afb3 gene, differed by only two to three base pairs compared to the second group, and yet specificity for only the H2Afb3 gene was achieved (Additional file 6: Figure S4). The TALEN pair 101421:101422 was used to knock out (KO) the function of all three H2A.B.3 genes in mice (Additional file 6: Figure S4).
Prior to creating the H2A.B.3 KO, we determined whether any off-target genomic locations, which have limited sequence homology to the TALEN pair 101421:101422, can be identified by using NCBI BLAST against the mouse reference genome mm10 built on the C57BL/6J strain. Even relaxing this BLAST search to only 8 bp out of the 15 bp DNA recognition sequence returned no matches (data not shown).
Production of H2A.B.3 KO mice
Table 1 Genotype of all 19 pups born following TALEN injections
Genotype by Sanger sequencing
X ∆5 + 286 Y
X ∆15 Y
X ∆17 Y
X ∆10 X ∆10
X ∆12 X ∆12
Chimera of b/∆/c
X ∆15 X ∆15
Chimera of c/∆/b
X ∆22 X ∆160
X ∆5 X ∆17
X ∆5 X ∆5
X ∆65 Y
X C/T, A/G Y
X ∆26 Y
X ∆10 Y
X ∆10 Y
X ∆28 X ∆5
X ∆10 X ∆10
X ∆64 X ∆64
X ∆15 X ∆10
No successful PCR amplification
No successful PCR amplification.
X ∆17 X ∆17
X ∆5 X ∆5
X ∆5 X ∆5
X ∆16 Y
X ∆17 Y
No successful PCR amplification.
Most of the mutants carried small deletions. Interestingly, in two founders, L90-4 and L90-8, at least one of the H2A.B.3 genes (usually gm14920 or H2Afb2) was not amplifiable by PCR. However, when we used mixed primer pairs for PCR amplification (e.g., gm14920-forward and H2Afb2-reverse), we were able to detect an amplification product.
Subsequent Sanger DNA sequencing demonstrated that a fusion between gm14920 and H2Afb2 had occurred (data not shown). Unexpectedly, when the founder L74-5 mouse, which only displayed the deletions and insertions described in Table 1, was bred with a wild type (wt) female, the resulting G1 progeny also displayed a chimera between the gm14920 and H2Afb2 genes (Additional file 7: Figure S5). Continued breeding of these chimeric G1 H2A.B.3−/x females with wt males (for up to four generations) revealed that these additional mutations were lost after G2, whereas the original founder mutations (H2Afb3 X∆5+286, gm14920 X∆15, H2Afb2 X∆17) were maintained. This can be explained by the fact that TALEN activity can be retained for several cell divisions following their injection into the two-cell stage, thus producing a mosaic genotype, as previously observed [15, 16]. Collectively, the G1 generation of the nine founder H2A.B.3 genetically modified mice produced 22/78 mice displaying mosaicism, but by G3 a pure non-mosaic mouse colony was derived.
Approach to identify possible TALEN-induced off-target mutations
The strategy for the identification of possible TALEN-induced off-target mutations in H2A.B.3−/y mice was based on the sequencing of exomes of three related non-mosaic H2A.B.3−/y mice from three consecutive generations (G1-G3). This strategy avoids the problem observed in previous studies where unrelated mice were compared to assess off-target effects. Each exome sample was sequenced using 100-bp paired-end reads on the Illumina HiSeq 2000 sequencer to a depth of 350–600× coverage, yielding between 129 × 10⁶ and 228 × 10⁶ reads per sample (Additional file 8: Table S3). The breeding began by crossing ♀L90-3 with ♂L74-5 to produce NM4-G1 H2Afb3 (X∆5Y), gm14920 (X∆10Y), and H2Afb2 (X∆64Y) mice. Subsequent crosses between genetically modified H2A.B.3 female mice and wt males produced NM4-G2 and G3 H2Afb3 (X∆5Y), gm14920 (X∆10Y), and H2Afb2 (X∆64Y) mice for exome sequencing (Additional file 9: Figure S6). Because mutant females were crossed with wt males, any true off-target mutations would be identifiable, as their number would be diluted by 50% in each generation. On the other hand, natural sequence variations due to mouse strain differences would not be diluted by breeding (see below). Sequenced mouse exomes were run through our in-house variant detection pipeline (see Methods) to detect single nucleotide variants (SNVs), small indels, and larger structural variants. Importantly, TALENs do not usually produce SNVs; thus, this type of variant serves as an internal measure for mouse strain differences.
As expected, alignment with the mm10 genome identified a large number of SNVs (~1.7 × 10⁶ for G1 and G2, and ~0.69 × 10⁶ for G3) and indels (~1.7 × 10⁵ for G1 and G2, ~0.6 × 10⁵ for G3) (Additional file 10: Table S4); however, neither followed a dilution pattern but rather correlated with the original number of sequencing reads. To remove the effect of mouse strain differences, a filter for FVB/NJ was included in the alignment, which indeed greatly reduced the number of SNVs and indels (Additional file 11: Table S5). Again, the remaining SNVs and putative indels did not follow a dilution pattern over successive generations.
Experimental validation showing that indels were incorrectly predicted
An additional analysis was conducted to separate heterozygous from homozygous indels. As shown (Additional file 12: Table S6), after filtering against the FVB/NJ genome, 417 NM4-G1, 420 NM4-G2, and 325 NM4-G3 heterozygous (0/1 + 0/2 + 1/2) indels remained after alignment with the mm10 genome, but again did not display a 50% dilution pattern from G1-G3. To provide further evidence that these remaining indels are due to strain differences between FVB/NJArc and FVB/NJ mice, we interrogated the 19 indels common to all three NM4 G1-G3 mice (Additional file 13: Figure S7; the locations of these putative indels are shown in Additional file 14: Table S7).
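The dilution argument can be made concrete with a short calculation. The following is a minimal sketch, assuming that each true heterozygous off-target indel present in G1 is transmitted independently with probability 0.5 at every cross to a wt mate; the function name is illustrative, and the observed counts are the heterozygous indel numbers quoted above.

```python
def expected_dilution(initial_count, generations, transmission=0.5):
    """Expected number of heterozygous variants remaining in each generation.

    Assumes every variant is inherited independently with probability
    `transmission` at each cross to a wild-type mate (an idealization).
    """
    return [initial_count * transmission ** g for g in range(generations)]


# Heterozygous indels remaining after FVB/NJ filtering (NM4 G1, G2, G3).
observed = [417, 420, 325]
expected = expected_dilution(observed[0], len(observed))

for gen, (obs, exp) in enumerate(zip(observed, expected), start=1):
    print(f"G{gen}: observed {obs}, expected under 50% dilution {exp:.2f}")
```

Under this idealized model the G1 count would roughly halve twice by G3, which is not what the observed counts show.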
H2A.B.3 KO mice are subfertile
H2A.B.3 KO mice have defective sperm
Dynamics of TP1 chromatin incorporation and removal is altered in H2A.B.3 KO mice
Histone replacement by protamines, which requires transition proteins, is essential for the successful production of sperm. Interestingly, transition protein 1 (TP1)-deficient mice are subfertile and produce a high proportion of defective sperm that appear to resemble the abnormal sperm observed from H2A.B.3−/y mice (sperm with tails coiled around their heads) (Fig. 3). Therefore, we wondered whether the loss of H2A.B.3 could affect the dynamics of TP1 incorporation.
Transition proteins and the histone H2A variant H2A.L.2 are co-expressed, and recently, it was shown that the proper loading of TP1 was dependent upon the prior incorporation of H2A.L.2. We therefore investigated whether the loss of H2A.B.3 might affect the localization of H2A.L.2 in condensing spermatid nuclei where it is maximally expressed. As reported previously, in wt spermatids, H2A.L.2 is incorporated into all chromatin including pericentric chromatin (Fig. 4c). Strikingly, in KO spermatids, H2A.L.2 is no longer enriched in pericentric heterochromatin in the majority of condensing spermatids, indicating that the proper incorporation of H2A.L.2 is indeed impaired (Fig. 4c, d). This defect in H2A.L.2 localization is not due to changes in its expression (data not shown). Taken together, it is plausible to suggest that this misincorporation of H2A.L.2 might contribute to the altered TP1 binding observed in H2A.B.3-deficient elongating/condensing spermatids.
H2A.B.3 KO mice have clogged seminiferous tubules
Localization of RNA polymerase II within splicing speckles is perturbed in H2A.B.3 KO round spermatids
Previously, we observed a striking nuclear organization in round spermatids where distinct domains of H2A.B.3-containing chromatin colocalize with splicing speckles. Moreover, these H2A.B.3-splicing speckle domains appear to be transcriptionally active based on their colocalization with the initiation and elongation forms of RNA Pol II. Further, given that H2A.B.3-containing nucleosomes directly interact with these forms of RNA Pol II, we investigated whether the loss of H2A.B affects (1) the formation of splicing speckles and/or (2) the localization of RNA Pol II to splicing speckles.
To investigate the first possibility, we used the splicing speckle marker, Y12, to identify splicing speckles in H2A.B.3−/y versus wt round spermatids. The Y12 immunostaining patterns show that the formation and morphology of splicing speckles, defined as distinct Y12 foci that colocalize with DAPI-depleted regions, were not altered in H2A.B.3−/y round spermatids (Additional file 16: Figure S9).
The level of H3K4me3 increases in H2A.B.3 KO euchromatin
Next, we investigated whether the loss of H2A.B.3 had any impact on the abundance of post-translational modifications associated with heterochromatin (H3K9me2) or euchromatin (H3K4me3) (Additional file 17: Figure S10). Compared to wt round spermatids, there is no change in the abundance of H3K9me2 in euchromatin in H2A.B.3−/y round spermatids (Additional file 17: Figure S10a). Strikingly, in the absence of H2A.B.3, there is a marked accumulation of H3K4me3 in euchromatin compared to wt round spermatids (Additional file 17: Figure S10b). On the other hand, we did not observe an increase in H3K4me2 or H4K8ac (data not shown).
To confirm this, we measured the fluorescence intensity of the H3K9me2 or H3K4me3 signal in the heterochromatic chromocenter (CC) and in the surrounding euchromatin (EU) and calculated their ratio (CC − B)/(EU − B), where B is background autofluorescence. This ratio would decrease if euchromatin contained more H3K9me2 or H3K4me3 than the heterochromatic chromocenter. We also scored two stages of round spermatid differentiation, the early round spermatid (ERS) and late round spermatid (LRS) stage. Indeed, the average (CC − B)/(EU − B) H3K4me3 ratio was smaller in H2A.B.3−/y round spermatids compared to wt round spermatids, whereas the ratio for H3K9me2 did not change significantly (Additional file 17: Figure S10). Recently, it was shown that the removal of the testis H2B histone variant TH2B induces compensatory mechanisms including lysine crotonylation and arginine methylation. Given the existence of such redundant epigenetic mechanisms to ensure reproduction, it is attractive to speculate that this increase in H3K4me3 might be one mechanism that compensates for the loss of H2A.B.3.
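The background-corrected ratio described above is simple enough to express directly. This is a minimal sketch only; the function name and the intensity values are hypothetical and are not measurements from this study.

```python
def chromocenter_ratio(cc, eu, background):
    """Return (CC - B) / (EU - B) for a single nucleus.

    cc, eu: mean fluorescence in the chromocenter and in euchromatin.
    background: autofluorescence B measured outside the nucleus.
    """
    eu_corrected = eu - background
    if eu_corrected <= 0:
        raise ValueError("euchromatin signal must exceed background")
    return (cc - background) / eu_corrected


# Hypothetical intensities: more euchromatic signal lowers the ratio.
wt_ratio = chromocenter_ratio(cc=50.0, eu=100.0, background=10.0)
ko_ratio = chromocenter_ratio(cc=50.0, eu=190.0, background=10.0)
print(f"wt: {wt_ratio:.3f}, KO: {ko_ratio:.3f}")
```

With these illustrative numbers, a gain of signal in euchromatin alone is enough to reduce the ratio, which is the direction of change reported for H3K4me3.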
Effect of H2A.B.3 loss on transcription and splicing
Given that H2A.B.3 is incorporated into the chromatin of an active gene, we next performed a comprehensive differential gene expression analysis on purified round spermatids using high-throughput 75-bp paired-end sequencing of poly(A)-selected RNA (in triplicate, with each mRNA-seq library containing mRNA combined from two wt or two H2A.B.3 KO mice). One notable initial observation from our analyses was the large variation in the RNA expression profile between different H2A.B.3 KO biological replicates compared to wt replicates, as indicated by a principal component analysis (Additional file 18: Figure S11a). Within this background of a large variation in gene expression between H2A.B.3 KO biological replicates, a differential gene expression volcano plot reveals that the expression of 86 genes is significantly altered in the absence of H2A.B.3 (Additional file 18: Figure S11b; Additional file 19: Table S8).
Differential splicing events when comparing H2A.B.3 KO with their wild type siblings
Differential transcript usage
Alternative 5′ splice site
Alternative 3′ splice site
Alternative first exon
Alternative last exon
Use of mutually exclusive exons
A major ongoing question is whether an RNA (e.g., CRISPR)- or protein (e.g., TALEN)-based mechanism will ultimately prove to be more specific for genome editing, and this will be critical for future medical therapies. In this study, we have successfully applied the TALEN technology, whereby we used one pair of TALENs to specifically and simultaneously disrupt three gene copies of a gene family. Many human diseases are caused by gene copy number variations, and therefore, the use of a limited number of genome-editing enzymes might help in the future treatment of such diseases. The use of only one pair of TALENs, rather than multiple pairs, reduces the complexity of the TALEN approach (and hence the cost) and would reduce the likelihood of mosaicism and, most importantly, reduce the possibility of off-target mutations. Indeed, our rigorous exome analyses did not reveal any TALEN-induced off-target mutations. We also found that the computational tools used to predict putative SNVs and indels have limitations, and therefore wet-lab validation is required before any conclusions can be made about off-target mutations.
The TALEN approach did expose several interesting phenomena. Firstly, the simultaneous knockout of highly homologous H2A.B.3 genes leads to the formation of chimeras between these genes. As every chimera contained a small deletion, it is likely that terminal microhomology, a mechanism of NHEJ, was involved. It is important to point out that other technologies that rely on NHEJ for inducing mutations, like CRISPR/Cas9 technology, will also likely produce chimeras between highly homologous genes. Several rounds of backcrossing can successfully eliminate such small-scale genomic rearrangements. Secondly, we observed an all-or-nothing effect whereby the TALENs either modified all three H2A.B.3 genes or modified none, which reduced the need for a more complex breeding and screening strategy to establish an H2A.B.3 KO colony.
Having successfully produced an H2A.B.3 knockout, we were able to begin to elucidate the importance of H2A.B.3 for fertility for the first time. H2A.B.3 KO mice are subfertile and, consistent with this, our investigations reveal that no single major abnormality can be attributed to the loss of H2A.B.3; instead, several more subtle defects are observed, i.e., an increase in the proportion of defective sperm and clogged seminiferous tubules. Further, we do not know if the gene expression and RNA-splicing changes contribute to these defects. The fact that some H2A.B.3−/y mice are as fertile as wt mice, whereas other male mice are extremely infertile, hints at a possible compensatory mechanism (as observed for other histone variant knockouts) that cannot always compensate for the loss of H2A.B.3.
It is attractive to speculate that the observed increase in H3K4me3 in H2A.B.3 KO round spermatid nuclei may be involved in such a compensatory mechanism, which might explain why our differential gene expression analysis did not reveal a major change. On the other hand, the apparent large variation in the gene expression profile between individual H2A.B.3 KO testes suggests that, if H3K4me3 is involved in this compensation mechanism, its ability to compensate is stochastic, which could also explain why some H2A.B.3 KO male mice are fertile and others are less so.
It is interesting that TP1 KO mice display defective sperm that appear to resemble the abnormal sperm observed from H2A.B.3 KO mice, i.e., sperm with tails coiled around their heads. This prompted us to examine TP1 chromatin loading and subsequent release, and indeed, the timing of this process is altered in H2A.B.3 KO elongating/condensing spermatids. Furthermore, given that the proper loading of TP1 into euchromatin and heterochromatin requires prior assembly of H2A.L.2, we also investigated whether the incorporation of H2A.L.2 into chromatin was disrupted in the absence of H2A.B.3. The proper assembly of H2A.L.2 into chromatin was significantly impaired, with a notable depletion in pericentric heterochromatin. This suggests that the misincorporation of H2A.L.2 might indeed contribute to the altered dynamics of TP1 binding and release. It is interesting that the loss of one histone variant, H2A.B.3, can alter the function of another histone variant, H2A.L.2. The nature of this relationship will be the subject of further investigations.
A major finding is that in the absence of H2A.B.3, RNA Pol II no longer occupies the same territory as splicing speckles, becoming more diffuse within round spermatid nuclei. It is therefore attractive to hypothesize that H2A.B.3 might have an important role in the recruitment and/or anchoring of RNA Pol II to active territories of transcription and splicing. This is consistent with previous mass spec data where it was shown that chromatin-bound H2A.B.3 directly interacts with RNA Pol II. Significantly, this loss of RNA Pol II localization is correlated with changes in pre-mRNA splicing.
We have produced the first H2A.B knockout mouse using the TALEN approach. All three H2A.B.3 genes were successfully genetically modified to inhibit their expression by using only one pair of TALENs. We also describe a robust approach to search for any off-targets and report several interesting TALEN-related phenomena applicable if other multi-copy genes are being considered for genome editing. H2A.B.3 KO male mice were subfertile and revealed several altered phenotypes that could account for this subfertility: an increase in the proportion of abnormal sperm and seminiferous tubules that are clogged. At the molecular level, TP1 chromatin loading and unloading is impaired. Most significantly, a loss of proper RNA Pol II targeting to distinct transcription–splicing territories and changes to pre-mRNA splicing were observed.
The TALENs were assembled using the method previously described [5, 22]. The TALEN target sites were searched using a computer script developed at Sangamo. The four most common repeat variable di-residues (RVDs), NI, HD, NN, and NG, were used to recognize bases A, C, G, and T, respectively. The NK RVD was sometimes used to specify the 3′ half repeat T base. TALENs were generated by linking several pre-made TALE repeat blocks in the forms of monomer, dimer, trimer, or tetramer. The desired repeat blocks were PCR amplified using position-specific primers. The PCR products were pooled, digested with BsaI, and ligated into the TALEN expression vector, pVAXt-3Flag-NLS-TALE-Bsa3-C63-FokI, which had been linearized with the BsaI restriction enzyme. Ligations were transformed into E. coli competent cells, and single clones were picked and sequence verified.
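The base-to-RVD code stated above can be illustrated as a simple lookup. This is a sketch only and is not the Sangamo design script: the function name and the example sequence are hypothetical, and the occasional NK substitution at the 3′ half repeat is not modelled.

```python
# One RVD per target base, using the four RVDs named in the text.
RVD_FOR_BASE = {"A": "NI", "C": "HD", "G": "NN", "T": "NG"}


def rvds_for_target(target):
    """Translate a DNA target (5' to 3') into its list of RVDs."""
    target = target.upper()
    unknown = sorted(set(target) - set(RVD_FOR_BASE))
    if unknown:
        raise ValueError(f"unsupported base(s): {', '.join(unknown)}")
    return [RVD_FOR_BASE[base] for base in target]


# Hypothetical 7-bp site, for illustration only.
print("-".join(rvds_for_target("GATTACA")))
```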
Dual-Luciferase Single-Strand Annealing Assay
The Dual-Luciferase Single-Strand Annealing Assay (DLSSA) is based on the Dual-Luciferase Reporter Assay System from Promega (Madison, WI). In the DLSSA, the firefly luciferase reporter gene is split into two inactive fragments with overlapping repeated sequences separated by stop codons and one of the three H2A.B.3 genes. Introduction of a double strand break in the H2A.B.3 gene by the TALEN pair initiates recombination between the flanking repeats by the single-strand annealing pathway and produces an active luciferase gene. Neuro2a cells (ATCC #CCL-131) were cultured in Dulbecco’s modified Eagle’s medium (DMEM, Cellgro Cat#: 10-013-CV) plus 10% FBS and 2 mM L-Glu (PSG optional antibiotic). One day before transfection, 20,000 cells per well were seeded into a 96-well plate. Cells were transiently transfected using Lipofectamine 2000 (Thermo Fisher Scientific) with four plasmids including a pair of TALENs, the SSA firefly luciferase reporter, and the internal control Renilla luciferase reporter (6.25 ng DNA for each component). The luciferase assay was performed 24 h post-transfection according to the manufacturer’s instructions. TALEN activity was measured as the ratio of the firefly and the Renilla luminescence units.
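The activity readout described above is a normalization of the reporter signal to the internal control. A minimal sketch, with a hypothetical function name and made-up luminescence readings:

```python
def talen_activity(firefly, renilla):
    """Firefly luminescence normalized to the Renilla internal control."""
    if renilla <= 0:
        raise ValueError("Renilla signal must be positive")
    return firefly / renilla


# Hypothetical readings (relative light units) for two wells.
with_talens = talen_activity(firefly=12000.0, renilla=4000.0)
reporter_only = talen_activity(firefly=800.0, renilla=4000.0)
print(with_talens, reporter_only)
```

Normalizing to Renilla corrects for well-to-well differences in transfection efficiency and cell number, so ratios from different TALEN pairs can be compared.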
Surveyor nuclease assay
The assay has been described in detail previously. Briefly, mouse Neuro2A cells (2 × 10⁵) were seeded into a 96-well plate the day before transfection. The cells were then transfected with a pair of TALEN DNAs (400 ng of each) using Lipofectamine 2000 (Thermo Fisher Scientific). The genomic DNA was purified 48 h post-transfection. Gene-specific primer pairs were used to amplify each of the H2A gene variants. The primer sequences are H2Afb3 forward (5′-CAGCAGAAAGCAGCCAAGTGG) and reverse (5′-GCAGGTCAGCCAAGAAGCA); Gm14920 forward (5′-GTACGGTACAAAGGGAGATG) and reverse (5′-GAGCAGGTCAGCCAAGCAGAG); H2afb2 forward (5′-CAGGTCAGCAGAGAGCAATT) and reverse (5′-CTCCATACTGCTGTAGACCT); and pseudo Lap forward (5′-GTCAGCAGAATGCAGCCAAATAT) and reverse (5′-CAAGCCAGTAGCCAACATCAAG). The PCR products were treated with Surveyor Nuclease (Cel-1) and resolved by PAGE. The nuclease activity of TALENs was measured by quantifying the proportion of the cleaved DNA fragments.
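The quantification step can be sketched as follows. The band intensities are hypothetical, and the conversion from cleaved fraction to an estimated indel frequency, 1 − √(1 − f), is a commonly used convention for mismatch-cleavage assays that is assumed here; the text above states only that the proportion of cleaved fragments was quantified.

```python
import math


def fraction_cleaved(parent_band, cleaved_bands):
    """Proportion of total band intensity found in the cleavage products."""
    total = parent_band + sum(cleaved_bands)
    if total <= 0:
        raise ValueError("total band intensity must be positive")
    return sum(cleaved_bands) / total


def estimated_indel_fraction(f_cleaved):
    """Assumed convention: indel frequency = 1 - sqrt(1 - fraction cleaved)."""
    return 1.0 - math.sqrt(1.0 - f_cleaved)


# Hypothetical densitometry values for one lane.
f = fraction_cleaved(parent_band=64.0, cleaved_bands=[18.0, 18.0])
print(f"cleaved: {f:.2f}, estimated indels: {estimated_indel_fraction(f):.2f}")
```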
TALEN mRNA synthesis
Plasmids (in Sangamo backbone) were linearized with XbaI and the products were purified (PCR Cleanup kit, Qiagen). Capped and poly(A)-tailed mRNA transcripts were produced using a mMESSAGE mMACHINE T7 Ultra Transcription Kit (Life Technologies) following the manufacturer’s instructions. mRNA was purified using a MEGAclear kit (Life Technologies) and was eluted in RNase-free water, and single-use aliquots were frozen.
Production of TALEN-mediated genetically modified mice
Prior to injection, mRNA was diluted to 10 ng/μl in filtered RNase-free microinjection buffer (10 mM Tris-HCl, pH 7.4, 0.25 mM EDTA). TALEN mRNA was delivered into the pronucleus of single-cell embryos (FVB) using standard techniques. Injected embryos were cultured overnight to the two-cell stage and then surgically transferred into the oviducts of pseudopregnant CD1 mice.
Analysis of potential founders and genotyping of successive generations
Transgenic founders with mutated H2A.B.3 genes were identified among live born mice by PCR genotyping of ear notch tissue (Additional file 30: Table S17). Genomic DNA was extracted from ear punches using Quick Extract Solution (Epicentre) following the manufacturer’s instructions. The H2afb3, gm14920, and H2afb2 genes were amplified in a 25 μl volume using the same gene-specific primers as for the Cel 1 assay. The PCR mix (1× Platinum Taq buffer; 1.5 mM MgCl₂; 0.2 mM dNTP; 0.2 μM of forward and reverse primer; 2 units of Platinum Taq polymerase (Thermo Fisher)) was amplified using 28 cycles of 95 °C, 30 s denaturation, followed by 56–65 °C, 30 s annealing, and 72 °C, 4 min extension.
Exome sequencing and bioinformatic analyses
The male mice that were chosen for exome sequencing were genotyped prior to sequencing to confirm that all three generations had the same H2A.B.3 genotype (H2Afb3 (X∆5Y), gm14920 (X∆10Y) and H2Afb2 (X∆64Y)). The purified gDNA was subjected to exome enrichment using the Agilent SureSelectXT2 All Exon Kit. The exome capture efficiency was uniformly high, with approximately 45–50% of all DNA sequenced being exonic. Sequenced mouse exomes were analyzed using our in-house variant detection pipeline described in detail previously [23, 24]. In short, reads were aligned to the mouse reference genome mm10 using BWA, and sorted BAM alignment files were generated using SAMTools. Candidate PCR duplicate reads were annotated using Picard software (http://broadinstitute.github.io/picard/), specifically the MarkDuplicates command. SNVs and small indels were called using SAMTools Mpileup, and variants were annotated using the Variant Effect Predictor. To detect larger structural variants, both Pindel and Janda (unpublished; https://sourceforge.net/projects/janda/) were utilized. To reduce the number of total variants called, variants specific to the FVBN mouse strain were downloaded from the Sanger Institute ftp site (ftp://ftp-mouse.sanger.ac.uk/REL-1206-FVBNJ/). As these variants were originally reported relative to reference genome mm9, the UCSC liftOver tool (https://genome.ucsc.edu/cgi-bin/hgLiftOver) was used to map the FVBN variants to mm10. These FVBN-specific variants were annotated and not considered for further off-target analysis.
For the detection of INDELs in the TALEN mouse exome sequencing data, we followed the GATK recommendations “Best Practices for Germline SNP & Indel Discovery in Whole Genome and Exome Sequence”. Briefly, the alignment files in BAM format were realigned and base re-calibration was performed. The cleaned alignments of all sequenced mouse exomes were then used as input for the “HaplotypeCaller” program, resulting in raw variants in VCF format, containing both SNVs and INDELs. We then performed hard filtering of the called variants to only include INDELs and applied a quality cut-off score of 100 to include only high-confidence INDEL calls. In order to identify potential TALEN off-targets, we inspected the frequency of INDELs in each generation of TALEN mice, expecting a “dilution” effect due to back-crosses.
In order to search the mm10 reference genome for potential off-target sites, we used the in silico PCR function of the UCSC Genome Browser at http://genome.ucsc.edu/cgi-bin/hgPcr. We used the TALEN target sequences as forward and reverse primers, relaxed the minimum perfect match parameter to eight nucleotides (nt), and constrained the maximum product size to 100 bp.
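The strain-filtering step amounts to removing every called variant that also appears in the strain-specific list. The sketch below illustrates that idea only and is not the in-house pipeline: the function names, the tuple layout (chromosome, position, reference allele, alternate allele), and the example records are all hypothetical, and both lists are assumed to use mm10 coordinates already.

```python
def variant_key(chrom, pos, ref, alt):
    """Normalize a variant to a hashable key for exact matching."""
    return (chrom, int(pos), ref.upper(), alt.upper())


def remove_strain_variants(called, strain_specific):
    """Return the called variants that are absent from the strain list."""
    strain_keys = {variant_key(*v) for v in strain_specific}
    return [v for v in called if variant_key(*v) not in strain_keys]


# Hypothetical records, for illustration only.
called = [("chr1", 1000, "A", "G"), ("chr2", 500, "CT", "C"), ("chrX", 42, "G", "T")]
fvb_specific = [("chr1", "1000", "a", "g"), ("chr5", 7, "T", "C")]

candidates = remove_strain_variants(called, fvb_specific)
print(candidates)
```

Exact matching on position and alleles is a simplification; indels in particular can be represented in more than one way, so a production filter would normalize their representation first.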
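The hard-filtering step can be sketched on plain VCF text. This is an illustration under stated assumptions, not the commands used in the study: the cut-off is assumed to apply to the VCF QUAL column and to be inclusive, an indel is taken to be any record where an alternate allele differs in length from the reference allele, and the function names and example records are hypothetical.

```python
def is_indel(ref, alt_field):
    """True if any comma-separated ALT allele differs in length from REF."""
    return any(len(alt) != len(ref) for alt in alt_field.split(","))


def high_confidence_indels(vcf_lines, min_qual=100.0):
    """Keep indel records whose QUAL is at least `min_qual` (assumed cut-off)."""
    kept = []
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header lines
        chrom, pos, _id, ref, alt, qual = line.rstrip("\n").split("\t")[:6]
        if qual == ".":
            continue  # no quality recorded
        if is_indel(ref, alt) and float(qual) >= min_qual:
            kept.append((chrom, int(pos), ref, alt, float(qual)))
    return kept


# Hypothetical VCF records.
example = [
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t100\t.\tA\tG\t500\t.\t.",       # SNV, dropped
    "chr1\t200\t.\tAT\tA\t150.5\t.\t.",    # deletion, kept
    "chr2\t300\t.\tC\tCGG\t40\t.\t.",      # insertion, low quality, dropped
]
print(high_confidence_indels(example))
```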
Experimental testing of putative indels
To test whether the 19 putative indels are present in NM4 G1-G3 mice, gDNA from wild-type FVB/NJArc in-house mice and NM4 G1-G3 KO mice was amplified with primers flanking the predicted indels (listed in Additional file 31: Table S18). The PCR products were analyzed by electrophoresis (7% acrylamide gel) and by Sanger sequencing.
Immunohistochemistry and testis surface spread preparations
The testes from 30-day-old mice (82% of tubules contain round spermatids at this stage) were removed and subjected to immunohistochemistry and DAPI staining as previously described.
Purification of pure round spermatids from adult mouse testis
The purification was performed exactly as described elsewhere. Briefly, one or two male mice were used per purification. The animals, aged between 8 and 10 weeks, were sacrificed, the testes were extracted, and the albuginea was removed. The seminiferous tubules were digested for 15 min at 35 °C with 1 mg/ml of collagenase in PBS, supplemented with lactate and glucose. The cells were washed, and further dissociation was performed by vigorous pipetting for 10 min on ice. The cells were filtered twice through a 100-μm filter and loaded onto a 2–4% BSA gradient (in 360 ml). The cells were allowed to sediment for 75 min. The cells were collected into 10-ml fractions, and each fraction was analyzed under a phase-contrast microscope. The fractions that contained only round spermatids were pooled and used for mRNA purification.
Fblim1 FW GTCTGTGGTTTCTGTCACAAG
Fblim1 REV GGTAGCAGGGTTCGCACAG
Slc37a3 FW GGACTCTGCTAGGCTTCATC
Slc37a3 REV CTATACCCAACAAGGGACCC
Styx FW CATTCAGCAATCACTCCTGC
Styx REV GGTTTTCTTAATGTTAGTAGGAG
Traf7 FW GTCTGGAGTATGGACAACATG
Traf7 REV CTTCACGGTGCTATCCACAG
Antibodies used in this study
Antibodies used in this study were rabbit polyclonal anti-TP1 (ab73135, IF 1:100), rabbit serum anti-H2A.L.2 (gift from S Khochbin, IF 1:50), rabbit polyclonal anti-Pol II S2 (ab5095, IF 1:100), mouse monoclonal anti-Y12 (gift from B Westman, IF 1:200), rabbit polyclonal anti-H3K4me3 (ab8580, IF 1:100), and mouse monoclonal anti-H3K9me2 (ab1220, IF 1:100).
We thank Tara Davidson for assistance with microinjections of the TALENs and Dan Andrews for bioinformatics support in analyzing the exome sequencing data. We acknowledge the excellent high-throughput DNA sequencing service provided by our in-house Biomolecular Research Service headed by Stephanie Palmer, and the JCSMR mouse containment suite technicians responsible for the breeding and care of the mice. We especially thank Saadi Khochbin for providing us with H2A.L.2 antibodies.
This work was supported by the National Health and Medical Research Council NHMRC (grant number 1058941).
Availability of data and materials
RNA-Seq data are available from ArrayExpress (accession E-MTAB-7277).
TS and DT conceived the study and wrote the paper. TS devised the analyses. MF and SK performed exome analyses. SK also performed the gene expression and differential splicing analyses. LZ, ER, and PG designed the TALENs and performed the Dual-Luciferase Single-Strand Annealing Assay (DLSSA) and the cel1 cleavage assays. JB and PK generated the TALEN founder mice. NDA performed the screening, genotyping, and cel1 assays, prepared exome-seq samples, produced the KO mice, and performed all the immunostaining experiments. TB provided the technical expertise and the experimental setup to purify the round spermatids. All authors read and approved the final manuscript.
Ethics approval and consent to participate
All mouse work was conducted according to protocols approved and authorised by the Animal Ethics Committees of the University of Queensland and the Australian National University (Protocol A2014/33).
The authors declare that they have no competing interests.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
- Chadwick BP, Willard HF. A novel chromatin protein, distantly related to histone H2A, is largely excluded from the inactive X chromosome. J Cell Biol. 2001;152:375–84.
- Soboleva TA, Nekrasov M, Pahwa A, Williams R, Huttley GA, Tremethick DJ. A unique H2A histone variant occupies the transcriptional start site of active genes. Nat Struct Mol Biol. 2012;19:25–30.
- Soboleva TA, Parker BJ, Nekrasov M, Hart-Smith G, Tay YJ, Tng WQ, Wilkins M, Ryan D, Tremethick DJ. A new link between transcriptional initiation and pre-mRNA splicing: the RNA binding histone variant H2A.B. PLoS Genet. 2017;13:e1006633.
- Hockemeyer D, Wang H, Kiani S, Lai CS, Gao Q, Cassady JP, Cost GJ, Zhang L, Santiago Y, Miller JC, et al. Genetic engineering of human pluripotent cells using TALE nucleases. Nat Biotechnol. 2011;29:731–4.
- Miller JC, Tan S, Qiao G, Barlow KA, Wang J, Xia DF, Meng X, Paschon DE, Leung E, Hinkley SJ, et al. A TALE nuclease architecture for efficient genome editing. Nat Biotechnol. 2011;29:143–8.
- Kim JS. Genome editing comes of age. Nat Protoc. 2016;11:1573–8.
- Doench JG, Fusi N, Sullender M, Hegde M, Vaimberg EW, Donovan KF, Smith I, Tothova Z, Wilen C, Orchard R, et al. Optimized sgRNA design to maximize activity and minimize off-target effects of CRISPR-Cas9. Nat Biotechnol. 2016;34:184–91.
- Kleinstiver BP, Pattanayak V, Prew MS, Tsai SQ, Nguyen NT, Zheng Z, Joung JK. High-fidelity CRISPR-Cas9 nucleases with no detectable genome-wide off-target effects. Nature. 2016;529:490–5.
- Shin J, Jiang F, Liu JJ, Bray NL, Rauch BJ, Baik SH, Nogales E, Bondy-Denomy J, Corn JE, Doudna JA. Disabling Cas9 by an anti-CRISPR DNA mimic. Sci Adv. 2017;3:e1701620.
- Duan J, Lu G, Xie Z, Lou M, Luo J, Guo L, Zhang Y. Genome-wide identification of CRISPR/Cas9 off-targets in human genome. Cell Res. 2014;24:1009–12.
- Kim D, Kim S, Kim S, Park J, Kim JS. Genome-wide target specificities of CRISPR-Cas9 nucleases revealed by multiplex Digenome-seq. Genome Res. 2016;26:406–15.
- Schaefer KA, Wu WH, Colgan DF, Tsang SH, Bassuk AG, Mahajan VB. Unexpected mutations after CRISPR-Cas9 editing in vivo. Nat Methods. 2017;14:547–8.
- Xu P, Tong Y, Liu XZ, Wang TT, Cheng L, Wang BY, Lv X, Huang Y, Liu DP. Both TALENs and CRISPR/Cas9 directly target the HBB IVS2-654 (C > T) mutation in beta-thalassemia-derived iPSCs. Sci Rep. 2015;5:12065.
- Kosicki M, Tomberg K, Bradley A. Repair of double-strand breaks induced by CRISPR-Cas9 leads to large deletions and complex rearrangements. Nat Biotechnol. 2018;36:765–71.
- Li C, Qi R, Singleterry R, Hyle J, Balch A, Li X, Sublett J, Berns H, Valentine M, Valentine V, Sherr CJ. Simultaneous gene editing by injection of mRNAs encoding transcription activator-like effector nucleases into mouse zygotes. Mol Cell Biol. 2014;34:1649–58.
- Sung YH, Baek IJ, Kim DH, Jeon J, Lee J, Lee K, Jeong D, Kim JS, Lee HW. Knockout mice created by TALEN-mediated gene targeting. Nat Biotechnol. 2013;31:23–4.
- Van der Auwera GA, Carneiro MO, Hartl C, Poplin R, Del Angel G, Levy-Moonshine A, Jordan T, Shakir K, Roazen D, Thibault J, et al. From FastQ data to high confidence variant calls: the Genome Analysis Toolkit best practices pipeline. Curr Protoc Bioinformatics. 2013;43:11.10.11–33.
- Yu YE, Zhang Y, Unni E, Shirley CR, Deng JM, Russell LD, Weil MM, Behringer RR, Meistrich ML. Abnormal spermatogenesis and reduced fertility in transition nuclear protein 1-deficient mice. Proc Natl Acad Sci U S A. 2000;97:4683–8.
- Montellier E, Boussouar F, Rousseaux S, Zhang K, Buchou T, Fenaille F, Shiota H, Debernardi A, Hery P, Curtet S, et al. Chromatin-to-nucleoprotamine transition is controlled by the histone H2B variant TH2B. Genes Dev. 2013;27:1680–92.
- de Smith AJ, Walters RG, Froguel P, Blakemore AI. Human genes involved in copy number variation: mechanisms of origin, functional effects and implications for disease. Cytogenet Genome Res. 2008;123:17–26.
- Lieber MR, Gu J, Lu H, Shimazaki N, Tsai AG. Nonhomologous DNA end joining (NHEJ) and chromosomal translocations in humans. Subcell Biochem. 2010;50:279–96.
- Miller JC, Zhang L, Xia DF, Campo JJ, Ankoudinova IV, Guschin DY, Babiarz JE, Meng X, Hinkley SJ, Lam SC, et al. Improved specificity of TALE-based genome editing using an expanded RVD repertoire. Nat Methods. 2015;12:465–71.
- Andrews TD, Whittle B, Field MA, Balakishnan B, Zhang Y, Shao Y, Cho V, Kirk M, Singh M, Xia Y, et al. Massively parallel sequencing of the mouse exome to accurately identify rare, induced mutations: an immediate source for thousands of new mouse models. Open Biol. 2012;2:120061.
- Field MA, Cho V, Andrews TD, Goodnow CC. Reliably detecting clinically important variants requires both combined variant calls and optimized filtering strategies. PLoS One. 2015;10:e0143199.
- Li H, Durbin R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics. 2009;25:1754–60.
- Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R, 1000 Genome Project Data Processing Subgroup. The sequence alignment/map format and SAMtools. Bioinformatics. 2009;25:2078–9.
- Li H. A statistical framework for SNP calling, mutation discovery, association mapping and population genetical parameter estimation from sequencing data. Bioinformatics. 2011;27:2987–93.
- McLaren W, Pritchard B, Rios D, Chen Y, Flicek P, Cunningham F. Deriving the consequences of genomic variants with the Ensembl API and SNP effect predictor. Bioinformatics. 2010;26:2069–70.
- Ye K, Schulz MH, Long Q, Apweiler R, Ning Z. Pindel: a pattern growth approach to detect break points of large deletions and medium sized insertions from paired-end short reads. Bioinformatics. 2009;25:2865–71.
- Kent WJ. BLAT--the BLAST-like alignment tool. Genome Res. 2002;12:656–64.
- Buchou T, Tan M, Barral S, Vitte AL, Rousseaux S, Arechaga J, Khochbin S. Purification and analysis of male germ cells from adult mouse testis. Methods Mol Biol. 2017;1510:159–68.
- Domaschenz R, Kurscheid S, Nekrasov M, Han S, Tremethick DJ. The histone variant H2A.Z is a master regulator of the epithelial-mesenchymal transition. Cell Rep. 2017;21:943–52.
- Trincado JL, Entizne JC, Hysenaj G, Singh B, Skalic M, Elliott DJ, Eyras E. SUPPA2: fast, accurate, and uncertainty-aware differential splicing analysis across multiple conditions. Genome Biol. 2018;19:40.
- Anuar ND, Kurscheid S, Field M, Zhang L, Rebar E, Gregory P, Buchou T, Bowles J, Koopman P, Tremethick DJ, Soboleva TA. Gene editing of the multi-copy H2A.B gene and its importance for fertility. https://www.ebi.ac.uk/arrayexpress/experiments/E-MTAB-7277/.