An ENU mutagenesis screen identifies novel and known genes involved in epigenetic processes in the mouse

Background: We have used a sensitized ENU mutagenesis screen to produce mouse lines that carry mutations in genes required for epigenetic regulation. We call these lines Modifiers of murine metastable epialleles (Mommes).
Results: We report a basic molecular and phenotypic characterization for twenty of the Momme mouse lines, and in each case we also identify the causative mutation. Three of the lines carry a mutation in a novel epigenetic modifier, Rearranged L-myc fusion (Rlf), and one gene, Rap-interacting factor 1 (Rif1), has not previously been reported to be involved in transcriptional regulation in mammals. Many of the other lines are novel alleles of known epigenetic regulators. For two genes, Rlf and Widely-interspaced zinc finger (Wiz), we describe the first mouse mutants. All of the Momme mutants show some degree of homozygous embryonic lethality, emphasizing the importance of epigenetic processes. The penetrance of lethality is incomplete in a number of cases. Similarly, abnormalities in phenotype seen in the heterozygous individuals of some lines occur with incomplete penetrance.
Conclusions: Recent advances in sequencing enhance the power of sensitized mutagenesis screens to identify the function of previously uncharacterized factors and to discover additional functions for previously characterized proteins. The observation of incomplete penetrance of phenotypes in these inbred mutant mice, at various stages of development, is of interest. Overall, the Momme collection of mouse mutants provides a valuable resource for researchers across many disciplines.

Background
Mutagenesis screens for modifiers of position effect variegation in Drosophila have played a defining role in the development of the field of epigenetics [1]. The screens used a fly strain that showed variegated expression of the white (w) locus, resulting in red and white patches in the eye, as a result of the stochastic establishment of epigenetic state. The genes identified by these screens turn out to have pivotal roles in gene silencing [2][3][4]. We have designed a similar screen in the mouse, using a green fluorescent protein (GFP) transgene that shows variegated expression in red blood cells. Offspring of N-ethyl-N-nitrosourea (ENU)-treated males are screened for changes in the percentage of erythrocytes expressing GFP; this is a screen for dominant effects. Each mutant line is named a Modifier of murine metastable epiallele Dominant, MommeD [5]. The underlying mutations have been identified and published for nine lines and the mutations occur in DNA methyltransferases (Dnmt1 and Dnmt3b), chromatin remodelers (Smarca5 and Baz1b), a histone deacetylase (Hdac1), a transcriptional co-repressor (Trim28), a eukaryotic translation initiation factor (eIF3h) [6][7][8][9][10] and a previously unknown gene, Smchd1, now shown to be required for X-inactivation in the mouse [11,12]. Recently, Smchd1 has been shown to act as a tumor suppressor [13] and mutations in SMCHD1 have been shown to be tightly associated with the human disease facioscapulohumeral dystrophy type 2 (FSHD2) [14].

Identification of Momme mutants
We have now screened a total of approximately 5,000 G1 offspring, recovered 42 MommeD lines and identified the underlying mutations in 29 cases. These lines carry mutations in 18 unique genes (Table 1). A number of the genes have been hit more than once. Based on their effect on GFP expression, 14 lines were classified as suppressors of variegation, that is, the mutation increased the percentage of erythrocytes expressing GFP, and 15 lines were classified as enhancers of variegation, that is, the mutation decreased the percentage of erythrocytes expressing GFP (Table 1). Here we report, for the first time, the underlying mutations in 20 MommeD lines, which carry mutations in 10 unique genes.
The experimental pipeline for the screen is shown in Figure 1. In brief, ENU mutagenesis was carried out as described previously [5] and mapping was carried out following a G2 backcross to Line3C (a C57BL/6J strain carrying the same GFP transgene array at the same location) using traditional microsatellite and SNP genotyping or an Illumina GoldenGate SNP genotyping assay. Mapping intervals for the latter group are shown as Manhattan plots in Additional file 1.
To identify the underlying mutations, several different strategies have been used. In ten cases, the mutation was identified following whole exome deep sequencing of DNA from one mutant mouse from the MommeD inbred colony. For these, variants were called within the intervals linked to the causative mutation. For some lines, variant calling was extended exome-wide to assess passenger mutations, but only a small number (<10) was found. These did not reflect the typical ENU sequence bias (T/A) [30] and probably represent normal background mutations.
In two instances, we have used a custom capture array to specifically enrich for DNA from a 4.2 Mbp interval previously identified by linkage analysis [7] followed by deep sequencing to identify the underlying mutations. In the remaining eight cases, Sanger sequencing of the exons (including splice sites) of candidate genes in the linked intervals was carried out. Details on identification of the mutations are provided in Materials and methods. In all cases, putative mutations were verified by PCR and Sanger sequencing in larger cohorts (at least 100 mice per mutant line).
In the two cases in which a custom capture array was used, we had an opportunity to obtain an estimate of the ENU mutation rate. In one case four mutations were found in the 4.2 Mbp captured interval and in the other case six mutations were found. Based on these results we estimated the ENU mutation rate at approximately 1 per Mbp. This is consistent with reports by others [31].
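The arithmetic behind this estimate can be written out explicitly. The sketch below uses only the numbers given above (four and six mutations, each in a 4.2 Mbp captured interval); pooling the two lines into a single rate is an assumption made here for illustration, since the text does not say how the two counts were combined.

```python
# Worked arithmetic for the ENU mutation-rate estimate quoted in the text.
CAPTURED_INTERVAL_MBP = 4.2      # size of the captured interval (from the text)
MUTATIONS_PER_LINE = [4, 6]      # mutations found in the two captured lines (from the text)


def mutations_per_mbp(mutation_counts, interval_mbp):
    """Pooled rate: total mutations divided by total sequence surveyed.

    Pooling is an illustrative assumption, not a method stated in the paper.
    """
    total_mutations = sum(mutation_counts)
    total_mbp = interval_mbp * len(mutation_counts)
    return total_mutations / total_mbp


rate = mutations_per_mbp(MUTATIONS_PER_LINE, CAPTURED_INTERVAL_MBP)
print(f"Pooled estimate: {rate:.2f} mutations per Mbp")
```

Ten mutations over 8.4 Mbp gives about 1.2 per Mbp, in line with the approximate figure of 1 per Mbp reported above.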
The heritability of the mutations was tested over at least six generations. Flow cytometric expression profiles and the percentage of erythrocytes expressing the transgene for each MommeD (presented here for the first time) are shown in Additional file 2.
MommeD30 mice are haploinsufficient for Widely interspaced zinc finger (Wiz)
MommeD30 was classified as an enhancer of variegation. We mapped the MommeD30 mutation to a 1.9 Mbp interval on chromosome 17 encompassing 48 genes (Additional file 3). Exome deep sequencing identified a single base deletion in exon 5 of the Wiz gene, causing a frame shift mutation that is predicted to introduce a premature stop codon (Figure 2a and Table 1). No other putative ENU variants were identified in the linked region.
Wiz contains six Kruppel-type zinc finger motifs in a widely interspaced manner and has been reported to associate with the histone H3 lysine 9 (H3K9) methyltransferases G9a and GLP in cell lines [32,33]. No mouse mutants for Wiz have so far been described. Western blot analysis using an anti-Wiz antibody detected two bands (about 120 and 130 kDa), as expected, in wild-type embryos (Figure 2a). Embryos heterozygous for the MommeD30 mutation showed approximately half the amount of Wiz protein and no Wiz protein was detected in homozygotes (Figure 2a). We conclude that MommeD30 mice are haploinsufficient for Wiz. We refer to this allele as Wiz MommeD30.
MommeD18 carries a mutation in Rap interacting factor 1 (Rif1)
MommeD18 was classified as a suppressor of variegation and candidate gene sequencing revealed that MommeD18 mice carry a nonsense mutation in exon 29 of the Rif1 gene (Figure 2b and Table 1). Two other candidate genes (Epc2 and Mbd5) were sequenced and no mutations were identified in these genes. Western blot analysis showed approximately half the amount of full-length Rif1 protein in MommeD18 heterozygotes and no full-length Rif1 protein in a homozygote (Figure 2b). Furthermore, a Rif1 gene trap allele (Materials and methods) had a similar effect on transgene expression as that observed with the MommeD18 mutation, increasing the percentage of expressing cells in mice heterozygous for the gene-trap allele (Additional file 4).
We conclude that the MommeD18 allele is the result of a mutation in Rif1 that we refer to as Rif1 MommeD18. Rif1 has recently been shown to be involved in DNA double strand break repair [34][35][36][37] and replication timing in mammals [38,39], both processes involving significant chromatin reorganization. Our findings suggest a role for Rif1 in transcription.
[Table 1 footnote: In some cases the mutation can be defined as hypomorphic or null and in other cases this designation is less certain and we have added a question mark.]
MommeD8, MommeD28 and MommeD34 carry mutations in Rearranged L-myc fusion (Rlf), a gene about which little is known
Three enhancers of variegation, MommeD8, MommeD28 and MommeD34, were found to carry mutations in Rlf.
Whole interval capture revealed that MommeD8 and MommeD34 mice have mutations in exon 8 and exon 7, respectively. MommeD8 is a missense mutation and MommeD34 is a nonsense mutation (Figure 2c and Table 1). The MommeD28 mutation was identified by candidate sequencing and affects the splice acceptor site of intron 4 (Figure 2c and Table 1). Western blot analysis using two independent anti-Rlf antibodies (one polyclonal and one monoclonal) detected an approximately 280 kDa band in protein lysates made from wild-type embryos (Figure 2d). Although the band detected on the western blots is substantially higher than the predicted molecular weight of 218 kDa (NCBI Mouse Build 37), this could be due to post-translational modifications to the Rlf protein. We concluded that it is likely that this band is Rlf. Reduced amounts (MommeD8) or no (MommeD28 and MommeD34) Rlf protein was detected in homozygous embryos (Figure 2d). Together, these results suggest that MommeD8 is a hypomorphic allele and that MommeD28 and MommeD34 are null alleles.
Rlf encodes a protein predicted to contain 16 widely spaced zinc finger domains, which has led to the suggestion that it has a role in transcriptional regulation [40]. Rlf is conserved across vertebrates. To determine the localization of Rlf in the cell, we carried out cell fractionation experiments. Rlf protein was specifically detected in the nuclear fraction of HeLa cells (Figure 2e).
Figure 1. An ENU mutagenesis/whole exome deep sequencing pipeline enables rapid identification of causative mutations in mice with defects in epigenetic gene silencing. A schematic overview of the ENU mutagenesis gene discovery pipeline is presented. The major components of the screen are described in the figure. Briefly, male FVB/NJ mice carrying an epigenetically sensitive GFP transgene (Line3) were treated with ENU and mated with female Line3 mice. Offspring were screened for a shift in the percentage of GFP expressing erythrocytes using flow cytometry. Putative mutants were mated with Line3 mice for four generations to test for heritability and reproducibility of the GFP expression, and to reduce the number of non-causative ENU mutations within the genomes. Linkage analysis was carried out on the offspring from two generations of backcrosses between putative mutants and C57BL/6J mice, which also carry the GFP transgene (Line3C). Causative mutations were identified by whole exome deep sequencing, or gene candidate sequencing on individuals that had been maintained on the Line3 background for at least seven generations. FACS, fluorescence-activated cell sorting.

[Figure label: MommeD30, frameshift at aa 553]
To our knowledge, we have identified the first mouse mutants for Rlf and hereafter refer to these alleles as Rlf MommeD8, Rlf MommeD28 and Rlf MommeD34. We have identified Rlf from a screen that is based on the expression of a multi-copy variegating transgene. To test if Rlf is required for the establishment of epigenetic state at an endogenous locus, we studied the effect of Rlf MommeD8 on expression of agouti viable yellow (A vy). A vy is a single copy gene that, like the transgene, is known to be sensitive to epigenetic state [41]. The A vy allele is the result of an intracisternal-A-particle (IAP) retrotransposon insertion upstream of the agouti gene and isogenic mice carrying the A vy allele can be yellow, mottled or pseudoagouti (dark brown). Coat color has been shown to correlate with the level of DNA methylation at the long-terminal repeat (LTR) promoter [42]. It has been shown that haploinsufficiency for modifiers of epigenetic reprogramming can alter the ratio of the different coat colors [5,6,43,44]. We crossed females heterozygous for Rlf MommeD8 with pseudoagouti A vy/a mice and scored the offspring for coat color. Offspring that inherited the Rlf MommeD8 allele were more likely to be pseudoagouti than their wild-type littermates, that is, haploinsufficiency for Rlf increased the probability of silencing at the A vy locus (Figure 2f). This is consistent with Rlf being an enhancer of variegation and suggests that Rlf has a general role in epigenetic regulation.
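The text does not state which statistical test was applied to the coat-color counts. As a minimal sketch of one reasonable choice, a two-sided Fisher exact test on a 2x2 table (pseudoagouti versus other coat colors, mutant versus wild-type offspring) can be computed with the standard library alone. The counts in the example are hypothetical placeholders and are not data from this study.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    def prob(x):
        # Probability of x in the top-left cell given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # The small tolerance guards against floating-point ties.
    return sum(prob(x) for x in range(lowest, highest + 1)
               if prob(x) <= p_observed * (1 + 1e-9))


# Hypothetical counts for illustration only (not taken from Figure 2f):
# rows = mutant, wild type; columns = pseudoagouti, yellow or mottled.
p_value = fisher_exact_two_sided(12, 18, 4, 26)
print(f"two-sided P = {p_value:.4f}")
```

An exact test is used here because coat-color classes in a single cross can contain small counts, where a chi-square approximation is less reliable.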

Novel alleles of known epigenetic regulators
In addition to the genes described above, we have produced 15 MommeD lines that carry mutations in genes already known to be involved in epigenetic regulation in the mouse. Exome deep sequencing of three suppressors of variegation, MommeD19, MommeD27 and MommeD39, identified mutations in the chromatin remodelers SWI/SNF related, matrix associated, actin dependent regulator of chromatin, subfamily c, member 1 (Smarcc1), Polybromo 1 (Pbrm1) and SWI/SNF related, matrix associated, actin dependent regulator of chromatin, subfamily a, member 4 (Smarca4), respectively (Table 1). Smarcc1 MommeD19 carries a mutation in the splice acceptor site of intron 10 predicted to lead to an in-frame premature stop codon and likely nonsense-mediated mRNA decay. Quantitative real-time RT-PCR and western blot analysis showed that Smarcc1 mRNA and Smarcc1 protein levels were reduced in heterozygous animals (Figure 3a). The Pbrm1 MommeD27 mutation is a missense mutation in exon 17. Western blot analysis revealed reduced levels of Pbrm1 protein in homozygotes (Figure 3b). Smarca4 MommeD39 has a mutation in the splice donor site of intron 20. cDNA sequencing revealed an extended exonic sequence (data not shown) that is predicted to lead to an in-frame premature stop codon and nonsense-mediated mRNA decay. Quantitative real-time RT-PCR analysis showed reduced Smarca4 mRNA in heterozygotes (Figure 3c).
MommeD42 was identified as an enhancer of variegation and carries a nonsense mutation in exon 11 of Bromodomain containing 1 (Brd1; Figure 3d and Table 1). Brd1 has recently been reported to form a complex with the histone acetyltransferase HBO1 and is required for the transcriptional activation of the erythroid-specific regulator genes in fetal liver [45].
Three suppressors of variegation were found to carry mutations in H3K9 methyltransferases. MommeD13 and MommeD17 carry mutations in Setdb1 and MommeD33 has a mutation in Suv39h1 (Table 1). The Setdb1 MommeD13 mutation introduces a novel splice donor site leading to an in-frame premature stop codon in exon 20 (Figure 3e). cDNA analysis confirmed the presence of the predicted transcript (data not shown). Setdb1 MommeD17 is a missense mutation in exon 21 of Setdb1 (Figure 3e). RNA analysis indicates reduced levels of Setdb1 mRNA in the MommeD13 mutants whereas Setdb1 mRNA levels were not significantly different in the MommeD17 mutants (Figure 3e). The Suv39h1 MommeD33 mutation is at the splice donor site of intron 20. This mutation is predicted to lead to a premature stop codon. Consistent with this, western blot analysis showed that the level of Suv39h1 protein was greatly reduced in hemizygous mutant males (Suv39h1 is on the X chromosome; Figure 3f). MommeD40, a suppressor of variegation, carries a nonsense mutation in exon 17 of the Uhrf1 gene (Figure 3g and Table 1). Western blot analysis revealed greatly reduced levels of Uhrf1 protein in homozygous embryos (Figure 3g). Uhrf1 is known to associate with the maintenance DNA methyltransferase Dnmt1 [46] and has recently been shown to link DNA methylation and H3K9 methylation in human cells [47].
[Figure 3 axis label: % Expression (relative to +/+)]
Approximately 60 Suppressor of variegation (Su(var)) and 25 Enhancer of variegation (E(var)) genes were identified in the Drosophila screens [2]. We have now documented 12 Su(var) and 8 E(var) genes in the mouse and these encode some of the key factors for DNA methylation and H3K9 methylation, as well as a large group of chromatin remodelers (Figure 4). Some of these genes, such as Dnmt1, Suv39h and Trim28, have been identified as core heterochromatin factors, whereas others, such as Smarca5 and Hdac1, have been associated with gene silencing at euchromatic positions [2]. This is consistent with the variegating phenotype of the transgene. Interestingly, we have not yet recovered any members of the RNA interference/piwi-interacting RNA (piRNA) pathway or Polycomb and trithorax groups of proteins but the screen has not reached saturation.
Momme genes are required for normal embryonic development
In most of the MommeD lines, heterozygous mice were observed at expected ratios at weaning and did not show any obvious phenotypes. To determine the viability of homozygotes, we performed embryonic dissections following intercrosses. We have focused on those lines carrying mutations in genes for which few or no data are available in the literature.
Following Wiz MommeD30 intercrosses, no homozygous animals were recovered at weaning out of 97 progeny (Figure 5a). To determine when Wiz MommeD30 homozygotes died, embryos from intercross matings were obtained at different stages of development and genotyped. Viable homozygotes were recovered at the expected ratios at 10.5 days post-coitum (dpc). However, these embryos appeared smaller than their wild-type and heterozygous littermates. At 12.5 dpc homozygotes were present in numbers less than expected (15% as opposed to 25%) and at 14.5 dpc no viable homozygous embryos were obtained (Figure 5a). We conclude from these data that embryonic death in mice homozygous for Wiz MommeD30 occurs between 10.5 and 12.5 dpc.
Rif1 MommeD18 intercrosses produced wild-type and heterozygous offspring at the expected ratios and occasional homozygous individuals (3 out of 155 offspring) at weaning (Figure 5b). All surviving homozygotes were males and two, out of three tested, were infertile. Timed matings revealed that the majority of the homozygotes died during the second half of gestation (Figure 5b). This is consistent with the literature [48].
Rlf intercrosses produced homozygous individuals that survived to weaning in less than expected numbers for Rlf MommeD8 , Rlf MommeD28 and Rlf MommeD34 (Figure 5c). Rlf MommeD34 homozygotes died in the first week after birth, for unknown reasons (Figure 5c). A similar finding was made for Rlf MommeD28 homozygotes; most died in the first week after birth (data not shown). However, in both lines some homozygotes did survive to adulthood and were fertile (data not shown). Embryonic dissections indicated that, for the Rlf MommeD28 and Rlf MommeD34 lines, late gestation homozygous embryos weighed significantly less than their heterozygous or wild-type littermates (T-test, P < 0.05; Figure 5c). Rlf MommeD8 behaves like a hypomorphic allele, with approximately half the expected number of homozygotes present at weaning (Figure 5c). Those surviving were smaller and were fertile (data not shown).
One advantage of ENU mutagenesis is that it produces point mutations, increasing the likelihood of hypomorphic mutations. We have reported hypomorphic alleles for Smarca5 MommeD4 and Dnmt3b MommeD14 previously [6,10]. In addition to the Rlf MommeD8 allele described above, Setdb1 MommeD17 behaves like a hypomorphic allele for the following reasons. Following intercrosses, some Setdb1 MommeD17 homozygotes were present at weaning (Figure 5d) and some survived to adulthood but were smaller than their wild-type and heterozygous littermates and showed decreased fertility (data not shown). Mice homozygous for a null allele of Setdb1 have been reported to die at the early post-implantation stage [49]. Consistent with this, timed matings with Setdb1 MommeD13 heterozygous mice produced no viable homozygous offspring at 9.5 dpc (Figure 5d). To our knowledge, Setdb1 MommeD17 is the first hypomorphic allele for Setdb1.
In two cases, Smarcc1 MommeD19 and Smarca4 MommeD39, heterozygous mutants showed signs of stochastic death prior to weaning. Following Smarcc1 MommeD19 intercrosses, no viable homozygotes were obtained at weaning and timed matings revealed that homozygous lethality occurred prior to mid-gestation, as expected based on the literature [50] (Figure 5e). At weaning the proportion of heterozygotes relative to wild types (47% and 53%, respectively) was lower than expected (66% and 33%, respectively). Maintenance of the colony, which involved heterozygous to wild type crosses, also produced fewer mutants than expected at weaning (34% as opposed to 50%) (Figure 5e). Further investigation suggested that this stochastic death of heterozygotes was occurring after 17.5 dpc and before 1 week postnatal (Figure 5e). At 17.5 dpc, some heterozygotes had exencephaly, although this could not account for all the perinatal lethality observed (Figure 5e). The average weight of heterozygous mice at 17.5 dpc and at 1 week postnatal was reduced (T-test, P < 0.0001) and greater variance (F-test, P < 0.05) was observed among heterozygotes than among wild-type littermates (Figure 5e).
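Departures from Mendelian ratios of this kind are commonly assessed with a chi-square goodness-of-fit test, although the text does not say which test was used for the genotype ratios. The sketch below shows the calculation for the two expectations mentioned above: 1/2 heterozygous from heterozygote by wild-type crosses, and 2/3 heterozygous among surviving offspring of an intercross when homozygotes die. The litter counts are hypothetical and chosen only to illustrate the arithmetic.

```python
from math import erfc, exp, sqrt


def chi_square_gof(observed, expected_fractions):
    """Pearson chi-square statistic for observed counts against expected fractions."""
    total = sum(observed)
    return sum((o - total * f) ** 2 / (total * f)
               for o, f in zip(observed, expected_fractions))


def chi_square_p_value(stat, df):
    """Upper-tail P value; this sketch covers only the closed forms for df = 1 and 2."""
    if df == 1:
        return erfc(sqrt(stat / 2))
    if df == 2:
        return exp(-stat / 2)
    raise ValueError("only df = 1 or df = 2 is supported in this sketch")


# Hypothetical counts (heterozygous, wild type) from heterozygote x wild-type crosses.
backcross_counts = [34, 66]
backcross_stat = chi_square_gof(backcross_counts, [1 / 2, 1 / 2])
backcross_p = chi_square_p_value(backcross_stat, df=1)

# Hypothetical counts from intercrosses in which no homozygotes survive,
# so the expectation among survivors is 2/3 heterozygous and 1/3 wild type.
intercross_counts = [47, 53]
intercross_stat = chi_square_gof(intercross_counts, [2 / 3, 1 / 3])
intercross_p = chi_square_p_value(intercross_stat, df=1)

print(f"backcross:  chi2 = {backcross_stat:.2f}, P = {backcross_p:.4g}")
print(f"intercross: chi2 = {intercross_stat:.2f}, P = {intercross_p:.4g}")
```

With two genotype classes there is one degree of freedom; a three-class comparison (wild type, heterozygote, homozygote at 1:2:1) would use two.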
Maintenance of the Smarca4 MommeD39 colony (heterozygote to wild type crosses) also revealed non-Mendelian ratios; heterozygotes were present in less than expected numbers at weaning (18% as opposed to 50%) and were smaller than their wild-type littermates (Figure 5f). Timed matings showed that the ratios were close to Mendelian prior to birth (Figure 5f). Body weights of heterozygotes were lower than those of wild-type littermates at weaning (T-test, P < 0.0001) and, again, the variance was greater in the heterozygotes (F-test, P < 0.0001) (Figure 5f). Similar stochastic death of heterozygotes has been described by others for mice with knock-out alleles for Smarcc1 and Smarca4 [50,51]. In these cases, however, the genetic heterogeneity among the mice could have been an underlying factor.
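The weight comparisons rest on two steps described in the Figure 5 legend and the text: weights are normalized within each litter to the mean wild-type weight, and the spread of the normalized values is then compared between genotypes. A minimal sketch of both steps follows. The genotype labels, the data layout and the example weights are hypothetical, and only the F statistic (the ratio of sample variances) is computed; obtaining its P value requires the F distribution, for example from a statistics package.

```python
from statistics import mean, variance


def normalize_to_wild_type(litters):
    """Divide each weight by the mean wild-type weight of its own litter.

    `litters` is a list of litters; each litter is a list of
    (genotype, weight) tuples with genotype "+/+" or "mut/+".
    Every litter must contain at least one wild-type animal.
    """
    normalized = {"+/+": [], "mut/+": []}
    for litter in litters:
        wild_type_mean = mean(w for g, w in litter if g == "+/+")
        for genotype, weight in litter:
            normalized[genotype].append(weight / wild_type_mean)
    return normalized


def variance_ratio(sample_a, sample_b):
    """F statistic: sample variance of a divided by sample variance of b."""
    return variance(sample_a) / variance(sample_b)


# Hypothetical weights in grams, two litters, for illustration only.
example_litters = [
    [("+/+", 1.00), ("+/+", 1.10), ("mut/+", 0.70), ("mut/+", 1.00)],
    [("+/+", 0.90), ("+/+", 0.95), ("mut/+", 0.60), ("mut/+", 0.92)],
]
norm = normalize_to_wild_type(example_litters)
f_stat = variance_ratio(norm["mut/+"], norm["+/+"])
print(f"mean normalized heterozygote weight: {mean(norm['mut/+']):.2f}")
print(f"variance ratio (heterozygote / wild type): {f_stat:.1f}")
```

Normalizing within litters removes litter-to-litter differences in size and developmental stage, so that the remaining variance reflects differences among littermates.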
We have previously reported increased variance in body weight and behavioral responses in adult mice heterozygous for Trim28 [8] and others have made similar findings with respect to Trim28 during the oocyte to embryo transition [52]. We have also reported that reduced levels of Dnmt3a are associated with an increase in body weight variance in adults [8]. These observations, in the context of inbred strains reared in controlled environments, are consistent with a role for the genes identified in this screen in canalization, a term coined by Waddington to describe phenotypic robustness during development. Exactly how important probabilistic developmental events are in determining phenotype is not yet clear [53]. We have so far reported only three measures: the penetrance of homozygous or heterozygous death in the colony, the body weight of heterozygotes or homozygotes (at various ages) in the colony, and behavioral responses. More extensive phenotyping of these inbred MommeD lines should enable us to gain a better understanding of this phenomenon.
In the remaining MommeD lines examined, MommeD40 and Brd1 MommeD41 , we observed homozygous embryonic death that is similar to that reported for knock-out alleles of these genes (Additional file 6).

DNA methylation levels at the transgene are increased in Rlf mutants
Changes in transcription can correlate with changes in DNA methylation at transgenes and metastable epialleles [42,[54][55][56]. We have shown previously that changes in the percentage of red blood cells expressing GFP can be accompanied by changes in DNA methylation at the transgene [7,10]. Using bisulfite sequencing, we investigated DNA methylation at the HS-40 enhancer region of the transgene in MommeD lines. We used adult spleens from wild-type mice and mice heterozygous for a MommeD mutation. Consistent with our previous reports, in Line3 wild-type mice around 60% of the CpGs in the HS-40 element were methylated (Figure 6a). Mice heterozygous for Dnmt1 MommeD32 showed an apparent decrease in CpG methylation (around 50%; Figure 6a). This is consistent with an increase in expression of the GFP transgene in these mutants and with the role of Dnmt1 as the maintenance DNA methyltransferase. In mice heterozygous for the Wiz and Rif1 mutations DNA methylation patterns at the HS-40 region were unaffected (Figure 6b).
In the case of the Rlf alleles, we detected higher levels of DNA methylation at the HS-40 element in adult spleens in Rlf MommeD8 (82%; T-test, P < 0.05) and Rlf MommeD34 (81%; T-test, P < 0.05) homozygotes compared to wild types (67%) (Figure 6d). A similar trend (not statistically significant) was observed in the Rlf MommeD8 and the Rlf MommeD34 heterozygotes compared with that seen in Line3 wild-type (Figure 6a,c). In the case of Rlf MommeD28, where adult homozygotes are extremely rare, we analyzed DNA methylation patterns in 16.5 dpc embryos and made a similar finding. The HS-40 element was more methylated in Rlf MommeD28 homozygotes (79%; T-test, P < 0.05) than it was in wild-type controls (51%) (Figure 6d). This is the first time that we have observed hypermethylation of the transgene in a MommeD line. An increase in DNA methylation at the transgene locus in the Rlf mutants is consistent with the decreased expression of the transgene in these mice.

Conclusions
Our results suggest that transgene silencing in the mouse acts through a mechanism common to transposon silencing, X-inactivation and imprinting. We provide evidence that a novel gene, Rlf, is involved in this process. Depletion of Rlf leads to DNA hypermethylation at the transgene. How this is achieved is currently unknown and requires further investigation. We have isolated the first mouse mutants for Rlf and Wiz and shown that the genes are required for normal embryonic development. Notably, the human homologs of many of the genes recovered from our screen have been found to be associated with human diseases, in some cases identified in family studies and in other cases suggested by genome-wide association studies (Table 1).
Since the mouse is often used as a disease model, we anticipate that our collection of Momme mutants will provide a valuable resource for researchers across many disciplines.
Figure 5 (partial legend). Embryonic weights were measured from intercrosses of Rlf MommeD34 and Rlf MommeD28 mice. Homozygous embryos from both had a significant increase in weight variation at 17.5 dpc or 16.5 dpc, respectively. Weights for each litter were normalized to the average weight of wild-type embryos in that litter. Each data point represents an individual. Homozygotes were smaller than wild-type littermates at one week after birth. ns, not significant. (d) Setdb1 MommeD13 and Setdb1 MommeD17 mice; data show the number of mice observed (and in brackets the percentage). (e) Smarcc1 MommeD19 and Smarcc1 MommeD19 mated to wild-type mice revealed incomplete penetrance of heterozygous lethality after birth. Data show the number of mice (and in brackets the percentage). Weights of 17.5 dpc Smarcc1 MommeD19 embryos and pups were, on average, less than that of wild-type littermates (T-test, P < 0.0001) and showed greater variation (F-test, P < 0.05). Weights in each litter were normalized to the average weight of wild-type embryos in that litter. Each data point represents an individual. (f) Smarca4 MommeD39 heterozygotes showed reduced viability. Data show the number of mice and in brackets the percentage. Heterozygotes were smaller than their wild-type littermates. Smarca4 MommeD39/+ weighed less (T-test, P < 0.0001) and weights were more variable (F-test, P < 0.0001) than wild types. Weights in each litter were normalized to the average weight of wild-type embryos in that litter. Each data point represents an individual.

Generation and screening of mutant mice
Procedures were approved by the Animal Ethics Committee of the QIMR Berghofer Medical Research Institute. The ENU screen was carried out in the FVB/NJ inbred transgenic line, Line3, which is homozygous for a multicopy GFP transgene, as described previously [5]. All MommeD lines were maintained in this background and are homozygous for the GFP transgene. All experiments, including crosses to other lines and intercrosses, were carried out using heterozygous MommeD mice five generations or more removed from the MommeD founder.

Other mouse strains
The congenic strain, Line3C, used for linkage studies, was produced by crossing Line3 to C57BL/6J for 10 generations, selecting for mice carrying the transgene by flow cytometry. Inbred C57BL/6J mice were purchased from ARC Perth (Perth, WA, Australia). Rif1 GT mice were generated by the Australian Phenomics Network from an embryonic stem cell clone carrying a trapped Rif1 allele (A045A01; German Gene Trap Consortium), crossed to Line3C (C57BL/6J) and maintained on this background.
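Repeated backcrossing halves the expected contribution of the donor genome at unlinked loci in each generation, which is why ten generations are sufficient to produce a congenic strain. The sketch below works through that expectation. It assumes that generation 1 is the F1 and that every later generation is a backcross to C57BL/6J; the text does not specify how generations were counted. Regions linked to the selected transgene remain donor-derived and are not covered by this simple expectation.

```python
def expected_donor_fraction(generations):
    """Expected fraction of unlinked donor genome after repeated backcrossing.

    Assumes generation 1 is the F1 (50% donor) and that every later
    generation is a backcross to the recipient strain, halving the
    unlinked donor contribution each time.
    """
    if generations < 1:
        raise ValueError("generations must be >= 1")
    return 0.5 ** generations


for n in (1, 2, 5, 10):
    print(f"generation {n:2d}: {expected_donor_fraction(n):.4%} donor genome expected")
```

The same halving applies to unlinked, non-causative ENU mutations when a MommeD line is bred to unmutagenized Line3 mice over successive generations.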

Flow cytometry
Mice were analyzed by flow cytometry at 3 weeks of age. A drop of blood was collected in Osmosol buffer and analyzed on a Guava easyCyte HT (Merck/Millipore, Darmstadt, Germany). The data were analyzed by using Guava InCyte software with a GFP-positive gate set to exclude 99.9% of wild-type erythrocytes. Histograms shown depict only the GFP fluorescence channel.
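The gating rule can be expressed as a percentile threshold on the reference sample. The sketch below is an illustration only and does not reproduce the Guava InCyte software: it takes per-cell GFP fluorescence values as plain lists, sets the gate so that 99.9% of reference events fall at or below it, and reports the percentage of events above the gate in a test sample. The function names and the example values are hypothetical, and the direction-only classification ignores the significance criteria that a real screen would apply.

```python
import math


def gate_threshold(reference_fluorescence, exclude_fraction=0.999):
    """Smallest observed value such that at least `exclude_fraction`
    of the reference events are at or below it."""
    ordered = sorted(reference_fluorescence)
    index = math.ceil(exclude_fraction * len(ordered)) - 1
    return ordered[max(index, 0)]


def percent_gfp_positive(sample_fluorescence, threshold):
    """Percentage of events with fluorescence strictly above the gate."""
    positive = sum(1 for value in sample_fluorescence if value > threshold)
    return 100.0 * positive / len(sample_fluorescence)


def classify_modifier(mutant_percent, wild_type_percent):
    """Direction of the shift, following the definitions in the text."""
    if mutant_percent > wild_type_percent:
        return "suppressor of variegation"
    if mutant_percent < wild_type_percent:
        return "enhancer of variegation"
    return "no shift"


# Hypothetical fluorescence values in arbitrary units, for illustration only.
reference = [float(v) for v in range(1000)]
sample = [float(v) for v in range(0, 2000, 2)]
gate = gate_threshold(reference)
print(f"gate = {gate}, GFP-positive = {percent_gfp_positive(sample, gate):.1f}%")
```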

Linkage analysis
Heterozygous mutant MommeD mice, at least four generations down from the founder, were backcrossed twice to Line3C (see above) and phenotyped for GFP expression by flow cytometry. DNA from tail tips was used to perform linkage analysis.
Three mice were sequenced: a wild type, a heterozygote and a homozygote. For MommeD34, a heterozygous mutant was captured and sequenced in-house using the same array capture design on 2.1M arrays with HX3 mixers, according to the Roche NimbleGen Arrays User's Guide, Illumina Optimized protocol (version 1.0).

SNP calling
The sequencing reads from the targeted sequence capture experiments were aligned to the mouse genome (build 37, mm9) using the program bwa version 0.6.1 [57]. Datasets generated by Roche on the 454 platform were mapped using bwa bwasw [58] while the remaining datasets, generated on the Illumina HiSeq or GAIIx platforms, were mapped using bwa aln, with default settings, followed by bwa sampe with the default settings. The resulting sam files were converted to bam files and coordinate-sorted using SAMtools version 0.1.17 [59] and PCR duplicates were subsequently eliminated using the program Picard MarkDuplicates, version 1.48 [60].
For each sample, nucleotide variants were identified within the intervals for which linkage had previously been determined. This was achieved by creating a pileup file of the linked region using SAMtools mpileup, using the option -q 20, followed by variant calling using the program Varscan version v2.2.8 [61] using the 'somatic' feature and the settings --min-coverage 15 and --min-var-freq 0.3. Varscan somatic calls sequence differences between a case and a control sample; as control the sequenced exome from a different ENU mutant was used. The case and control exomes were constructed using the same library preparation methods and sequenced in the same deep sequencing run, but had different linked intervals. All exomes were mapped and processed in parallel, using identical settings, to minimize post-sequencing artifacts. The output from Varscan was manually screened for likely ENU mutations, appearing as heterozygous SNPs in the mutant and wild type in the control. These SNPs were in turn validated using the Sanger method.
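The screening logic described above can be summarized as a filter on per-site allele counts. The sketch below is not Varscan and does not parse its output; the `Site` record, the interval coordinates and the two tolerance constants marked as assumptions are hypothetical and serve only to make the criteria concrete: sufficient coverage, a variant allele frequency compatible with a heterozygous call in the mutant, and a wild-type genotype in the control.

```python
from dataclasses import dataclass


@dataclass
class Site:
    """Allele counts at one position (hypothetical structure, not Varscan's format)."""
    chrom: str
    pos: int
    case_ref: int       # reads supporting the reference base in the mutant
    case_alt: int       # reads supporting the variant base in the mutant
    control_ref: int    # reads supporting the reference base in the control
    control_alt: int    # reads supporting the variant base in the control


MIN_COVERAGE = 15        # from the text
MIN_VAR_FREQ = 0.3       # from the text
MAX_HET_FREQ = 0.7       # assumption: upper bound for a heterozygous call
MAX_CONTROL_FREQ = 0.05  # assumption: tolerance for sequencing error in the control


def is_candidate_enu_mutation(site, interval):
    """True if the site lies in the linked interval, is adequately covered,
    looks heterozygous in the mutant and wild type in the control."""
    chrom, start, end = interval
    if site.chrom != chrom or not (start <= site.pos <= end):
        return False
    case_depth = site.case_ref + site.case_alt
    control_depth = site.control_ref + site.control_alt
    if case_depth < MIN_COVERAGE or control_depth < MIN_COVERAGE:
        return False
    case_freq = site.case_alt / case_depth
    control_freq = site.control_alt / control_depth
    return (MIN_VAR_FREQ <= case_freq <= MAX_HET_FREQ
            and control_freq <= MAX_CONTROL_FREQ)


# Hypothetical linked interval and sites, for illustration only.
linked_interval = ("chr17", 1_000_000, 2_900_000)
sites = [
    Site("chr17", 1_500_000, 12, 11, 30, 0),   # heterozygous in case, absent in control
    Site("chr17", 1_600_000, 5, 4, 30, 0),     # below the coverage threshold
    Site("chr17", 1_700_000, 10, 10, 15, 14),  # also present in the control
]
candidates = [s for s in sites if is_candidate_enu_mutation(s, linked_interval)]
print([s.pos for s in candidates])
```

Using another ENU mutant with a different linked interval as the control, as described above, removes strain-specific and systematic variants while leaving line-specific ENU mutations.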
The custom capture and sequencing of the MommeD34 linked region was carried out in-house without a matched wild-type control; instead, a merge of the three MommeD8 deep sequencing samples previously sequenced by Roche was used as control. These 454 datasets were merged in order to achieve greater read depth across the region. SNP calling was subsequently carried out as described above. Exome sequencing datasets generated in this study are accessible via the European Nucleotide Archive (ENA) under accession ERP003831.

Genotyping
Once the mutations had been identified, genotyping was carried out by either sequencing (MommeD13

Embryo dissections
All embryos were produced by natural matings and detection of a vaginal plug was counted as 0.5 dpc. Except where otherwise stated, embryos were produced by intercrosses.

Introduction of the mutant lines into mice carrying the Avy allele
FVB/NJ mice heterozygous for the MommeD8 mutation and homozygous for the GFP transgene were mated with C57BL/6J mice heterozygous for the Avy allele. The coat color phenotype was classified at weaning by a trained observer as either yellow, mottled, or agouti. GFP expression was determined by flow cytometry and used to classify the mice as mutant or wild type for the MommeD8 mutation. FVB/NJ mice carry the A allele and C57BL/6J mice carry the a allele. Yellow and mottled offspring carry the Avy allele. All agouti-colored offspring were genotyped by PCR to assess whether they were Avy/A and pseudoagouti, or A/a, as reported [56].

Bisulfite sequencing of the transgene HS-40 enhancer
Bisulfite conversion of DNA was carried out using the EpiTect Bisulfite Kit (Qiagen, Doncaster, VIC, Australia) according to the manufacturer's instructions. At least two male adult spleens were used for each MommeD line. The bisulfite conversion rate was at least 97% and sequences were analyzed using the BiQ Analyser software [62]. Oligonucleotides to the bisulfite converted HS-40 enhancer region were as follows (5'-3'): GFPbisF1: AAAATAAAATTTTTGGATTGTTATTATTATAA; GFPbisF2: ATATTTGTAATTTTAGTATTTTGGGAGGTT and GFPbisR: AATCTCTACTCACTACAAACTCCATCTC. Cycling conditions were as follows: 94°C for 2 minutes for 1 cycle; 94°C for 30 seconds, 60°C for 30 seconds, 72°C for 45 seconds for 35 cycles and 72°C for 6 minutes for 1 cycle.
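As an illustration of what a bisulfite conversion rate measures, the following Python sketch computes it for a single read. It assumes the common definition: the fraction of cytosines outside CpG context in the unconverted reference that read as thymine after conversion. This is a simplified stand-in, not the BiQ Analyser algorithm; the function name and the example sequences are hypothetical, and the read is assumed to be already aligned to the reference without gaps.

```python
def conversion_rate(reference, read):
    """Fraction of non-CpG cytosines in `reference` that appear as T in `read`.

    Both sequences must be the same length and on the same strand.
    Returns NaN if the reference contains no informative non-CpG cytosine.
    """
    if len(reference) != len(read):
        raise ValueError("reference and read must be aligned to equal length")
    reference = reference.upper()
    read = read.upper()
    converted = 0
    total = 0
    for i, base in enumerate(reference):
        if base != "C":
            continue
        if i + 1 < len(reference) and reference[i + 1] == "G":
            continue  # CpG site: informative for methylation, not for conversion
        if read[i] == "T":
            converted += 1
            total += 1
        elif read[i] == "C":
            total += 1
        # any other base (sequencing error, N) is ignored
    return converted / total if total else float("nan")


# Made-up sequences: four non-CpG cytosines, three of them converted.
example_reference = "ACGTCCATCGACTC"
example_read = "ACGTTTATTGATTC"
example_rate = conversion_rate(example_reference, example_read)
```

Under this definition, a clone would pass the 97% threshold quoted in the text when `conversion_rate` returns at least 0.97.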

RNA isolation, cDNA analysis and quantitative real-time RT-PCR
Total RNA was extracted from various tissues using TRI reagent (Invitrogen). cDNA was synthesized from total RNA using SuperScript III reverse transcriptase (Invitrogen) or AMV reverse transcriptase (Roche) and random hexamer primers. Quantitative real-time PCR was performed with the Platinum SYBR Green qPCR SuperMix-UDG (Invitrogen) with primers designed to span exon/intron boundaries. All reactions were performed in triplicate and normalized to Hprt or Gapdh. PCRs were run on a ViiA 7 (Applied Biosystems, Mulgrave, VIC, Australia) or on a Corbett Research Rotor-Gene (Qiagen). Cryp-Skip [63] was used for splice site prediction. cDNA from mutant alleles was sequenced using Sanger sequencing. Primer sequences are provided in Additional file 8.
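The text states that reactions were run in triplicate and normalized to a reference gene, but does not specify the quantification model. One common approach, shown here purely as an assumption, is the comparative Ct calculation: average the technical triplicates, subtract the reference-gene Ct from the target-gene Ct, and convert the difference to a relative quantity assuming perfect doubling per cycle. The function name, the `efficiency` parameter and all Ct values below are hypothetical.

```python
from statistics import mean


def relative_expression(target_ct, reference_ct, efficiency=2.0):
    """Target expression relative to a reference gene (e.g. Hprt or Gapdh).

    target_ct, reference_ct: Ct values from technical replicates.
    efficiency: assumed amplification factor per cycle (2.0 = perfect doubling).
    """
    delta_ct = mean(target_ct) - mean(reference_ct)
    return efficiency ** (-delta_ct)


# Made-up Ct values, not data from the study.
wild_type = relative_expression([24.0, 24.1, 23.9], [20.0, 20.1, 19.9])
mutant = relative_expression([25.0, 25.1, 24.9], [20.0, 20.1, 19.9])
fold_change = mutant / wild_type  # mutant expression relative to wild type
```

In this made-up case the mutant target crosses threshold one cycle later than wild type at equal reference-gene Ct, which corresponds to roughly half the wild-type expression.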

Statistical analysis
Statistical significance of quantitative data was determined by two-tailed Student's t-test. An F-test was used to test whether variance was significantly different between wild-type and mutant groups. The proportions of genotypes were compared to expected Mendelian ratios using a χ² test. For all datasets a minimum of three biological replicates were analyzed.
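The comparison of genotype proportions with Mendelian expectations can be sketched as a χ² goodness-of-fit test. The example below assumes a heterozygous intercross with an expected 1:2:1 ratio of wild-type, heterozygous and homozygous offspring, which gives two degrees of freedom; for that case the upper-tail p-value has the closed form exp(-χ²/2), so no statistics library is needed. The function names and the offspring counts are hypothetical.

```python
import math


def mendelian_chi_square(observed, ratios=(1, 2, 1)):
    """Chi-square statistic for observed genotype counts against expected ratios.

    Returns the statistic and the list of expected counts.
    """
    total = sum(observed)
    ratio_sum = sum(ratios)
    expected = [total * r / ratio_sum for r in ratios]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return chi2, expected


def p_value_df2(chi2):
    """Upper-tail p-value of a chi-square distribution with 2 degrees of freedom."""
    return math.exp(-chi2 / 2.0)


# Made-up counts (wild type, heterozygous, homozygous), not data from the study.
counts = (30, 60, 2)
chi2, expected = mendelian_chi_square(counts)
p = p_value_df2(chi2)
```

With these made-up counts the homozygous class is strongly under-represented relative to the expected 23 animals, the kind of deviation that would indicate homozygous lethality. The closed-form p-value applies only to three genotype classes; other designs need a general χ² survival function.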

Additional files
Additional file 1: Linked intervals. Manhattan plots showing linked intervals identified by Illumina GoldenGate SNP genotyping analysis. The x-axis represents the chromosomes and the y-axis is the LOD score. Peaks with a LOD score of 3 or higher are considered significant.
Additional file 2: Flow cytometric profiles of the MommeDs. (a) Representative flow cytometry profiles show the percentage of GFP-expressing erythrocytes from wild-type and heterozygous mutant littermates (n = 3). The x-axis represents the erythrocyte fluorescence on a logarithmic scale and the y-axis is the number of cells detected at each fluorescence level. (b) Percentage of GFP-expressing cells in wild-type, heterozygous mutant and homozygous mutant (where viable) mice at three weeks of age (mean ± standard error of the mean).