Assembly of a young vertebrate Y chromosome reveals convergent signatures of sex chromosome evolution

Heteromorphic sex chromosomes have evolved repeatedly across diverse species. Suppression of recombination between X and Y chromosomes leads to rapid degeneration of the Y chromosome. However, these early stages of degeneration are not well understood, as complete Y chromosome sequence assemblies have only been generated across a handful of taxa with ancient sex chromosomes. Here we describe the assembly of the threespine stickleback (Gasterosteus aculeatus) Y chromosome, which is less than 26 million years old. Our previous work identified that the non-recombining region between the X and the Y spans ∼17.5 Mb on the X chromosome. Here, we combined long-read PacBio sequencing with a Hi-C-based proximity guided assembly to generate a 15.87 Mb assembly of the Y chromosome. Our assembly is concordant with cytogenetic maps and Sanger sequences of over 90 Y chromosome clones from a bacterial artificial chromosome (BAC) library. We found three evolutionary strata on the Y chromosome, consistent with the three inversions identified by our previous cytogenetic analyses. The young threespine stickleback Y shows convergence with older sex chromosomes in the retention of haploinsufficient genes and the accumulation of genes with testis-biased expression, many of which are recent duplicates. However, we found no evidence for large amplicons found in other sex chromosome systems. We also report an excellent candidate for the master sex-determination gene: a translocated copy of Amh (Amhy). Together, our work shows that the same evolutionary forces shaping older sex chromosomes can cause remarkably rapid changes in the overall genetic architecture on young Y chromosomes.


Sex chromosomes evolve from autosomal ancestors when recombination is suppressed between the homologous pairs (reviewed in Bachtrog 2013). Thus, sex chromosomes are an intriguing region of the genome to understand how mutations and repetitive DNA accumulate in the absence of recombination and how gene content evolves once a chromosome becomes sex-[...] crossover region on the Y chromosome was composed of two differently aged evolutionary strata, similar to mammals. However, all studies in threespine stickleback have relied on mapping short reads to the reference X chromosome, limiting our understanding to regions conserved between [...] reference assembly. Only 129 contigs partially aligned to the genome (less than 25% of the contig length aligned; 10.15 Mb) and 148 contigs did not align at all to the genome (3.58 Mb). We

We targeted Y-linked contigs in the Canu assembly by identifying contigs that shared sequence homology with the reference X chromosome or did not align to the autosomes. In the youngest region of the threespine stickleback sex chromosomes (the previously identified stratum two), the X and Y chromosomes still share considerable sequence homology. However, within this stratum, heterozygosity is even higher than what is observed across the autosomes (White et al. 2015). Based on this divergence, Canu should separate X- and Y-linked contigs during the initial assembly process. Contigs aligned to the X chromosome formed a distribution of sequence identity that was not unimodal, reflecting the presence of both X- and Y-linked contigs (Supplemental Figure 1). Setting a sequence identity threshold of 96% resulted in a set of 114 X-linked contigs that totaled 21.26 Mb, compared to the previous 20.62 Mb X chromosome reference assembly. There were 68 putative Y-linked contigs that had a sequence identity less than or equal to 96%, totaling 12.64 Mb. The oldest region of the Y chromosome (stratum one) contains many regions that have either been deleted or have diverged to such an extent that sequencing reads cannot be mapped to this region (White et al. 2015). Consequently, there may be contigs unique to the Y chromosome that cannot be captured through alignments to the reference X chromosome. To account for these loci, we also included the contigs that only partially aligned to the genome (less than 25% of the contig length aligned; 129 contigs; 10.15 Mb) or did not align at all to the genome (148 contigs; 3.58 Mb) in the set of putative Y-linked contigs (345 total contigs).
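The sorting rule described in this paragraph can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the pipeline used in the study: the `ContigAlignment` record, the `classify` function, and the group labels are hypothetical names, and the input is assumed to hold only contigs that either aligned to the reference X chromosome or failed to align well to the genome (contigs assigned to autosomes are assumed to have been set aside already).

```python
from dataclasses import dataclass

IDENTITY_THRESHOLD = 96.0    # percent identity cutoff stated in the text
MIN_ALIGNED_FRACTION = 0.25  # below this, a contig counts as only partially aligned


@dataclass
class ContigAlignment:
    """Hypothetical summary of one contig's best alignment."""
    name: str
    aligned_fraction: float  # fraction of contig length aligned (0.0 = no alignment)
    identity: float          # percent identity of the aligned portion


def classify(contig: ContigAlignment) -> str:
    """Assign a contig to one of the four groups described in the text."""
    if contig.aligned_fraction == 0.0:
        return "Y-unaligned"
    if contig.aligned_fraction < MIN_ALIGNED_FRACTION:
        return "Y-partial"
    if contig.identity > IDENTITY_THRESHOLD:
        return "X-linked"
    return "Y-linked"  # identity less than or equal to the threshold


# Contig counts reported in the text for the three putative Y-linked groups.
reported = {"Y-linked": 68, "Y-partial": 129, "Y-unaligned": 148}
total_putative_y = sum(reported.values())
```

Summing the three reported groups reproduces the 345 putative Y-linked contigs carried forward to scaffolding.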
We used chromosome conformation capture (Hi-C) sequencing and a proximity-guided method to assemble the set of putative X- and Y-linked contigs into scaffolds. Using the 3D-DNA assembler (Dudchenko et al. 2017), 105 of the 114 X-linked contigs were combined into three main scaffolds that totaled 20.78 Mb. The scaffolds were largely colinear with the reference X chromosome, with scaffolds one and two aligning to the pseudoautosomal region and scaffold three mostly aligning to the remainder of the X chromosome that does not recombine with the Y (Supplemental Figure 2).

We assembled the putative Y-linked contigs using the same process. Of the 345 total contigs, 115 were initially combined into a single primary scaffold that totaled 17.15 Mb. We

Using the BAC sequences, we were able to identify whether any of the contigs within the scaffold contain collapsed haplotigs between the X and Y chromosome (mosaic contigs should contain [...] contig ordering across the scaffold was verified by BAC contig sequences that spanned gaps in the assembly. We aligned all 101 sequenced BAC contigs to the Y chromosome scaffold and found 92 of the BAC contigs aligned concordantly with the assembly (Figure 1B). These BACs aligned to 40 of the 70 contigs in the assembly with a high sequence identity (7.72 Mb of non-overlapping sequence in the 15.28 Mb assembly aligned concordantly to the BAC contigs). The remaining nine BAC contigs that did not align concordantly indicate there are small-scale structural differences between the Canu Y chromosome assembly and the BAC clones derived from a separate Paxton Lake male threespine stickleback, reflecting errors in the Y chromosome assembly, rearrangements in the BAC clone sequences, or true polymorphisms segregating in the Paxton Lake benthic population. Four of the discordant BACs aligned to regions of the reference Y that were greater than the Sanger sequenced length of the BAC insert, suggesting possible indels. The remaining five discordant BACs contained sub-alignments with mixed orientations, suggesting possible small-scale inversions not present in our assembly.

Figure 1. Hi-C chromosome conformation capture sequencing generated a single Y chromosome scaffold. (A) The contact matrix shows an enrichment of interactions between contigs in close proximity along the diagonal. Contig boundaries in the assembly are denoted by the black triangles along the diagonal. (B) Sanger sequenced BAC inserts that align concordantly throughout the scaffold are shown, with BACs that spanned gaps between contigs in orange, BACs that extended into, but did not span gaps in purple, and BACs that were contained completely within an individual contig in green.

Among the aligned BAC contigs, many provided additional sequence information, either spanning gaps between contigs in the Y chromosome assembly or extending from contigs into gaps in the assembly. Of the 92 BAC contigs that aligned concordantly, seven BAC contigs extended into five different gaps in the assembly and 35 BAC contigs spanned 18 different gaps in the assembly (26% of the total gaps in the assembly) (Figure 1B). The remainder of the aligned BAC contigs aligned completely within an individual contig in the Y assembly. We merged this additional sequence into the initial Y chromosome assembly, resulting in a merged Y chromosome scaffold that contained 52 contigs, totaling 15.78 Mb.
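The gap and contig counts in this paragraph follow from simple bookkeeping, which the short Python check below makes explicit. It is only a sketch: it assumes the 70 contigs sit in one linear scaffold (so neighbouring contigs are separated by exactly one gap) and that each BAC-spanned gap fuses two adjacent contigs into one. The variable names are illustrative.

```python
# Values reported in the text.
contigs_in_scaffold = 70   # contigs in the Y scaffold before merging
gaps_spanned_by_bacs = 18  # gaps closed by BAC contigs

# Assumption: one linear scaffold, so there is one gap between each adjacent pair.
total_gaps = contigs_in_scaffold - 1
percent_spanned = 100 * gaps_spanned_by_bacs / total_gaps

# Assumption: each spanned gap fuses two neighbouring contigs into one.
contigs_after_merge = contigs_in_scaffold - gaps_spanned_by_bacs
```

Under these assumptions there are 69 gaps, 18 of which is about 26%, and closing them leaves the 52 contigs reported for the merged scaffold.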

The Y chromosome assembly is concordant with known cytogenetic maps

The threespine stickleback Y chromosome has undergone at least three pericentric inversions relative to the X chromosome, forming a non-crossover region that spans a majority of the chromosome in males (Ross and Peichel 2008). These inversions were mapped by ordering a series of cytogenetic markers along both the X and Y chromosomes (Figure 2A). To determine whether our Y chromosome assembly was consistent with the known cytogenetic marker ordering, we used BLAST to locate the position of each marker within the assembly. We were able to locate four of the five markers used from the male-specific region in our assembly. The position of these cytogenetic markers was concordant with our assembly (Figure 2B). The missing [...] fully assembled or it is a true deletion within the Paxton Lake benthic population, relative to the Pacific Ocean marine population used for the cytogenetic map (Ross and Peichel 2008).

The location of the oldest region within the Y chromosome (the previously identified stratum one) had been ambiguous. Cytogenetic markers from this region could not be hybridized to the Y chromosome (Ross and Peichel 2008), suggesting this region may be largely deleted or highly degenerated. Subsequent work using Illumina short-read sequencing revealed that some [...] but the location of these genes within the Y could not be determined by mapping reads to the X chromosome (White et al. 2015). The cytogenetic marker, Idh, is located at the distal end of our Y chromosome assembly, remarkably consistent with the placement of Idh in the cytogenetic map (Ross and Peichel 2008), indicating stratum one is no longer located at the distal end of the Y chromosome as it is on the X chromosome. Instead, we found a high density of stratum one alignments near the boundary of the pseudoautosomal region at the opposite end of the chromosome (Figure 2B). Within this stratum, there was an overall lower density of alignments between the X and Y chromosome, consistent with previous patterns mapping Illumina short reads to the reference X chromosome (White et al. 2015). The placement of stratum one in the

Figure 2. The threespine stickleback Y chromosome assembly is concordant with cytogenetic maps. (A) The Y chromosome has diverged from the X chromosome through a series of inversions determined through ordering of cytogenetic markers (dashed lines indicate rearrangements of the linear order of markers (Ross and Peichel 2008)). (B) Alignments of the assembled Y chromosome (left) to the X chromosome (right) reveal the same inversions in the de novo assembly. Stratum one is indicated by orange, stratum two is indicated by dark purple, stratum three is indicated by light purple, and the pseudoautosomal region is indicated by yellow. A majority of the pseudoautosomal region is not included in the reference Y chromosome assembly because this region was not targeted (see methods). The location of the candidate sex determination gene (Amhy) is indicated by the black arrow. Centromeres are shown by black circles. Positions are shown in megabases.

Because we were primarily focused on sequences that were highly divergent from the X chromosome or absent from the female reference genome entirely, our strategy did not target the pseudoautosomal region for assembly into the Y chromosome. Nevertheless, our assembly did place a small fraction of the ~2.5 Mb pseudoautosomal region on the distal end of the male-specific Y chromosome, adjacent to stratum one. The cytogenetic marker STN303 was included in this region, which is located on the opposite end of the pseudoautosomal region on the X chromosome (Figure 2). This discordance in marker placement within the pseudoautosomal region likely indicates a mis-assembly of the region. The pseudoautosomal region contains repetitive sequence, complicating overall assembly of the region (see transposable elements section). Indeed, the contigs spanning this region and STN303 have a smaller size (five contigs; median: 88,098 bp) than the remaining contigs within the Y chromosome or X chromosome, consistent with highly heterozygous, repetitive sequence.

The location of centromeric repeats is concordant with a metacentric chromosome

A 186 bp centromeric AT-rich alpha satellite repeat was previously identified in female fish by chromatin immunoprecipitation followed by sequencing (ChIP-seq) (Cech and Peichel 2015).

Although this repeat hybridized strongly to autosomes and the X chromosome, there was only weak hybridization of the probe to the Y chromosome, suggesting the Y chromosome might have a divergent centromeric repeat and/or contain substantially less satellite DNA than the autosomes (Cech and Peichel 2015). We used ChIP-seq with the same antibody against centromere protein

Underlying the CENP-A peak, we found a core centromere AT-rich repeat. We identified 14 copies of the repeat in our Y chromosome assembly, which shared an average pairwise sequence identity of 84.6% with the core repeat that hybridized to the remainder of the genome (Cech and Peichel 2015) (Supplemental Figure 6). The repeats fell at the edges of a gap, indicating that a majority of the repeats were not assembled into our primary scaffold. Uneven coverage signal in Hi-C libraries from repetitive DNA can trigger the 3D-DNA assembler to remove these regions from contigs during the editing step (Durand et al. 2016; Dudchenko et al. 2017).

Consistent with this, both contigs that flanked the centromere gap in the Y chromosome assembly had additional sequence that was removed by the 3D-DNA pipeline as "debris." The first contig [...] additional 57,692 bp that was removed as "debris." The second contig on the opposite side of the gap (contig 11839) had eight copies of the repeat and an additional 29,308 bp of sequence that was removed as "debris." We used BLAST to search for additional repeats in the debris using the majority consensus sequence of the 14 previously identified centromere repeats in the Y assembly. There were an additional 304 repeats in the debris sequence from contig 11894, and 163 repeats in the debris sequence from contig 11839. We added the debris sequence back into the total Y chromosome assembly, increasing the assembled centromere size by 87 kb (total Y chromosome length: 15.87 Mb) (Figure 3B). Average pairwise percent sequence identity among all monomeric repeats in the Y chromosome assembly was 89.5%. Compared to the core threespine stickleback centromere repeat previously identified, the Y chromosome centromere repeat was more divergent. Average pairwise percent sequence identity between all the motifs in the Y chromosome assembly and the centromere repeat identified from female fish was only 86.8%.

Figure 3. [...] with CENP-A were aligned to the reference Y chromosome assembly. There is a prominent peak between markers STN187 and WT1A where the centromere is located in cytogenetic maps (Ross and Peichel 2008). CENP-A enrichment from a second male fish is shown in Supplemental Figure 5. (B) Alpha satellite monomeric repeats are organized into higher order repeats (HORs). Sequence identity is shown in 100 bp windows across the centromere sequence of the Y chromosome. 87 kb of sequence containing the monomeric repeat was rejoined (crosshatched) to contigs that were previously fragmented in the scaffolding process (orange contig: 11894; yellow contig: 11839). The gap between the two contigs is shown in grey.
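The sizes quoted for the recovered centromere sequence can be cross-checked with a few lines of Python. This is a sketch for the reader rather than part of the analysis: the dictionary keys are illustrative labels, the 15.78 Mb starting length is the BAC-merged scaffold size reported earlier, and the repeat total is derived here rather than quoted in the text.

```python
# Debris lengths reported in the text (base pairs).
debris_bp = {"first flanking contig": 57_692, "contig 11839": 29_308}
added_bp = sum(debris_bp.values())

# Y scaffold length after merging BAC sequence, as reported earlier (Mb).
merged_scaffold_mb = 15.78
final_mb = merged_scaffold_mb + added_bp / 1_000_000

# Repeat copies: 14 in the scaffold plus those found in the two debris fragments.
# This total is derived for illustration; it is not stated in the text.
total_repeats = 14 + 304 + 163
```

The two debris fragments sum to 87,000 bp (87 kb), and adding them to the 15.78 Mb scaffold gives roughly 15.87 Mb, matching the reported total length.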

Centromeres are often composed of highly similar blocks of monomeric repeats, [...], despite the presence of at least three major inversion events in the cytogenetic map of the [...] read Illumina sequences to the reference X chromosomes, overall divergence could have been biased by mapping artifacts, especially in the oldest region of the Y chromosome. We investigated whether our Y chromosome assembly supported the earlier model of two evolutionary strata or whether there could be additional strata uncovered in the current de novo assembly. We aligned all ENSEMBL predicted X chromosome coding regions outside of the pseudoautosomal region to the Y chromosome reference assembly to estimate divergence. Of the 1187 annotated coding sequences, we were able to align 504 (42.5%) to the male-specific region of the Y chromosome.

We found a clear signature of three evolutionary strata, consistent with inversion breakpoints within the cytogenetic map as well as within our de novo reference assembly. The oldest stratum (stratum one) encompassed the same region of the X chromosome as previously described in the Illumina-based study and had highly elevated dS (stratum one median dS: 0.155). In contrast to the Illumina-based estimates, our new assembly revealed that the remainder of coding regions [...] Table 1). We also investigated whether the older strata had increased non-synonymous divergence (dN) consistent with inefficient selection from the lack of crossing over between the chromosomes (Charlesworth 1978; Rice 1987). As predicted, stratum one had a significantly higher dN than strata two and three (Table 1). Stratum two had a significantly lower dN than the other strata. This was also reflected by a significantly lower dN/dS ratio (Table 1), suggesting genes in stratum two are under stronger purifying selection.

Figure 4. The sex chromosomes have three distinct evolutionary strata. Synonymous divergence (dS) between the X and Y chromosome was estimated for every annotated transcript on the X chromosome. Genes are ordered by position on the X chromosome (Mb). Median divergence across each region is shown by the red line; values are given in Table 2. Strata breakpoints are indicated by the vertical dashed lines. The centromere is indicated by a black circle.
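As an illustration of how per-stratum summaries such as median dS and dN/dS can be tabulated, the Python sketch below groups per-gene estimates by stratum. Everything in it is hypothetical: the `summarize` helper is an invented name, the gene records are made-up numbers chosen only to exercise the code, and stratum membership is assumed to be known in advance. It does not reproduce the values in Table 1.

```python
from collections import defaultdict
from statistics import median

# Hypothetical per-gene estimates: (stratum, dN, dS). Values are invented for illustration.
genes = [
    ("one", 0.030, 0.150),
    ("one", 0.045, 0.180),
    ("one", 0.020, 0.120),
    ("two", 0.002, 0.020),
    ("two", 0.001, 0.010),
    ("three", 0.008, 0.016),
    ("three", 0.006, 0.012),
]


def summarize(records):
    """Return {stratum: (median dS, median dN/dS)}, skipping genes with dS == 0."""
    by_stratum = defaultdict(list)
    for stratum, dn, ds in records:
        if ds > 0:  # dN/dS is undefined when dS is zero
            by_stratum[stratum].append((ds, dn / ds))
    return {
        s: (median(ds for ds, _ in vals), median(w for _, w in vals))
        for s, vals in by_stratum.items()
    }


summary = summarize(genes)
```

Genes with zero synonymous divergence are dropped here because the ratio is undefined for them; a real analysis may handle such genes differently.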

The Y chromosome is evolving a unique genetic architecture

Haploinsufficient genes have been repeatedly retained on degenerating sex chromosomes of mammals and birds (Bellott et al. 2014; 2017), and may be enriched in stratum one of the stickleback Y chromosome (White et al. 2015). We explored whether our expanded set of annotated genes exhibited signatures of haploinsufficiency by identifying orthologs between the X-annotated genes and human genes ranked for haploinsufficiency (Decipher Haploinsufficiency Predictions (DHP) v. 3) (Firth et al. 2009; Huang et al. 2010). Within strata one and two, we found orthologs with a retained Y-linked allele had lower DHP scores than genes without a Y ortholog, indicating that retained genes were more likely to exhibit haploinsufficiency [...].

Figure 5. Genes retained on the Y chromosome in strata one and two are more likely to exhibit haploinsufficiency. Human proteins with predicted haploinsufficiency indexes, in which a lower value indicates that a gene is more likely to be haploinsufficient, were matched to one-to-one human-threespine stickleback fish orthologs from the X chromosome. Haploinsufficiency indexes were significantly lower for genes retained on both the X and Y chromosomes than for genes present only on the X chromosome (i.e. lost from the Y chromosome) in both strata one and two. Asterisks indicate P < 0.05 (Mann-Whitney U test).
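For readers unfamiliar with the Mann-Whitney U test used for this comparison, the following Python sketch shows the core calculation on two small groups of scores. It is a simplified stand-in, not the analysis code: the function name and the score values are hypothetical, and the p-value uses the plain normal approximation without tie or continuity corrections, which a statistics package would normally apply.

```python
import math


def mann_whitney_u(x, y):
    """Return (U for x, two-sided p) using the uncorrected normal approximation."""
    n1, n2 = len(x), len(y)
    # U counts pairs where the x value exceeds the y value; ties count one half.
    u_x = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u_x - mean_u) / sd_u
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    return u_x, p_two_sided


# Hypothetical haploinsufficiency indexes (lower = more likely haploinsufficient).
retained_on_y = [0.10, 0.20, 0.30]
lost_from_y = [0.60, 0.70, 0.80, 0.90]
u, p = mann_whitney_u(retained_on_y, lost_from_y)
```

With samples this small an exact test would be preferable; the approximation is shown only because it keeps the arithmetic visible.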
used the MAKER gene annotation pipeline (Cantarel et al. 2008; Holt and Yandell 2011) to assemble a complete set of coding regions across the Y chromosome reference sequence. We identified a total of 626 genes across the male-specific region of the Y chromosome, 33 of which had paralogs on autosomes, but not on the X chromosome (5.3%) (

copy genes with a homolog on the X chromosome (Figure 6; Mann-Whitney U Test; P < 0.05).

Because DNA-based translocations of genes often contain their native regulatory elements, we

Figure 6. Genes present on the Y that have been translocated from the autosomes or genes that have been duplicated on the Y and are derived from ancestral X-linked homologs show testis-biased gene expression. Log2 fold change between testis tissue and three other tissues (brain, larvae, and liver) is shown. For each tissue comparison, asterisks denote groups with significantly different expression from single-copy genes present on the Y that are derived from X-linked homologs (Mann-Whitney U Test; P < 0.05).
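The log2 fold change plotted in Figure 6 can be illustrated with a short Python sketch. The expression values below are hypothetical, and the pseudocount of 1 added to both tissues is an assumption made here to avoid division by zero; the text does not state how the fold changes were computed.

```python
import math


def log2_fold_change(testis: float, other: float, pseudocount: float = 1.0) -> float:
    """log2 of (testis + pseudocount) / (other + pseudocount)."""
    return math.log2((testis + pseudocount) / (other + pseudocount))


# Hypothetical normalized expression values for a single gene.
expression = {"testis": 63.0, "brain": 7.0, "larvae": 15.0, "liver": 3.0}
lfc = {
    tissue: log2_fold_change(expression["testis"], value)
    for tissue, value in expression.items()
    if tissue != "testis"
}
```

A positive value means higher expression in testis than in the comparison tissue; a value of 3, for example, corresponds to an eight-fold difference.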

Duplicated genes on the Y chromosome can also be derived from ancestral genes shared between the X and Y. Of the 626 genes annotated across the male-specific region of the Y chromosome, 47 (7.5%) had greater than one copy on the Y chromosome and also had an X-linked allele (Table 2). None of these genes were structured within large amplicons, which are

Transposable elements also rapidly accumulate on sex chromosomes once [...] chromosome has a higher density of transposable elements throughout the male-specific region of the Y chromosome, compared to the X chromosome (Supplemental Figure 7). We found the highest densities within stratum one, consistent with recombination being suppressed in this region for the greatest amount of time. We also found a high density of transposable elements within the recombining pseudoautosomal region (Supplemental Figure 7).

Stratum one contains a candidate sex determination gene

The master sex determination gene has not been identified in the threespine stickleback.

Although master sex determination genes can be highly variable among species (Bachtrog et al. [...] Using the same calibration, stratum two formed less than 5.9 million years ago and stratum three formed less than 4.7 million years ago.

Our complete scaffold of the Y chromosome allowed us to refine the previous evolutionary

alternative explanation is that the weak hybridization signal is not due to the differences in monomeric repeat sequence, but it is actually caused by a reduction in overall size of the Y chromosome centromere. Although we isolated ~87 kb of centromere sequence, we did not identify a contig that spans the complete centromere, leaving the actual size of the centromere unknown. Additional sequencing work is necessary to test this alternative model.

The genetic architecture of the threespine stickleback Y chromosome is rapidly evolving

Despite the young age of the threespine stickleback Y chromosome relative to mammals,

Gene expression patterns of duplicated and translocated genes suggest this process is not entirely neutral. We observed strong testis-biased expression among genes that had duplicated and translocated to the Y chromosome, similar to patterns observed on other Y chromosomes (Carvalho et al. 2000; Skaletsky et al. 2003; Murphy et al. 2006; Hughes et al. 2010; Paria et al. 2011; Soh et al. 2014; Mahajan and Bachtrog 2017; Janečka et al. 2018).

Interestingly, we observed multiple ways that testis-biased genes can accumulate on the Y

DNA was isolated from the lysate by adding 10 mL of buffered phenol/chloroform/isoamyl-alcohol, rotating slowly at room temperature for 30 minutes, followed by centrifuging at 4°C for 1 minute at 2000 × g. Two further extractions were performed by adding 10 mL of chloroform, rotating slowly at room temperature for 1 hour, followed by centrifuging at 4°C for 1 minute at 2000 × g. DNA was precipitated using 1 mL of 3 M sodium acetate (pH 6.0) and 10 mL of cold 100% ethanol. The [...] haplotigs of the autosomal contigs by aligning all the contigs to each other using nucmer (Kurtz et al. 2004) and filtering for alignments between contigs that were at least 1 kb in length and had at least 98% sequence identity (to account for the elevated heterozygosity).

Hi-C proximity guided assembly

The X and Y chromosomes of threespine stickleback share a considerable amount of sequence homology (White et al. 2015). In order to differentiate X-linked and Y-linked Canu contigs for scaffolding, we aligned the contigs to the revised reference X chromosome sequence (Peichel et al. 2017), using nucmer in the MUMmer package (v. 4.0) (Kurtz et al. 2004). Putative X- and Y-linked contigs were separated by overall sequence identity. Putative X-linked contigs were defined as having more than 25% of the contig length aligned to the reference X chromosome with a sequence identity greater than 96%, whereas putative Y-linked contigs were defined as having a sequence identity with the reference X chromosome of less than 96%. Contigs which had less than 25% of the length aligning to the reference genome or did not align at all were retained as putative Y-linked unique sequence. Selection of the sequence identity threshold was guided by our overall ability to re-assemble the known X chromosome sequence from the set of putative X-linked PacBio Canu contigs. We tested thresholds from 92% sequence identity to 98% sequence identity and chose the threshold that resulted in the smallest size difference between the PacBio assembly and the X chromosome sequence from the reference assembly [...]

To scaffold the contigs, we used chromosome conformation capture (Hi-C) sequencing and proximity-guided assembly. Hi-C sequencing was previously conducted from a lab-reared

2004), which was made from two wild-caught males from the same Paxton Lake benthic population (Texada Island, British Columbia, Canada) used for the PacBio and Hi-C sequencing.
were used as probes to screen the CHORI-215 BAC library filters, and putative Y-specific BACs [...] used in a BLAST (blastn) search of the threespine stickleback genome assembly, which was generated from an XX female (Jones et al. 2012). All BACs for which neither end mapped to the genome or had elevated sequence divergence from the X chromosome were considered as candidate Y-chromosome BACs. These candidate BACs were verified to be Y-specific using fluorescent in situ hybridization (FISH) on male metaphase spreads, following previously described protocols (Ross and Peichel 2008; Urton et al. 2011

Phred/Phrap/Consed suite of programs were then used for assembling and editing the sequence ([...] Gordon et al. 1998). After manual inspection of the assembled sequences, finishing was performed both by resequencing plasmid subclones and by walking on plasmid subclones or the BAC clone using custom primers. All finishing reactions were performed using dGTP BigDye Terminator Chemistry (Applied Biosystems, USA). Finished

Alignment of BAC sequences and merging assemblies
Sequenced BAC inserts were aligned to the scaffolded Y chromosome using nucmer (v.
actual end of the Sanger sequenced BAC, the full length of the alignment was within 10 kb of the

Repeat families were first summarized from the RepeatMasker output using the Perl tool, "one

Only hits that had an alignment length ±10 bp of the core 187 bp repeat were retained. Average percent identity was calculated among the remaining BLAST hits. We determined a majority consensus sequence from the core 14 centromere repeat units from the initial Y chromosome assembly. The majority consensus was used to identify additional repeats in the "debris" fragments that flanked the gap in the scaffold where the Y centromere was originally identified.
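The length filter and averaging step described here can be sketched as follows. This is a minimal Python illustration, assuming the BLAST hits have already been parsed into (alignment length, percent identity) pairs; the `mean_identity` function and the example hits are hypothetical, and the 187 bp core length is the value quoted in this passage.

```python
CORE_REPEAT_BP = 187  # core repeat length quoted in this passage
TOLERANCE_BP = 10     # hits must fall within this many bp of the core length


def mean_identity(hits, core=CORE_REPEAT_BP, tol=TOLERANCE_BP):
    """Return (average percent identity, hits kept) for hits within core +/- tol bp."""
    kept = [pid for length, pid in hits if abs(length - core) <= tol]
    if not kept:
        raise ValueError("no hits passed the length filter")
    return sum(kept) / len(kept), len(kept)


# Hypothetical (alignment_length, percent_identity) pairs parsed from BLAST output.
hits = [(187, 90.0), (180, 86.0), (197, 88.0), (120, 99.0), (210, 95.0)]
avg, n_kept = mean_identity(hits)
```

In this toy input the 120 bp and 210 bp hits fall outside the window and are discarded before averaging.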

The majority consensus was aligned to the debris fragments using BLAST (

We aligned sequences to the masked revised whole-genome reference assembly ([...] (Kim et al. 2013). Default parameters were used except for the liver and testis tissues. For these tissues, we used --read-mismatches 4 and --read-edit-dist 4 to account for the greater number of SNPs in the 150 bp reads. These alignment parameters produced an overall alignment rate to the masked genome of 80.4% for the brain tissue, 68.0% in the adult liver tissue, 64.5% in the juvenile liver tissue, 64.7% for adult testis tissue, 65.5% for the juvenile testis tissue, and 68.9% for the larval tissue. Aligned reads from all samples within a [...] (Roberts et al. 2011) with default parameters. Exons from the GTF file were converted to FASTA sequences with gffread.
from the RNA-seq transcripts and all annotated protein sequences from ENSEMBL (release 95) using default parameters and est2genome=1, protein2genome=1 to infer gene predictions directly from the transcripts and protein sequences. We used these gene models to train SNAP. In addition, Augustus was trained using gene models from BUSCO conserved orthologs found on the PacBio scaffolded Y chromosome and the revised reference assembly (Glazer et al. 2015; [...] model, the previous Augustus model, and est2genome=0 and protein2genome=0. The threespine stickleback repeat library derived using RepeatModeler was used during the annotation pipeline using the rmlib option.

Marcel Häsler provided advice on the isolation of high molecular weight DNA. Amanda Bruner provided guidance in staging the threespine stickleback embryos.

The authors have no conflicts of interest to disclose.