- Open Access
Molecular evolutionary trends and feeding ecology diversification in the Hemiptera, anchored by the milkweed bug genome
- Kristen A. Panfilio1, 2Email authorView ORCID ID profile,
- Iris M. Vargas Jentzsch1View ORCID ID profile,
- Joshua B. Benoit3View ORCID ID profile,
- Deniz Erezyilmaz4, 43View ORCID ID profile,
- Yuichiro Suzuki5View ORCID ID profile,
- Stefano Colella6, 7View ORCID ID profile,
- Hugh M. Robertson8,
- Monica F. Poelchau9View ORCID ID profile,
- Robert M. Waterhouse10, 11View ORCID ID profile,
- Panagiotis Ioannidis10View ORCID ID profile,
- Matthew T. Weirauch12View ORCID ID profile,
- Daniel S. T. Hughes13,
- Shwetha C. Murali13, 14, 15,
- John H. Werren16View ORCID ID profile,
- Chris G. C. Jacobs17, 18View ORCID ID profile,
- Elizabeth J. Duncan19, 20View ORCID ID profile,
- David Armisén21,
- Barbara M. I. Vreede22View ORCID ID profile,
- Patrice Baa-Puyoulet6,
- Chloé S. Berger21,
- Chun-che Chang23, 45View ORCID ID profile,
- Hsu Chao13,
- Mei-Ju M. Chen9View ORCID ID profile,
- Yen-Ta Chen1View ORCID ID profile,
- Christopher P. Childers9View ORCID ID profile,
- Ariel D. Chipman22View ORCID ID profile,
- Andrew G. Cridge19View ORCID ID profile,
- Antonin J. J. Crumière21View ORCID ID profile,
- Peter K. Dearden19View ORCID ID profile,
- Elise M. Didion3View ORCID ID profile,
- Huyen Dinh13View ORCID ID profile,
- Harsha Vardhan Doddapaneni13View ORCID ID profile,
- Amanda Dolan16, 24,
- Shannon Dugan13View ORCID ID profile,
- Cassandra G. Extavour25, 26View ORCID ID profile,
- Gérard Febvay6View ORCID ID profile,
- Markus Friedrich27View ORCID ID profile,
- Neta Ginzburg22,
- Yi Han13View ORCID ID profile,
- Peter Heger28,
- Christopher J. Holmes3View ORCID ID profile,
- Thorsten Horn1View ORCID ID profile,
- Yi-min Hsiao23, 45View ORCID ID profile,
- Emily C. Jennings3View ORCID ID profile,
- J. Spencer Johnston29View ORCID ID profile,
- Tamsin E. Jones25View ORCID ID profile,
- Jeffery W. Jones27,
- Abderrahman Khila21View ORCID ID profile,
- Stefan Koelzer1,
- Viera Kovacova30View ORCID ID profile,
- Megan Leask19,
- Sandra L. Lee13,
- Chien-Yueh Lee9View ORCID ID profile,
- Mackenzie R. Lovegrove19,
- Hsiao-ling Lu23, 45View ORCID ID profile,
- Yong Lu31,
- Patricia J. Moore32View ORCID ID profile,
- Monica C. Munoz-Torres33View ORCID ID profile,
- Donna M. Muzny13View ORCID ID profile,
- Subba R. Palli34View ORCID ID profile,
- Nicolas Parisot6View ORCID ID profile,
- Leslie Pick31View ORCID ID profile,
- Megan L. Porter35View ORCID ID profile,
- Jiaxin Qu13View ORCID ID profile,
- Peter N. Refki21, 36,
- Rose Richter16, 37,
- Rolando Rivera-Pomar38View ORCID ID profile,
- Andrew J. Rosendale3View ORCID ID profile,
- Siegfried Roth1View ORCID ID profile,
- Lena Sachs1,
- M. Emília Santos21,
- Jan Seibert1,
- Essia Sghaier21,
- Jayendra N. Shukla34, 39View ORCID ID profile,
- Richard J. Stancliffe40, 44View ORCID ID profile,
- Olivia Tidswell19, 41,
- Lucila Traverso42,
- Maurijn van der Zee17View ORCID ID profile,
- Séverine Viala21,
- Kim C. Worley13View ORCID ID profile,
- Evgeny M. Zdobnov10,
- Richard A. Gibbs13 and
- Stephen Richards13View ORCID ID profile
© The Author(s). 2019
- Received: 2 November 2018
- Accepted: 21 February 2019
- Published: 2 April 2019
The Hemiptera (aphids, cicadas, and true bugs) are a key insect order, with high diversity for feeding ecology and excellent experimental tractability for molecular genetics. Building upon recent sequencing of hemipteran pests such as phloem-feeding aphids and blood-feeding bed bugs, we present the genome sequence and comparative analyses centered on the milkweed bug Oncopeltus fasciatus, a seed feeder of the family Lygaeidae.
The 926-Mb Oncopeltus genome is well represented by the current assembly and official gene set. We use our genomic and RNA-seq data not only to characterize the protein-coding gene repertoire and perform isoform-specific RNAi, but also to elucidate patterns of molecular evolution and physiology. We find ongoing, lineage-specific expansion and diversification of repressive C2H2 zinc finger proteins. The discovery of intron gain and turnover specific to the Hemiptera also prompted the evaluation of lineage and genome size as predictors of gene structure evolution. Furthermore, we identify enzymatic gains and losses that correlate with feeding biology, particularly for reductions associated with derived, fluid nutrition feeding.
With the milkweed bug, we now have a critical mass of sequenced species for a hemimetabolous insect order and close outgroup to the Holometabola, substantially improving the diversity of insect genomics. We thereby define commonalities among the Hemiptera and delve into how hemipteran genomes reflect distinct feeding ecologies. Given Oncopeltus’s strength as an experimental model, these new sequence resources bolster the foundation for molecular research and highlight technical considerations for the analysis of medium-sized invertebrate genomes.
- Transcription factors
- Gene structure
- Lateral gene transfer
- Gene family evolution
- Evolution of development
The number of animals with sequenced genomes continues to increase dramatically, and there are now over 100 insect species with assembled and annotated genomes . However, the majority belong to the Holometabola (e.g., flies, beetles, wasps, butterflies), the group characterized by a biphasic life history with distinct larval and adult phases separated by dramatic metamorphosis during a pupal stage. The Holometabola represent only a fraction of the full morphological and ecological diversity across the Insecta: over half of all orders are hemimetabolous. Imbalance in genomic resources limits the exploration of this diversity, including the environmental and developmental requirements of a hemimetabolous lifestyle with a progression of flightless nymphal (juvenile) instars. Addressing this paucity, we report comparative analyses based on genome sequencing of the large milkweed bug, Oncopeltus fasciatus, as a hemimetabolous representative of the larger diversity of insects.
The milkweed bug has feeding ecology traits that are both conservative and complementary to those of previously sequenced hemipterans. Its phytophagy is ancestral for the large infraorder Pentatomomorpha and representative of most extant Hemiptera . Moreover, as a seed feeder, Oncopeltus has not undergone the marked lifestyle changes associated with fluid feeding (mucivory or hematophagy), including dependence on endosymbiotic bacteria to provide nutrients lacking in the diet. Gene loss in the pea aphid, Acyrthosiphon pisum, makes it reliant on the obligate endosymbiont Buchnera aphidicola for synthesis of essential amino acids [6, 17]. Although hematophagy arose independently in Rhodnius and Cimex , their respective endosymbionts, Rhodococcus rhodnii and Wolbachia, must provide vitamins lacking in a blood diet . In contrast, the seed-feeding subfamily Lygaeinae, including Oncopeltus, is notable for the absence of prominent endosymbiotic anatomy: these bugs lack both the midgut crypts that typically house bacteria and the bacteriomes and endosymbiotic balls seen in other Lygaeidae .
As the native food source of Oncopeltus is the toxic milkweed plant, its own feeding biology has a number of interesting implications regarding detoxification and sequestration of cardenolide compounds. A prominent consequence of this diet is the bright red-orange aposematic (warning) coloration seen in Oncopeltus embryos, nymphs, and adults [20, 21]. Thus, diet, metabolism, and body pigmentation are functionally linked biological features for which one may expect changes in gene repertoires to reflect the diversity within an order, and the Hemiptera provide an excellent opportunity to explore this.
Furthermore, Oncopeltus has been an established laboratory model organism for over 60 years, with a rich experimental tradition in a wide range of studies from physiology and development to evolutionary ecology [21–23]. It is among the few experimentally tractable hemimetabolous insect species, and it is amenable to a range of molecular techniques (e.g., [24–26]). In fact, it was one of the first insect species to be functionally investigated by RNA interference (RNAi, ). RNAi in Oncopeltus is highly effective across different life history stages, which has led to a resurgence of experimental work over the past 15 years, with a particular focus on the evolution of developmentally important regulatory genes (reviewed in ).
Here, we focus on these two themes—feeding biology diversity within the Hemiptera and Oncopeltus as a research model for macroevolutionary genetics. Key insights derive from a combination of global comparative genomics and detailed computational analyses that are supported by extensive manual curation, empirical data for gene expression, sequence validation, and new isoform-specific RNAi. We thereby identify genes with potentially restricted life history expression in Oncopeltus and that are unique to the Hemiptera, clarify evolutionary patterns of zinc finger protein family expansion, categorize predictors of insect gene structure, and identify lateral gene transfer and amino acid metabolism features that correlate with feeding biology.
The genome and its assembly
Oncopeltus fasciatus genome metrics
926 Mb (mean between males and females)
1099 Mb (contigs only, 774 Mb)
106.9× raw coverage, 83.7% of reads in final assembly
genome, 32.7%; protein-coding sequence (OGS v1.2), 42%
OGS v1.1 (curated fraction)
19,690 models1 (1426 models, 7.2%)
19,465 genes (1201 genes, 6.2%)
OGS v1.2 (curated fraction)
19,809 models1 (1697 models, 8.7%)
19,616 genes (1518 genes, 7.7%)
As template DNA was prepared from dissected adults from which the gut material was removed, the resulting assembly is essentially free of contamination. Only five small scaffolds had high bacterial homology, each to a different, partial bacterial genome (Additional file 1: Supplemental Note 2.2).
The official gene set and conserved gene linkage
The official gene set (OGS) was generated by automatic annotation followed by manual curation in a large-scale effort by the research community (Additional file 1: Supplemental Notes Sections 3–4). Curation revised automatic models, added alternative isoforms and de novo models, and documented multiple models for genes split across scaffolds. We found that automatic predictions were rather conservative for hemipteran gene structure (see below). Thus, manual curation often extended gene loci as exons were added, including merging discrete automatic models (Additional file 1: Supplemental Note 4, and Table S4.4). The OGS v1.1 was generated for global analyses to characterize the gene repertoire. The latest version, OGS v1.2, primarily adds chemoreceptor genes of the ionotropic and odorant receptor classes and genes encoding metabolic enzymes. Altogether, the research community curated 1697 gene models (8.7% of OGS v1.2), including 316 de novo models (Additional file 2: Table S4.1, Additional file 1: Supplemental Notes Section 5). The majority of curated models are for genes encoding cuticular proteins (11%), chemoreceptors (19%), and developmental regulators such as transcription factors and signaling pathway components (40%, including the BMP/TGF-β, Toll/NF-κB, Notch, Hedgehog, Torso RTK, and Wnt pathways).
In addition to assessing gene model quality, manual curation of genes whose orthologs are expected to occur in syntenic clusters also validates assembly scaffolding. Complete loci could be found for single orthologs of all Hox cluster genes, where Hox3/zen and Hox4/Dfd are linked in the current assembly and have ≥ 99.9% nucleotide identity with experimentally validated sequences ([27, 33, 34], Additional file 1: Supplemental Note 5.1.b). Conserved linkage was also confirmed for the homeobox genes of the Iroquois complex, the Wnt ligands wingless and wnt10, and two linked pairs from the Runt transcription factor complex (Additional file 1: Supplemental Notes 5.1.a, 5.1.c, 5.1.i, 5.1.j). Further evidence for correct scaffold assembly comes from the curation of large, multi-exonic loci. For example, the cell polarity and cytoskeletal regulator encoded by the conserved furry gene includes 47 exons spanning a 437-kb locus, which were all correctly assembled on a single scaffold.
Gene expression profiles across the milkweed bug life cycle
The OGS also incorporates many partial and unidentified 454 transcripts, nearly trebling the transcripts with an assigned gene model or homology compared to the original study (from 9 to 26%, by blastn, e < 10−9). This included 10,130 transcripts that primarily mapped to UTRs and previously lacked recognizable coding sequence, such as for the Oncopeltus brinker ortholog, a BMP pathway component (, Additional file 1: Supplemental Note 5.1.f), and the enzyme-encoding genes CTP synthase and roquin. At the same time, the transcriptomes provided expression support for the identification of multiple isoforms in the OGS, such as for the germline determinant nanos . More generally, most OGS gene models have expression support (91% of 19,690), with 74% expressed broadly in at least 3 of 4 samples (Fig. 2b). The inclusion of a fifth dataset from a published adult library  provided only a 1% gain in expression support, indicating that with the current study the expression data volume for Oncopeltus is quite complete.
RNA-seq studies were further conducted to establish male-, female-, and nymph-specific gene sets (Fig. 2b, c, Additional file 1: Supplemental Note 2.4), from which we also infer that the published adult dataset of unspecified sex is probably male. Moreover, most genes with stage-restricted or stage-enriched expression are in our male sample (Fig. 2b, c). For example, gustatory receptor (GR) genes show noticeable restriction to the adult male and published adult (probable male) samples (n = 169 GRs: 40% no expression, 27% only expressed in these two samples), with half of these expressed in both biological replicates (52%). Interestingly, the nymphal sample is enriched for genes encoding structural cuticular proteins (94%, which is > 56% more than any other sample). This likely reflects the ongoing molting cycles, with their cyclical upregulation of chitin metabolism and cuticular gene synthesis , that are experienced by the different instars and molt cycle stages of individuals pooled in this sample. Lastly, gene sets with sex-specific enrichment across several hemipteroid species substantiate known aspects of male and female reproduction (Fig. 2c: serine-threonine kinases  or vitellogenin and other factors associated with oocyte generation, respectively). Some of these enriched genes have unknown functions and could comprise additional, novel factors associated with reproduction in Oncopeltus.
Protein orthology and hemipteran copy number comparisons
We then expanded our appraisal to the Benchmarking Universal Single-Copy Orthologs dataset of 1658 Insecta genes (BUSCO v3, ). Virtually all BUSCO genes are present in the Oncopeltus OGS (98.9%, Fig. 3b, Additional file 1: Supplemental Note 6.1). Although some genes are fragmented, the assembly has a high level of BUSCO completeness (94.6%), independent of the annotation prediction limitations that missed some exons from current gene models. Furthermore, BUSCO assessments can elucidate potential consequences of high heterozygosity, which could result in the erroneous inclusion of multiple alleles for a single gene. In fact, the fraction of duplicated BUSCO genes in Oncopeltus (1.4%) is low, compared to both the well-assembled bed bug genome (2.2%, ) and the pea aphid (4.8%), which is known to have lineage-specific duplications [6, 43]. Thus, by these quality metrics, the Oncopeltus OGS and assembly are comparable to those of fellow hemipterans, strongly supporting the use of these resources in further comparisons.
We next categorized all proteins by conservation in global, clustering-based orthology analyses (OrthoDB, [1, 44]). As in most species, half of Oncopeltus proteins are highly conserved (Fig. 3a). Moreover, 98% of all Oncopeltus protein-coding genes have homology, expression, and/or curation support (Fig. 3c). Proteins without homology include species-specific chemoreceptors and antimicrobial peptides (Additional file 1: Supplemental Note 5.1.h), as well as potentially novel or partial models. Overall, we estimate that the Oncopeltus protein repertoire is comparable to that of other insects in size and conservation. For the Hemiptera, Oncopeltus also has fewer missing orthology groups than either the kissing bug or pea aphid (Additional file 1: Table S6.1). Indeed, the pea aphid is a notable outlier, with its long branch in the phylogeny and for its large protein-coding gene content with low conservation (Fig. 3a). As more hemipteran genomes are sequenced, other species now offer less derived alternatives for phylogenomic comparisons.
Compared to the pea aphid , Oncopeltus is more conservative in presence and copy number for several signaling pathway components. In contrast to gene absences in the pea aphid, Oncopeltus retains orthologs of the EGF pathway component sprouty, the BMP receptor wishful thinking, and the hormone nuclear receptor Hr96 (Additional file 1: Supplemental Note 5.1.e). Also, whereas multiple copies were reported for the pea aphid, we find a single Oncopeltus ortholog for the BMP pathway components decapentaplegic and Medea and the Wnt pathway intracellular regulator encoded by shaggy/GSK-3, albeit with five potential isoforms of the latter (Additional file 1: Supplemental Notes 5.1.f, 5.1.j). Duplications of miRNA and piRNA gene silencing factors likewise seem to be restricted to the pea aphid, even compared to other aphid species (, Additional file 1: Supplemental Note 5.4.a). However, our survey of Oncopeltus and other hemimetabolous species reveals evidence for frequent, independent duplications of the Wnt pathway component armadillo/β-catenin (, Additional file 1: Supplemental Note 5.1.j). Curiously, Oncopeltus appears to encode fewer histone loci than any other arthropod genome and yet exhibits a similar, but possibly independent, pattern of duplications of histone acetyltransferases to those previously identified in Cimex and the pea aphid (Additional file 1: Supplemental Note 5.4.c).
On the other hand, we documented several notable Oncopeltus-specific duplications. For the BMP transducer Mad, we find evidence for three paralogs in Oncopeltus, where two occur in tandem and may reflect a particularly recent duplication (Additional file 1: Supplemental Note 5.1.f). Similarly, a tandem duplication of wnt8 appears to be unique to Oncopeltus (Additional file 1: Supplemental Note 5.1.j). More striking is the identification of six potential paralogs of cactus, a member of the Toll/NF-κB signaling pathway for innate immunity, whereas the bed bug and kissing bug each retain only a single copy (, Additional file 1: Supplemental Note 5.1.g).
Lastly, we explored hemipteran-specific orthology groups against a backdrop of 107 other insect species . What makes a bug a bug in terms of protein-coding genes? Several orthogroups contain potentially novel genes that show no homology outside the Hemiptera and await direct experimental analysis, for which the Hemiptera are particularly amenable (e.g., [5, 48–51]). Secondly, there are hemipteran-specific orthogroups of proteins with recognized functional domains and homologs in other insects, but where evolutionary divergence has led to lineage-specific subfamilies. One example is a heteropteran-specific cytochrome P450 (CYP) enzyme (EOG090W0V4B), which in Oncopeltus is expressed in all life history stages (Fig. 2b). The expansion of CYP protein families is associated with potential insecticide resistance, as specific P450s can confer resistance to specific chemicals (e.g., [52, 53]; Additional file 1: Supplemental Notes 5.3.b, 5.3.c). Hence, the identification of lineage-specific CYP enzymes can suggest potential targets for integrated pest management approaches.
Transcription factor repertoires and homeobox gene evolution
Having explored the global protein repertoire, we next focused specifically on transcription factors (TFs), which comprise a major class of proteins that has been extensively studied in Oncopeltus. This is a class of key regulators of development whose functions can diverge substantially during evolution and for which RNAi-based experimental investigations have been particularly fruitful in the milkweed bug (e.g., [27, 33, 54–56], Additional file 1: Supplemental Notes 5.1.a-e).
We were able to infer DNA binding motifs for 25% (n = 189) of Oncopeltus TFs, mostly based on data from Drosophila melanogaster (121 TFs) but also from distantly related taxa such as mammals (56 TFs). Such high conservation is further reflected in explicit orthology assignments for most proteins within several large TF families, including the homeodomain (53 of 85, 62%), basic helix-loop-helix (bHLH, 35 of 45, 78%), and forkhead box (16 of 17, 94%) families. In contrast, most C2H2 zinc finger proteins lack orthology assignment (only 22 of 360, 6%). Across species, the homeodomain and C2H2 zinc finger proteins are the two largest TF superfamilies (Fig. 4a). Given their very different rates of orthology assignment, we probed further into their pipeline predictions and the patterns of evolutionary diversification.
The number of homeodomain proteins identified by the pipeline displays a narrow normal distribution across species (Fig. 4b, mean ± standard deviation 97 ± 9), consistent with a highly conserved, slowly evolving protein family. Supporting this, many Oncopeltus homeodomain proteins that were manually curated also received a clear orthology assignment (Fig. 4c: pink), with only 4 exceptions (Fig. 4c: yellow). Only 1 case suggests a limitation of a pipeline that is not specifically tuned to hemipteran proteins (Goosecoid). Manual curation of partial or split models identified 11 further genes encoding homeodomains, bringing the actual tally in Oncopeltus to 96. Overall, we find the TF pipeline results to be a robust and reasonably comprehensive representation of these gene classes in Oncopeltus.
These analyses also uncovered a correction to the published Oncopeltus literature for the developmental patterning proteins encoded by the paralogs engrailed and invected. These genes arose from an ancient tandem duplication prior to the hexapod radiation. Their tail-to-tail orientation enables ongoing gene conversion , making orthology discrimination particularly challenging. For Oncopeltus, we find that the genes also occur in a tail-to-tail orientation and that invected retains a diagnostic alternative exon . These new data reveal that the purported Oncopeltus engrailed ortholog in previous developmental studies (e.g., [54, 59–62]) is in fact invected (Additional file 1: Supplemental Note 5.1.a).
Independent expansions of C2H2 zinc fingers within the Hemiptera
Unlike homeodomain proteins, C2H2 zinc finger (C2H2-ZF) repertoires are prominent for their large family size and variability throughout the animal kingdom , and this is further supported by our current analysis in insects. With > 350 C2H2-ZFs, Oncopeltus, the pea aphid, termite, and some mosquito species have 1.5× more members than the insect median (Fig. 4b). This is nearly half of all Oncopeltus TFs. While the expansion in mosquitoes could have a single origin within the Culicinae, the distribution in the Hemiptera, where Cimex has only 227 C2H2-ZFs, suggests that independent expansions occurred in Oncopeltus and the pea aphid. Prior to the sequencing of other hemipteran genomes, the pea aphid’s large C2H2-ZF repertoire was attributed to the expansion of a novel subfamily, APEZ, also referred to as zinc finger 271-like .
In fact, manual curation in Oncopeltus confirms the presence of a subfamily with similar characteristics to APEZ (Fig. 4c: yellow fraction). In Oncopeltus, we find >115 proteins of the ZF271 class that are characterized by numerous tandem repeats of the C2H2-ZF domain and its penta-peptide linker, with 3–45 repeats per protein.
Intriguingly, we find evidence for ongoing evolutionary diversification of this subfamily. A number of Oncopeltus ZF271-like genes occur in tandem clusters of 4–8 genes—suggesting recent duplication events. Yet, clustered genes differ in gene structure (number and size of exons), and we identified a number of probable ZF271-like pseudogenes whose open reading frames have become disrupted—consistent with high turnover. Oncopeltus ZF271-like proteins also differ in the sequence and length of the zinc finger domains among themselves and compared to aphid proteins (WebLogo analysis, ), similar to zinc finger array shuffling seen in humans . Furthermore, whole-protein phylogenetic analysis supports independent, rapid expansions in the pea aphid and Oncopeltus (Fig. 4d).
Clustered zinc finger gene expansion has long been recognized in mammals, with evidence for strong positive selection to increase both the number and diversity of zinc finger domains per protein as well as the total number of proteins . This was initially found to reflect an arms race dynamic of co-evolution between selfish transposable elements and the C2H2-ZF proteins that would repress them . In vertebrates, these C2H2-ZF proteins bind to the promoters of transposable elements via their zinc finger arrays and use their Krüppel-associated box (KRAB) domain to bind the chromatin-remodeling co-repressor KAP-1, which in turn recruits methyltransferases and deacetylases that silence the targeted promoter .
Insects do not have a direct ortholog of vertebrate KAP-1 (Additional file 1: Supplemental Note 5.4.d), and neither the aphid nor Oncopeltus ZF271-like subfamilies possess a KRAB domain or any other domain besides the zinc finger arrays. However, close molecular outgroups to this ZF271-like subfamily include the developmental repressor Krüppel  and the insulator protein CTCF  (data not shown). Like these outgroups, the Oncopeltus ZF271-like genes are strongly expressed: 98% have expression support, with 86% expressed in at least three different life history stages (Fig. 2b). Thus, the insect ZF271-like proteins may also play prominent roles in repressive DNA binding. Indeed, we find evidence for a functional methylation system in Oncopeltus (Additional file 1: Supplemental Note 5.4.c), like the pea aphid, which would provide a means of gene silencing by chromatin remodeling, albeit via mediators other than KAP-1.
However, an arms race model need not be the selective pressure that favors insect ZF271-like family expansions. Recent analyses in vertebrates identified sophisticated, additional regulatory potential by C2H2-ZF proteins, building upon original transposable element binding for new, lineage-specific and even positive gene regulation roles [65, 71, 72]. Moreover, although Cimex has half as many long terminal repeat (LTR) repetitive elements as Oncopeltus and the pea aphid, overall, we do not find a correlation between relative or absolute repetitive content and ZF271-like family expansion within the Hemiptera (see the next section).
Proportional repeat content across hemipterans
To support direct comparisons among hemipterans, we also performed our RepeatModeler analysis on the bed bug and pea aphid assemblies. Repeats comprised 36% and 31% of the respective assemblies, similar to the gap-filled value of 32% in Oncopeltus. Nevertheless, given the smaller sizes of these species’ assemblies—651 Mb in the bed bug and 542 Mb in the pea aphid—the absolute repeat content is much higher in Oncopeltus (Fig. 5b). Excluding unknown repeats, the most abundant transposable elements in Oncopeltus are LINE retrotransposons, covering 10% of the assembly (Additional file 2: Table S2.5). This is also the case in the bed bug (12%), while in the pea aphid DNA transposons with terminal inverted repeats (TIRs) are the most abundant (2% of the assembly identified here and 4% reported from manual curation in the pea aphid genome paper, ). Across species, the remaining repeat categories appear to grow proportionally with assembly size, except for simple repeats, which were the category with the largest relative increase in size after gap filling in Oncopeltus (Additional file 1: Supplemental Note 2.3). However, given the mix of data types (Illumina  and Sanger ), these patterns should be treated as hypotheses for future testing.
Lineage- and genome size-related trends in insect gene structure
Both our manual curation work and BUSCO analyses highlighted the fact that Oncopeltus genes are often comprised of many, small exons. We thus undertook a comparative analysis to determine whether this is a general feature to be considered for structural annotation of hemipteran genomes. We find that both lineage and genome size can serve as predictors of gene structure.
To test these trends, we next expanded our analysis to all manually curated exons in two species from each of three orders (Hemiptera, Coleoptera, Diptera). Here, we expect that curated exon sizes are accurate, without the need to assume that the entire gene models are complete. This large dataset corroborates our original findings, with bugs having small exons while both the median and Q3 quartile reflect larger exon sizes in beetles and flies (Fig. 6b). Notably, the median and median absolute deviation are highly similar between species pairs within the Hemiptera and Coleoptera. Meanwhile, the exon metrics within the Diptera support large protein sizes as a drosophilid-specific, rather than dipteran-wide, feature.
Surprisingly, both hemocytin and epimerase—our exemplar genes with many (up to 74) and few exons (3–8 per species), respectively—show independent losses of the same splice sites in Drosophila and Tribolium. One feature these species share is a genome size 2.4–6.0× smaller than in the other species examined here (Fig. 7: red shading). Pairwise comparisons within orders also support this trend, as the beetle and fly species with larger genomes exhibit species-specific gains compared to intron loss in their sister taxa (Fig. 7: red outlines). Thus, genome size seems to positively correlate with intron number. However, lineage is a stronger predictor of gene structure: the coleopteran and dipteran species pairs have highly similar exon size metrics despite differences in genome size (Fig. 6b). A global computational analysis over longer evolutionary distances also supports a link between genome size and intron number in arthropods, although chelicerates and insects may experience different rates of evolutionary change in these features . It will be interesting to see if the correlation with genome size is borne out in other invertebrate taxa.
The selective pressures and mechanisms of intron gain in the Hemiptera will be a challenge to uncover. While median exon size (Fig. 6b) could reflect species-specific nucleosome sizes [75, 76], this does not explain why only the Hemiptera seldom exceed this (Fig. 6b: Q3 quartile). Given the gaps in draft genome assemblies, we remain cautious about interpreting (large) intron lengths but note that many hemipteran introns are too small to have harbored a functional transposase gene (e.g., median intron size of 429 bp, n = 69 introns in hemocytin in Cimex). Such small introns could be consistent with the proliferation of non-autonomous short interspersed nuclear elements (SINEs). However, characterization of such highly divergent non-coding elements would require curated SINE libraries for insects, comparable to those generated for vertebrates and plants [75, 76]. Meanwhile, it appears that hemipteran open reading frames ≥ 160 bp are generally prevented by numerous in-frame stop codons just after the donor splice site. Most stop codons are encoded by the triplet TAA in both Oncopeltus and Cimex (data not shown), although these species’ genomes are not particularly AT rich (Table 1).
Even if introns are small, having gene loci comprised of numerous introns and exons adds to the cost of gene expression in terms of both transcription duration and mRNA processing. One could argue that a gene like hemocytin, which encodes a clotting agent, would require rapid expression in the case of wounding—a common occurrence in adult Cimex females due to the traumatic insemination method of reproduction . Thus, as our molecular understanding of comparative insect and particularly hemipteran biology deepens, we will need to increasingly consider how life history traits are manifest in genomic signatures at the structural level (e.g., Figs. 5, 6, and 7), as well as in terms of protein repertoires (Figs. 3 and 4).
Expansion after a novel lateral gene transfer event in phytophagous bugs
In addition to the need for cuticle repair, traumatic insemination may be responsible for the numerous lateral gene transfer (LGT) events predicted in the bed bug . In contrast, the same pipeline analyses  followed by manual curation predicted very few LGTs in Oncopeltus, which lacks this unusual mating behavior. Here, we have identified 11 strong LGT candidates, and we confirmed the incorporation of bacterial DNA into the milkweed bug genome for all 5 candidates chosen for empirical testing (Additional file 2: Table S2.4). Curiously, we find several LGTs potentially involved in bacterial or plant cell wall metabolism that were acquired from different bacterial sources at different times during hemipteran lineage evolution, including 2 distinct LGTs that are unique to Oncopeltus and implicated in the synthesis of peptidoglycan, a bacterial cell wall constituent (Additional file 1: Supplemental Note 2.2).
Cuticle development, structure, and warning pigmentation
The distinctive cuticle of Oncopeltus is produced through the combined action of genes that encode structural and pigmentation proteins, and the gene products that regulate their secretion at each life stage. Furthermore, the milkweed bug has been a powerful model for endocrine studies of hemimetabolous molting and metamorphosis since the 1960s [22, 80–83]. Therefore, we next focused on the presence and function of genes involved in these processes.
Molting is triggered by the release of ecdysteroids, steroid hormones that are synthesized from cholesterol by cytochrome P450 enzymes of the Halloween family , and we were able to identify these in the Oncopeltus genome (Additional file 1: Supplemental Notes 5.2.b, 5.3.b). From the ecdysone response cascade defined in Drosophila , we identified Oncopeltus orthologs of both early- and late-acting factors, including ecdysteroid hormones and their receptors. It will be interesting to see if the same regulatory relationships are conserved in the context of hemimetabolous molting in Oncopeltus. For example, E75A is required for reactivation of ecdysteroid production during the molt cycle in Drosophila larvae  and likely operates similarly in Oncopeltus, since Of-E75A RNAi prevents fourth-instar nymphs from molting to the fifth instar (H. Kelstrup and L. Riddiford, unpublished data).
Regulators such as Broad initiate the transcription of a large battery of genes that encode the structural components of the cuticle needed at each molt, consistent with our expression analyses (Fig. 2b, c, discussed above). We identified 173 genes encoding putative cuticle structural proteins in Oncopeltus (Additional file 1: Supplemental Note 5.2.c). Similar to other insects, the CPR family, with the RR-1 (soft cuticle), RR-2 (hard cuticle), and unclassifiable types, constituted the largest cuticle protein family. While several protein families are similar in size to those of other insects (CPAP1, CPAP3, and TWDL: Additional file 1: Table S5.12), we found a slight expansion in the Oncopeltus CPF family (Additional file 1: Figure S5.14). For cuticle production, similar to the bed bug and the Asian longhorned beetle [12, 30], we identified a single chitin synthase gene with conserved alternative splice isoforms, which suggests that chitin synthase 2 is a duplication specific to only certain beetle and fly lineages within the Holometabola .
A major characteristic of the milkweed bug is the distinctive red-orange and black aposematic (warning) coloration within the cuticle and epidermis that deters predators (e.g., Figs. 1 and 9, [20, 21]). For black coloration, the melanin synthesis pathway known from holometabolous insects (e.g., [94, 95]) is conserved at the sequence (Additional file 1: Figure S5.15) and functional [96, 97] level in Oncopeltus, supporting conservation in hemimetabolous lineages as well. In contrast, production of the primary warning coloration, pteridine red erythropterin , has not been as extensively studied and remains an open avenue for hemimetabolous research. Pterin pigments are synthesized from GTP through a series of enzymatic reactions . Thus far in Oncopeltus, we could identify orthologs of punch, which encodes a GTP cyclohydrolase , and sepia, which is required for the synthesis of the red eye pigment drosopterin . The bright red color of Oncopeltus eggs may in part reflect chemical protection transmitted parentally . Thus, further identification of pigmentation genes will provide fitness indicators for maternal contributions to developmental success under natural conditions (i.e., the presence of egg predators).
Chemoreception and metabolism in relation to feeding biology
Aposematic pigmentation advertises the fact that toxins in the milkweed diet are incorporated into the insects themselves, a metabolic feat that was independently acquired in Oncopeltus and the monarch butterfly (Danaus plexippus), which shares this food source and body coloration [36, 103]. Moreover, given the fundamental differences between phytophagous, mucivorous, and hematophagous diets, we investigated to what extent differences in feeding ecology across hemipterans are represented in their chemoreceptor and metabolic enzyme repertoires.
Numbers of chemoreceptor genes/proteins per family in selected insect species. In some cases, the number of proteins is higher than the number of genes due to an unusual form of alternative splicing, which is particularly notable for the Oncopeltus GRs. Data are shown for four Hemiptera as well as Drosophila melanogaster, the body louse Pediculus humanus, and the termite Zootermopsis nevadensis [11, 12, 104, 108–110, 168]
Oncopeltus fasciatus 1
Cimex lectularius 1,2
Rhodnius prolixus 1,2
Acyrthosiphon pisum 3
Pediculus humanus 2
Hemipteran ArthropodaCyc database summaries. Overview statistics for the newly created database for Oncopeltus fasciatus (Ofas) in comparison with public databases for Rhodnius prolixus (Rpro), Acyrthosiphon pisum (Apis), and Myzus persicae (Mper) available from . Based on OGS v1.1
Gene set ID
RproC1.1 (Built on RproC1 assembly)
OGS v2.1b (Built on Acyr_2.0 assembly)
Clone G006 v1.0
Clone O v1.0
CycADS database ID
Hemipteran ArthropodaCyc annotations of metabolic genes. Taxonomic abbreviations are as in Table 3
EC1 present in the genome
EC unique to this genome2
EC missing only in this genome2
Amino acid metabolism (KEGG)
EC present in the genome
EC unique to this genome2
EC missing only in this genome2
EC unique to this genome3
EC missing only in this genome3
Other enzymes are also present only in the milkweed bug compared to the other examined hemipterans (Additional file 2: Table S6.12). Like other insects , Oncopeltus retains the ability to degrade tyrosine (Tyr). This pathway was uniquely lost in the aphids, where this essential amino acid is jointly synthesized—and consumed—by the aphid host and its endosymbiotic bacteria [6, 7, 17, 117]. Conversely, a gain specific to the milkweed bug lineage was the duplication of the Na+/K+ ATPase alpha subunits whose amino acid substitutions confer resistance to milkweed cardenolides [36, 118]. In the Oncopeltus genome, we find support for the recent nature of these duplications: the genes encoding subunits ATPα1B and ATPα1C occur as a tandem duplication, notably on a scaffold that also harbors one of the clustered ZF271-like gene expansions (see above).
The integrated genomic and transcriptomic resources presented here for the milkweed bug Oncopeltus fasciatus (Figs. 2 and 5) underpin a number of insights into evolutionary and developmental genomics. Our macroevolutionary comparisons across insect orders, now extended to the hemimetabolous Hemiptera, reveal unexpected patterns of molecular evolution. We also show how hemipteran feeding ecology and suites of related biological characters are reflected in the genome.
The gene structure trends we identified, with lineage predominating over genome size as a predictor and with many intron gains in the hemipteroid lineage (Figs. 6 and 7), offer initial parameters and hypotheses for the Hemiptera, Coleoptera, and Diptera. Such ordinal-level parameters can be evaluated against new species’ data and also inform customized pipelines for automated gene model predictions. At the same time, it will be interesting to explore the ramifications of hemipteroid intron gains, as there are few documented lineages with episodic intron gain . For example, possessing more, small exons may provide greater scope to generate protein modularity via isoforms based on alternative exon usage . Furthermore, with the larger genome sizes and lower gene densities of hemipteroids compared to the well-studied Hymenoptera, it remains open whether hemipteroid gene and intron size may also correlate with recombination rates .
Our analyses also highlight new directions for future experimental research, building on Oncopeltus’s long-standing history as a laboratory model and its active research community in the modern molecular genetics era (e.g., Fig. 9, [25–27]). Functional testing will clarify the roles of genes we have identified as unique to the Hemiptera, including those implicated in chemical protection, bacterial and plant cell wall metabolism, or encoding wholly novel proteins (Figs. 3 and 8, see also Additional file 1: Supplemental Note 2.2). Meanwhile, the prominent and species-specific expansions specifically of ZF271-like zinc fingers (Fig. 4), combined with the absence of the co-repressor KAP-1 in insects, argues for investigation into alternative interaction partners, which could clarify the nature of these zinc fingers’ regulatory roles and their binding targets.
One key output of this study is the generation of a metabolism database for Oncopeltus, contributing to the ArthropodaCyc collection (Table 3). In addition to comparisons with other species (Fig. 10), this database can also serve as a future reference for studies that use Oncopeltus as an ecotoxicology model (e.g., ). While we have primarily focused on feeding ecology in terms of broad comparisons between phytophagy and fluid feeding, Oncopeltus is also poised to support future work on nuances among phytophagous species. Despite its milkweed diet in the wild, the lab strain of Oncopeltus has long been adapted to feed on sunflower seeds, demonstrating a latent capacity for more generalist phytophagy . This potential may also be reflected in a larger gustatory receptor repertoire than would be expected for an obligate specialist feeder (Table 2). Thus, Oncopeltus can serve as a reference species for promiscuously phytophagous pests such as the stink bug. Finally, we have identified a number of key genes implicated in life history trade-offs in Oncopeltus, for traits such as cardenolide tolerance, pigmentation, and plasticity in reproduction under environmental variation. The genome data thus represent an important tool to elucidate the proximate mechanisms of fundamental aspects of life history evolution in both the laboratory and nature.
(More information is available in Additional file 1: Supplemental Notes.)
Milkweed bug strain, rearing, and DNA/RNA extraction
The milkweed bug Oncopeltus fasciatus (Dallas), Carolina Biological Supply strain (Burlington, NC, USA), was maintained in a laboratory colony under standard husbandry conditions (sunflower seed and water diet, 25 °C, 12:12 light-dark photoperiod). Voucher specimens for an adult female (record # ZFMK-TIS-26324) and adult male (record # ZFMK-TIS-26325) have been preserved in ethanol and deposited in the Biobank of the Centre for Molecular Biodiversity Research, Zoological Research Museum Alexander Koenig, Bonn, Germany (https://www.zfmk.de/en/biobank).
Genomic DNA was isolated from individual, dissected adults using the Blood & Cell Culture DNA Midi Kit (G/100) (Qiagen Inc., Valencia, CA, USA). Total RNA was isolated from individual, dissected adults and from pooled, mixed-instar nymphs with TRIzol Reagent (Invitrogen/Thermo Fisher Scientific, Waltham, MA, USA). Dissection improved the accessibility of muscle tissue by disrupting the exoskeleton, and the gut material was removed.
Genome size calculations (flow cytometry, k-mer estimation)
Genome size estimations were obtained by flow cytometry with Hare and Johnston’s protocol . Four to five females and males each from the Carolina Biological Supply lab strain and a wild strain (collected from Athens, GA, USA; GPS coordinates: 33° 56′ 52.8216″ N, 83° 22′ 38.3484″ W) were measured (see also Additional file 1: Supplemental Note 2.1.a). At the bioinformatic level, we attempted to estimate the genome size by k-mer spectrum distribution analysis for a range of k = 15 to 34 counted with Jellyfish 2.1.4  and bbmap , graphing these counts against the frequency of occurrence of k-mers (depth) and calculating genome size based on the coverage at the peak of the distribution (Additional file 1: Supplemental Note 2.1.b).
Genome sequencing, assembly, annotation, and official gene set overview
Library preparation, sequencing, assembly, and automatic gene annotation were conducted at the Baylor College of Medicine Human Genome Sequencing Center (as in [12, 30]). About 1.1 billion 100-bp paired-end reads generated on an Illumina HiSeq2000s machine were assembled using ALLPATHS-LG , from two paired-end (PE) and two mate pair (MP) libraries specifically designed for this algorithm (Additional file 1: Supplemental Note 1). Three libraries were sequenced from an individual adult male (180- and 500-bp PE, 3-kb MP), with the fourth from an individual adult female (8–10-kb MP). The final assembly, “Ofas_1.0” (see metrics in Table 1), has been deposited in GenBank (assembly accession GCA_000696205.1).
Automated annotation of protein-coding genes was performed using a Maker 2.0 annotation pipeline  tuned specifically for arthropods (Additional file 1: Supplemental Note 3). These gene predictions were used as the starting point for manual curation via the Apollo v.1.0.4 web browser interface , and automatic and manual curations were compiled to generate the OGS (see also Additional file 1: Supplemental Note 4). The current version of the gene set, OGS v1.2, is deposited at GenBank as an “annotation-only” update to the Whole Genome Shotgun project (accession JHQO00000000). Here, we describe version JHQO02000000. The annotations can be downloaded from NCBI’s ftp site, ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/000/696/205/GCA_000696205.1_Ofas_1.0/. The annotations are also available through the i5K Workspace@NAL , https://i5k.nal.usda.gov/data/Arthropoda/oncfas-(Oncopeltus_fasciatus)/Ofas_1.0/2.Official%20or%20Primary%20Gene%20Set/GCA_000696205.1_Ofas_1.0/.
Databases of the genome assembly (definitive Illumina-only: Table 1, Additional file 1: Supplemental Note 1.3; provisional hybrid Illumina-PacBio: see below, Additional file 1: Supplemental Note 2.3), Maker automatic gene predictions (Additional file 1: Supplemental Note 3), and OGS v1.1 (Table 1, Additional file 1: Supplemental Note 4) are available through the i5K Workspace@NAL, and the Ag Data Commons data access system of the US Department of Agriculture’s (USDA) National Agricultural Library as individual citable databases [129–132].
Repeat content analysis
Repetitive regions were identified in the Oncopeltus genome assembly with RepeatModeler Open-1.0.8  based on a species-specific repeat library generated de novo with RECON , RepeatScout , and Tandem Repeats Finder . Then, RepeatMasker Open-4.0  was used to mask the repeat sequences based on the RepeatModeler library. Given the fragmented nature of the assembly, we attempted to fill and close the assembly gaps by sequencing additional material, generating long reads with single molecule real-time sequencing on a PacBio RS II machine (estimated coverage of 8x). Gap filling on the Illumina assembly scaffolds was performed with PBJelly version 13.10.22, and the resulting assembly  was used for repeat content estimation and comparison with Cimex lectularius and Acyrthosiphon pisum (see also Additional file 1: Supplemental Note 2.3).
Total RNA from three distinct life history samples (pooled, mixed-instar nymphs; an adult male; an adult female) was also sequenced on an Illumina HiSeq2000s machine, producing a total of 72 million 100-bp paired-end reads (Additional file 1: Supplemental Note 1.3, Additional file 2: Table S1.1; GenBank Bioproject: PRJNA275739). These expression data were used to support the generation of the OGS at different stages of the project: as input for the evidence-guided automated annotation with Maker 2.0 (Additional file 1: Supplemental Note 3), as expression evidence tracks in the Apollo browser to support the community curation of the OGS, and, once assembled into a de novo transcriptome, as a point of comparison for quality control of the OGS.
The raw RNA-seq reads were pre-processed by filtering out low-quality bases (phred score < 30) and Truseq adapters with Trimmomatic-0.30. Further filtering removed ribosomal and mitochondrial RNA sequences with Bowtie 2 , based on a custom library built with all hemipteran ribosomal and mitochondrial RNA accessions from NCBI as of 7th February 2014 (6069 accessions). The pooled, filtered reads were mapped to the genome assembly with Tophat2-PE on CyVerse . A second set of RNA-seq reads from an earlier study (“published adult” dataset, ) was also filtered and mapped in the same fashion, and both datasets were loaded into the Oncopeltus Apollo instance as evidence tracks (under the track names “pooled RNA-seq - cleaned reads” and “RNA-seq raw PE reads Andolfatto et al”, respectively).
Additionally, a de novo transcriptome was generated from our filtered RNA-seq reads (pooled from all three samples prepared in this study) using Trinity  and TransDecoder  with default parameters. This transcriptome is referred to as “i5K,” to distinguish it from a previously published maternal and early embryonic transcriptome for Oncopeltus (referred to as “454”, ). Both the i5K and 454 transcriptomes were mapped to the genome assembly with GMAP v. 2014-05-15 on CyVerse. These datasets were also loaded into the Apollo browser as evidence tracks to assist in manual curation.
Life history stage-specific and sex-specific expression analyses in hemipteroids
Transcript expression of the OGS v1.1 genes was estimated by running RSEM2  on the filtered RNA-seq datasets for the three i5K postembryonic stages against the OGS v1.1 cDNA dataset. Transcript expression was then based on the transcripts per million (TPM) value. The TPM values were processed by adding a value of 1 (to avoid zeros) and then performing a log2 transformation. The number of expressed genes per RNA-seq library was compared for TPM cutoffs of > 1, > 0.5, and > 0.25. A > 0.25 cutoff was chosen, which reduced the number of expressed genes by 6.6% compared to a preliminary analysis based on a simple cutoff of ≥ 10 mapped reads per transcript, while the other TPM cutoffs were deemed too restrictive (reducing the expressed gene set by > 10%). This analysis was also applied to the “published adult” dataset . To include the embryonic stages in the comparison, transcripts from the 454 transcriptome were used as blastn queries against the OGS v1.1 cDNA dataset (cutoff e value < 10−5). The results from all datasets were converted to a binary format to generate Venn diagrams (Fig. 2b).
Statistically significant sex-specific and developmental stage-specific gene enrichment was determined from RNA-seq datasets according to published methods [143, 144], with modifications. Data from Oncopeltus (see the previous methods section, Bioproject: PRJNA275739) were compared between stages and pairwise with the hemipterans Cimex lectularius, PRJNA275741; Acyrthosiphon pisum, PRJNA209321; and Pachypsylla venusta, PRJNA275248; as well as with the hemipteroid Frankliniella occidentalis (Thysanoptera), PRJNA203209 (see also Fig. 2c, Additional file 1: Supplemental Note 2.4).
Protein orthology assessments via OrthoDB and BUSCO analyses
Global transcription factor identification
Likely transcription factors (TFs) were identified by scanning the amino acid sequences of predicted protein-coding genes for putative DNA binding domains (DBDs), and when possible, the DNA binding specificity of each TF was predicted using established procedures . Briefly, all protein sequences were scanned for putative DBDs using the 81 Pfam  models listed in Weirauch and Hughes  and the HMMER tool , with the recommended detection thresholds of Per-sequence Eval < 0.01 and Per-domain conditional Eval < 0.01. Each protein was classified into a family based on its DBDs and their order in the protein sequence (e.g., bZIPx1, AP2x2, Homeodomain+Pou). The resulting DBD amino acid sequences were then aligned within each family using Clustal Omega , with default settings. For protein pairs with multiple DBDs, each DBD was aligned separately. From these alignments, the sequence identity was calculated for all DBD sequence pairs (i.e., the percent of amino acid residues that are identical across all positions in the alignment). Using previously established sequence identity thresholds for each family , the predicted DNA binding specificities were mapped by simple transfer. For example, the DBD of OFAS001246-RA is 98% identical to the Drosophila melanogaster Bric a Brac 1 (Bab1) protein. Since the DNA binding specificity of Bab1 has already been experimentally determined, and the cutoff for the Pipsqueak family TFs is 85%, we can infer that OFAS001246-RA will have the same binding specificity as Drosophila Bab1.
Double-stranded RNA (dsRNA) was designed to target the final, unique exon of the broad isoforms Z2, Z3, and Z4. A portion of the coding sequence for the zinc finger region from these exons (179 bp, 206 bp, and 216 bp, respectively) was cloned into a plasmid vector and used as a template for in vitro RNA synthesis, using the gene-specific primer pairs: Of-Z2_fwd: 5′-ATGTGGCAGACAAGCATGCT-3′, Of-Z2_rev: 5′-CTAAAATTTGACATCAGTAGGC-3′; Of-Z3_fwd: 5′-CCTTCTCCTGTTACTACTCAC-3′, Of-Z3_rev: 5′-TTATATGGGCGGCTGTCCAA-3′; and Of-Z4_fwd: 5′-AACACTGACCTTGGTTACACA-3′, Of-Z4_rev: 5′-TAGGTGGAGGATTGCTAAAATT-3′. Two separate transcription reactions (one for each strand) were performed using the Ambion MEGAscript kit (Ambion, Austin, TX, USA). The reactions were purified by phenol/chloroform extraction followed by precipitation as described in the MEGAscript protocol. The separate strands were re-annealed in a thermocycler as described previously . Nymphs were injected with a Hamilton syringe fitted with a 32-gauge needle as described . The concentration of Of-Z2, Of-Z3, and Of-Z4 dsRNA was 740 ng/μl, 1400 ng/μl, and 1200 ng/μl, respectively. All nymphs were injected within 8 h of the molt to the fourth (penultimate juvenile) instar (n ≥ 12 per treatment, see Fig. 9). Fore- and hindwings were then dissected from adults and photographed at the same scale as wings from wild type, uninjected controls.
CycADS annotation and OncfaCyc database generation
We used the Cyc Annotation Database System (Cyc-ADS ), an automated annotation management system, to integrate protein annotations from different sources into a Cyc metabolic network reconstruction that was integrated into the ArthropodaCyc database. Using our CycADS pipeline, Oncopeltus fasciatus proteins from the official gene set OGS v1.1 were annotated using different methods—including KAAS , PRIAM , Blast2GO [152, 153], and InterProScan with several approaches —to obtain EC and GO numbers. All annotation information data were collected in the CycADS SQL database and automatically extracted to generate appropriate input files to build or update BioCyc databases  using the Pathway Tools software . The OncfaCyc database, representing the metabolic protein-coding genes of Oncopeltus, was thus generated and is now included in the ArthropodaCyc database, a collection of arthropod metabolic network databases (, http://arthropodacyc.cycadsys.org/).
We thank Dorith Rotenberg (Kansas State University, currently North Carolina State University, USA) and Michael Sparks (Agricultural Research Service, United States Department of Agriculture, USA) for generously making available the unpublished genome assemblies of the fellow hemipteroid i5K species Frankliniella occidentalis and Halyomorpha halys, respectively, for use in specific analyses presented here. Similarly, we thank Hans Kelstrup and Lynn Riddiford (Janelia Farm Research Campus, HHMI, USA) for sharing unpublished data on Of-E75A RNAi. We thank George Coupland (Max Planck Institute for Plant Breeding Research, Cologne, Germany) as well as Lisa Czaja, Kurt Steuber, and Bruno Huettel (Max Planck Genome Centre Cologne, Germany) for conducting the PacBio sequencing and providing support with data handling. We also thank Oliver Niehuis (Albert Ludwig University, Freiburg, Germany) and Alexander Klassmann (University of Cologne, Germany) for discussions on k-mer and gene structure analyses, respectively, Sarah Kingan (University of Rochester, USA) for assistance with LGT phylogenies, as well as Jeanne Wilbrandt (Zoologisches Forschungsmuseum Alexander Koenig, Bonn, Germany) for comments on the manuscript.
Funding for genome sequencing, assembly and automated annotation was provided by the National Institutes of Health (NIH) grant U54 HG003273 (NHGRI) to RAG. The i5K pilot project (https://www.hgsc.bcm.edu/arthropods) assisted in sequencing of the Oncopeltus fasciatus genome. We also acknowledge funding for the project from German Research Foundation (DFG) grants PA 2044/1-1 and SFB 680 project A12 to KAP. Support for specific analyses was provided by the Swiss National Science Foundation with grant 31003A_143936 to EMZ and PP00P3_170664 to RMW; the European Research Council grant ERC-CoG #616346 to AK; DFG grant SFB 680 project A1 to SiR; the National Science Foundation with grant US NSF DEB1257053 to JHW; NIH grant R01GM113230 (NIGMS) to LP; and by NIH grants 5R01GM080203 (NIGMS) and 5R01HG004483 (NHGRI) and by the Director, Office of Science, Office of Basic Energy Sciences, U.S. Department of Energy, Contract No. DE-AC02-05CH11231 to MCMT.
Availability of data and materials
All sequence data are publically available at the NCBI, bioproject number PRJNA229125 and in the USDA Ag Data Commons data access system . In addition, assembled scaffolds, gene models, and a browser are available at the National Agricultural Library ([130–133], https://i5k.nal.usda.gov/Oncopeltus_fasciatus). The OncfaCyc metabolism database is available within the ArthropodaCyc collection (http://arthropodacyc.cycadsys.org/).
KAP and StR conceived the project. KAP managed and coordinated the project. KAP and SK provided the specimens for sequencing and performed the DNA and RNA extractions. StR, SD, SLL, HC, HVD, HD, YH, JQ, SCM, DSTH, KCW, DMM, and RAG constructed the libraries and performed the sequencing. StR, SCM, and DSTH performed the genome assembly and automated gene prediction. IMVJ, JSJ, and PJM analyzed the genome size. IMVJ, VK, PH, and KAP contributed to the repetitive content analyses. AD, RR, JHW, KAP, and SK performed the bacterial scaffold detection and LGT analyses. MCMT developed the Apollo software. KAP, IMVJ, MCMT, CPC, C-YL, and MFP implemented the Apollo-based manual curation. KAP, IMVJ, JBB, DE, YS, HMR, DA, CGCJ, BMIV, EJD, CSB, C-CC, Y-TC, ADC, AGC, AJJC, PKD, EMD, CGE, MF, NG, TH, Y-MH, ECJ, TEJ, JWJ, AK, ML, MRL, H-LL, YL, SRP, LP, MLP, PNR, RR-P, SiR, LS, MES, JS, ES, JNS, OT, LT, MVDZ, SV, and AJR participated in the manual curation and contributed to the supplemental notes. IMVJ, KAP, DSTH, M-JMC, CPC, C-YL, and MFP performed the curation quality control and generated the OGS. IMVJ, KAP, CJH, and JBB generated the de novo transcriptomes and performed the life history stage expression analyses. RMW, PI, KAP, and EMZ performed the orthology and phylogenomic analyses. MTW, KAP, IMVJ, PH, and BMIV performed the transcription factor analyses. EJD conducted the analyses of DNA methylation. KAP, PH, and RJS contributed to the comparative analyses of gene structure. DE conducted the RNAi experiments. SC, PB-P, GF, and NP generated and performed the comparative analyses on the OncfaCyc database. KAP, IMVJ, JBB, DE, YS, SC, HMR, and MTW wrote the manuscript. KAP, IMVJ, JBB, DE, YS, SC, HMR, MFP, RMW, PI, MTW, StR, PJM, and AK edited the manuscript. IMVJ and KAP organized the supplementary materials. All authors approved the final manuscript.
Ethics approval and consent to participate
The authors declare that they have no competing interests.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Open AccessThis article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
- Zdobnov EM, Tegenfeldt F, Kuznetsov D, Waterhouse RM, Simão FA, Ioannidis P, Seppey M, Loetscher A, Kriventseva EV. OrthoDB v9.1: cataloging evolutionary and functional annotations for animal, fungal, plant, archaeal, bacterial and viral orthologs. Nucleic Acids Res. 2017;45:D744–9.PubMedGoogle Scholar
- Huang DY, Bechly G, Nel P, Engel MS, Prokop J, Azar D, Cai CY, van de Kamp T, Staniczek AH, Garrouste R, et al. New fossil insect order Permopsocida elucidates major radiation and evolution of suction feeding in hemimetabolous insects (Hexapoda: Acercaria). Sci Rep. 2016;6:23004.PubMedPubMed CentralGoogle Scholar
- Misof B, Liu S, Meusemann K, Peters RS, Donath A, Mayer C, Frandsen PB, Ware J, Flouri T, Beutel RG, et al. Phylogenomics resolves the timing and pattern of insect evolution. Science. 2014;346:763–7.PubMedGoogle Scholar
- Grimaldi D, Engel MS. Evolution of the insects. Cambridge: Cambridge University Press; 2005.Google Scholar
- Panfilio KA, Angelini DR. By land, air, and sea: hemipteran diversity through the genomic lens. Curr Opin Insect Sci. 2018;25:106–15.PubMedGoogle Scholar
- The International Aphid Genomics Consortium. Genome sequence of the pea aphid Acyrthosiphon pisum. PLoS Biol. 2010;8:e1000313.PubMed CentralGoogle Scholar
- Mathers TC, Chen Y, Kaithakottil G, Legeai F, Mugford ST, Baa-Puyoulet P, Bretaudeau A, Clavijo B, Colella S, Collin O, et al. Rapid transcriptional plasticity of duplicated gene clusters enables a clonally reproducing aphid to colonise diverse plant species. Genome Biol. 2017;18:27.PubMedPubMed CentralGoogle Scholar
- Wenger JA, Cassone BJ, Legeai F, Johnston JS, Bansal R, Yates AD, Coates BS, Pavinato VA, Michel A. Whole genome sequence of the soybean aphid, Aphis glycines. Insect Biochem Mol Biol. 2017; in press.Google Scholar
- Sloan DB, Nakabachi A, Richards S, Qu J, Murali SC, Gibbs RA, Moran NA. Parallel histories of horizontal gene transfer facilitated extreme reduction of endosymbiont genomes in sap-feeding insects. Mol Biol Evol. 2014;31:857–71.PubMedPubMed CentralGoogle Scholar
- Xue J, Zhou X, Zhang C-X, Yu L-L, Fan H-W, Wang Z, Xu H-J, Xi Y, Zhu Z-R, Zhou W-W, et al. Genomes of the rice pest brown planthopper and its endosymbionts reveal complex complementary contributions for host adaptation. Genome Biol. 2014;15:521.PubMedPubMed CentralGoogle Scholar
- Mesquita RD, Vionette-Amaral RJ, Lowenberger C, Rivera-Pomar R, Monteiro FA, Minx P, Spieth J, Carvalho AB, Panzera F, Lawson D, et al. Genome of Rhodnius prolixus, an insect vector of Chagas disease, reveals unique adaptations to hematophagy and parasite infection. Proc Natl Acad Sci U S A. 2015;112:14936–41.PubMedPubMed CentralGoogle Scholar
- Benoit JB, Adelman ZN, Reinhardt K, Dolan A, Poelchau M, Jennings EC, Szuter EM, Hagan RW, Gujar H, Shukla JN, et al. Unique features of a global human ectoparasite identified through sequencing of the bed bug genome. Nat Commun. 2016;7:10165.PubMedPubMed CentralGoogle Scholar
- Rosenfeld JA, Reeves D, Brugler MR, Narechania A, Simon S, Durrett R, Foox J, Shianna K, Schatz MC, Gandara J, et al. Genome assembly and geospatial phylogenomics of the bed bug Cimex lectularius. Nat Commun. 2016;7:10164.PubMedPubMed CentralGoogle Scholar
- Sparks ME, Shelby KS, Kuhar D, Gundersen-Rindal DE. Transcriptome of the invasive brown marmorated stink bug, Halyomorpha halys (Stal) (Heteroptera: Pentatomidae). PLoS One. 2014;9:e111646.PubMedPubMed CentralGoogle Scholar
- Ioannidis P, Lu Y, Kumar N, Creasy T, Daugherty S, Chibucos MC, Orvis J, Shetty A, Ott S, Flowers M, et al. Rapid transcriptome sequencing of an invasive pest, the brown marmorated stink bug, Halyomorpha halys. BMC Genomics. 2014;15:738.PubMedPubMed CentralGoogle Scholar
- Li H, Leavengood JM Jr, Chapman EG, Burkhardt D, Song F, Jiang P, Liu J, Zhou X, Cai W. Mitochondrial phylogenomics of Hemiptera reveals adaptive innovations driving the diversification of true bugs. Proc Biol Sci. 2017;284:20171223.Google Scholar
- Wilson ACC, Ashton PD, Charles H, Colella S, Febvay G, Jander G, Kushlan PF, Macdonald SJ, Schwartz JF, Thomas GH, Douglas AE. Genomic insight into the amino acid relations of the pea aphid, Acyrthosiphon pisum, with its symbiotic bacterium Buchnera aphidicola. Insect Mol Biol. 2010;19(Suppl 2):249–58.PubMedGoogle Scholar
- Eichler S, Schaub GA. Development of symbionts in triatomine bugs and the effects of infections with trypanosomatids. Exp Parasitol. 2002;100:17–27.PubMedGoogle Scholar
- Matsuura Y, Kikuchi Y, Hosokawa T, Koga R, Meng X-Y, Kamagata Y, Nikoh N, Fukatsu T. Evolution of symbiotic organs and endosymbionts in lygaeid stinkbugs. The ISME Journal. 2012;6:397–409.PubMedGoogle Scholar
- Berenbaum MR, Miliczky E. Mantids and milkweed bugs - efficacy of aposematic coloration against invertebrate predators. Am Midl Nat. 1984;111:64–8.Google Scholar
- Burdfield-Steel ER, Shuker DM. The evolutionary ecology of the Lygaeidae. Ecol Evol. 2014;4:2278–301.PubMedPubMed CentralGoogle Scholar
- Lawrence PA. Mitosis and the cell cycle in the metamorphic moult of the milkweed bug Oncopeltus fasciatus; a radioautographic study. J Cell Sci. 1968;3:391–404.Google Scholar
- Chipman AD. Oncopeltus fasciatus as an evo-devo research organism. Genesis. 2017;55:e23020.Google Scholar
- Panfilio KA. Late extraembryonic development and its zen-RNAi-induced failure in the milkweed bug Oncopeltus fasciatus. Dev Biol. 2009;333:297–311.PubMedGoogle Scholar
- Panfilio KA, Roth S. Epithelial reorganization events during late extraembryonic development in a hemimetabolous insect. Dev Biol. 2010;340:100–15.PubMedGoogle Scholar
- Sharma AI, Yanes KO, Jin L, Garvey SL, Taha SM, Suzuki Y. The phenotypic plasticity of developmental modules. Evodevo. 2016;7:15.PubMedPubMed CentralGoogle Scholar
- Hughes CL, Kaufman TC. RNAi analysis of Deformed, proboscipedia and Sex combs reduced in the milkweed bug Oncopeltus fasciatus: novel roles for Hox genes in the hemipteran head. Development. 2000;127:3683–94.PubMedGoogle Scholar
- Wolfe SL, John B. The organization and ultrastructure of male meiotic chromosomes in Oncopeltus fasciatus. Chromosoma. 1965;17:85–103.PubMedGoogle Scholar
- Messthaler H, Traut W. Phases of sex chromosome inactivation in Oncopeltus fasciatus and Pyrrhocoris apterus (Insecta, Heteroptera). Caryologia. 1975;28:501–10.Google Scholar
- McKenna DD, Scully ED, Pauchet Y, Hoover K, Kirsch R, Geib SM, Mitchell RF, Waterhouse RM, Ahn SJ, Arsala D, et al. Genome of the Asian longhorned beetle (Anoplophora glabripennis), a globally significant invasive species, reveals key functional and evolutionary innovations at the beetle-plant interface. Genome Biol. 2016;17:227.PubMedPubMed CentralGoogle Scholar
- Simpson JT. Exploring genome characteristics and sequence quality without a reference. Bioinformatics. 2014;30:1228–35.PubMedPubMed CentralGoogle Scholar
- Hanrahan SJ, Johnston JS. New genome size estimates of 134 species of arthropods. Chromosom Res. 2011;19:809–23.Google Scholar
- Panfilio KA, Liu PZ, Akam M, Kaufman TC. Oncopeltus fasciatus zen is essential for serosal tissue function in katatrepsis. Dev Biol. 2006;292:226–43.Google Scholar
- Tian X, Xie Q, Li M, Gao C, Cui Y, Xi L, Bu W. Phylogeny of pentatomomorphan bugs (Hemiptera-Heteroptera:Pentatomomorpha) based on six Hox gene fragments. Zootaxa. 2011;2888:57–68.Google Scholar
- Ewen-Campen B, Shaner N, Panfilio KA, Suzuki Y, Roth S, Extavour CG. The maternal and early embryonic transcriptome of the milkweed bug Oncopeltus fasciatus. BMC Genomics. 2011;12:61.PubMedPubMed CentralGoogle Scholar
- Zhen Y, Aardema ML, Medina EM, Schumer M, Andolfatto P. Parallel molecular evolution in an herbivore community. Science. 2012;337:1634–7.PubMedPubMed CentralGoogle Scholar
- Robertson HM. The insect chemoreceptor superfamily in Drosophila pseudoobscura: molecular evolution of ecologically-relevant genes over 25 million years. J Insect Sci. 2009;9:18.PubMedPubMed CentralGoogle Scholar
- Robertson HM. Taste: independent origins of chemoreception coding systems? Curr Biol. 2001;11:R560–2.PubMedGoogle Scholar
- Jazwinska A, Rushlow C, Roth S. The role of brinker in mediating the graded response to Dpp in early Drosophila embryos. Development. 1999;126:3323–34.PubMedGoogle Scholar
- Togawa T, Dunn WA, Emmons AC, Nagao J, Willis JH. Developmental expression patterns of cuticular protein genes with the R&R Consensus from Anopheles gambiae. Insect Biochem Mol Biol. 2008;38:508–19.PubMedPubMed CentralGoogle Scholar
- Karr TL: Fruit flies and the sperm proteome. Hum Mol Genet 2007, 16 Spec No. 2:R124–R133.Google Scholar
- Waterhouse RM, Seppey M, Simao FA, Manni M, Ioannidis P, Klioutchnikov G, Kriventseva EV, Zdobnov EM. BUSCO applications from quality assessments to gene prediction and phylogenomics. Mol Biol Evol. 2017;35:543–8.Google Scholar
- Shigenobu S, Bickel RD, Brisson JA, Butts T, Chang CC, Christiaens O, Davis GK, Duncan EJ, Ferrier DE, Iga M, et al. Comprehensive survey of developmental genes in the pea aphid, Acyrthosiphon pisum: frequent lineage-specific duplications and losses of developmental genes. Insect Mol Biol. 2010;19(Suppl 2):47–62.PubMedGoogle Scholar
- Kriventseva EV, Tegenfeldt F, Petty TJ, Waterhouse RM, Simão FA, Pozdnyakov IA, Ioannidis P, Zdobnov EM. OrthoDB v8: update of the hierarchical catalog of orthologs and the underlying free software. Nucl Acids Res. 2015;43:D250–6.PubMedGoogle Scholar
- Bansal R, Michel AP. Core RNAi machinery and Sid1, a component for systemic RNAi, in the hemipteran insect, Aphis glycines. Int J Mol Sci. 2013;14:3786–801.PubMedPubMed CentralGoogle Scholar
- Bao R, Fischer T, Bolognesi R, Brown SJ, Friedrich M. Parallel duplication and partial subfunctionalization of beta-catenin/armadillo during insect evolution. Mol Biol Evol. 2012;29:647–62.PubMedGoogle Scholar
- Sachs L, Chen YT, Drechsler A, Lynch JA, Panfilio KA, Lassig M, Berg J, Roth S. Dynamic BMP signaling polarized by Toll patterns the dorsoventral axis in a hemimetabolous insect. eLife. 2015;4:e05502.PubMedPubMed CentralGoogle Scholar
- Armisen D, Refki PN, Crumiere AJ, Viala S, Toubiana W, Khila A. Predator strike shapes antipredator phenotype through new genetic interactions in water striders. Nat Commun. 2015;6:8153.PubMedPubMed CentralGoogle Scholar
- Konopova B, Smykal V, Jindra M. Common and distinct roles of juvenile hormone signaling genes in metamorphosis of holometabolous and hemimetabolous insects. PLoS One. 2011;6:e28728.PubMedPubMed CentralGoogle Scholar
- Vellichirammal NN, Gupta P, Hall TA, Brisson JA. Ecdysone signaling underlies the pea aphid transgenerational wing polyphenism. Proc Natl Acad Sci U S A. 2017;114:1419–23.PubMedPubMed CentralGoogle Scholar
- Wulff JP, Sierra I, Sterkel M, Holtof M, Van Wielendaele P, Francini F, Broeck JV, Ons S. Orcokinin neuropeptides regulate ecdysis in the hemimetabolous insect Rhodnius prolixus. Insect Biochem Mol Biol. 2017;81:91–102.Google Scholar
- Chiu TL, Wen Z, Rupasinghe SG, Schuler MA. Comparative molecular modeling of Anopheles gambiae CYP6Z1, a mosquito P450 capable of metabolizing DDT. Proc Natl Acad Sci U S A. 2008;105:8855–60.Google Scholar
- Gong Y, Li T, Feng Y, Liu N. The function of two P450s, CYP9M10 and CYP6AA7, in the permethrin resistance of Culex quinquefasciatus. Sci Rep. 2017;7:587.Google Scholar
- Liu PZ, Kaufman TC. hunchback is required for suppression of abdominal identity, and for proper germband growth and segmentation in the intermediate germband insect Oncopeltus fasciatus. Development. 2004;131:1515–27.PubMedGoogle Scholar
- Schaeper ND, Pechmann M, Damen WGM, Prpic N-M, Wimmer EA. Evolutionary plasticity of collier function in head development of diverse arthropods. Dev Biol. 2010;344:363–76.PubMedGoogle Scholar
- Aspiras AC, Smith FW, Angelini DR. Sex-specific gene interactions in the patterning of insect genitalia. Dev Biol. 2011;360:369–80.PubMedGoogle Scholar
- Weirauch MT, Yang A, Albu M, Cote AG, Montenegro-Montero A, Drewe P, Najafabadi HS, Lambert SA, Mann I, Cook K, et al. Determination and inference of eukaryotic transcription factor sequence specificity. Cell. 2014;158:1431–43.PubMedPubMed CentralGoogle Scholar
- Peel AD, Telford MJ, Akam M. The evolution of hexapod engrailed-family genes: evidence for conservation and concerted evolution. Proc Biol Sci. 2006;273:1733–42.PubMedPubMed CentralGoogle Scholar
- Ben-David J, Chipman AD. Mutual regulatory interactions of the trunk gap genes during blastoderm patterning in the hemipteran Oncopeltus fasciatus. Dev Biol. 2010;346:140–9.PubMedGoogle Scholar
- Erezyilmaz DF, Kelstrup HC, Riddiford LM. The nuclear receptor E75A has a novel pair-rule-like function in patterning the milkweed bug, Oncopeltus fasciatus. Dev Biol. 2009;334:300–10.PubMedPubMed CentralGoogle Scholar
- Liu PZ, Kaufman TC. even-skipped is not a pair-rule gene but has segmental and gap-like functions in Oncopeltus fasciatus, an intermediate germband insect. Development. 2005;132:2081–92.PubMedGoogle Scholar
- Weisbrod A, Cohen M, Chipman AD. Evolution of the insect terminal patterning system--insights from the milkweed bug, Oncopeltus fasciatus. Dev Biol. 2013;380:125–31.PubMedGoogle Scholar
- Albertin CB, Simakov O, Mitros T, Wang ZY, Pungor JR, Edsinger-Gonzales E, Brenner S, Ragsdale CW, Rokhsar DS. The octopus genome and the evolution of cephalopod neural and morphological novelties. Nature. 2015;524:220–4.PubMedPubMed CentralGoogle Scholar
- Crooks GE, Hon G, Chandonia J-M, Brenner SE. WebLogo: a sequence logo generator. Genome Res. 2004;14:1188–90.PubMedPubMed CentralGoogle Scholar
- Najafabadi HS, Mnaimneh S, Schmitges FW, Garton M, Lam KN, Yang A, Albu M, Weirauch MT, Radovani E, Kim PM, et al. C2H2 zinc finger proteins greatly expand the human regulatory lexicon. Nat Biotechnol. 2015;33:555–62.PubMedGoogle Scholar
- Emerson RO, Thomas JH. Adaptive evolution in zinc finger transcription factors. PLoS Genet. 2009;5:e1000325.PubMedPubMed CentralGoogle Scholar
- Thomas JH, Schneider S. Coevolution of retroelements and tandem zinc finger genes. Genome Res. 2011;21:1800–12.PubMedPubMed CentralGoogle Scholar
- Garcia-Perez JL, Widmann TJ, Adams IR. The impact of transposable elements on mammalian development. Development. 2016;143:4101–14.PubMedPubMed CentralGoogle Scholar
- Liu PZ, Kaufman TC. Krüppel is a gap gene in the intermediate insect Oncopeltus fasciatus and is required for development of both blastoderm and germband-derived segments. Development. 2004;131:4567–79.PubMedGoogle Scholar
- Heger P, Marin B, Bartkuhn M, Schierenberg E, Wiehe T. The chromatin insulator CTCF and the emergence of metazoan diversity. Proc Natl Acad Sci U S A. 2012;109:17507–12.PubMedPubMed CentralGoogle Scholar
- Liu H, Chang L-H, Sun Y, Lu X, Stubbs L. Deep vertebrate roots for mammalian zinc finger transcription factor subfamilies. Genome Biol Evol. 2014;6:510–25.PubMedPubMed CentralGoogle Scholar
- Imbeault M, Helleboid P-Y, Trono D. KRAB zinc-finger proteins contribute to the evolution of gene regulatory networks. Nature. 2017;543:550–4.PubMedGoogle Scholar
- Csuros M, Rogozin IB, Koonin EV. A detailed history of intron-rich eukaryotic ancestors inferred from a global survey of 100 complete genomes. PLoS Comput Biol. 2011;7:e1002150.PubMedPubMed CentralGoogle Scholar
- Hoy MA, Waterhouse RM, Wu K, Estep AS, Ioannidis P, Palmer WJ, Pomerantz AF, Simao FA, Thomas J, Jiggins FM, et al. Genome sequencing of the phytoseiid predatory mite Metaseiulus occidentalis reveals completely atomized Hox genes and superdynamic intron evolution. Genome Biol Evol. 2016;8:1762–75.PubMedPubMed CentralGoogle Scholar
- Seibt KM, Wenke T, Muders K, Truberg B, Schmidt T. Short interspersed nuclear elements (SINEs) are abundant in Solanaceae and have a family-specific impact on gene structure and genome organization. Plant J. 2016;86:268–85.PubMedGoogle Scholar
- Huff JT, Zilberman D, Roy SW. Mechanism for DNA transposons to generate introns on genomic scales. Nature. 2016;538:533–6.PubMedPubMed CentralGoogle Scholar
- Wheeler D, Redding AJ, Werren JH. Characterization of an ancient lepidopteran lateral gene transfer. PLoS One. 2012;8:e59262.Google Scholar
- Da Lage JL, Binder M, Hua-Van A, Janecek S, Casane D. Gene make-up: rapid and massive intron gains after horizontal transfer of a bacterial alpha-amylase gene to Basidiomycetes. BMC Evol Biol. 2013;13:40.PubMedPubMed CentralGoogle Scholar
- Lee DH, Short BD, Joseph SV, Bergh JC, Leskey TC. Review of the biology, ecology, and management of Halyomorpha halys (Hemiptera: Pentatomidae) in China, Japan, and the Republic of Korea. Environ Entomol. 2013;42:627–41.PubMedGoogle Scholar
- Lawrence PA. Cellular differentiation and pattern formation during metamorphosis of the milkweed bug Oncopeltus. Dev Biol. 1969;19:12–40.Google Scholar
- Riddiford LM. Prevention of metamorphosis by exposure of insect eggs to juvenile hormone analogs. Science. 1970;167:287.PubMedGoogle Scholar
- Willis JH, Lawrence PA. Deferred action of juvenile hormone. Nature. 1970;225:81–3.Google Scholar
- Masner P, Bowers WS, Kalin M, Muhle T. Effect of precocene II on the endocrine regulation of development and reproduction in the bug, Oncopeltus fasciatus. Gen Comp Endocrinol. 1979;37:156–66.PubMedGoogle Scholar
- Rewitz K, O’Connor M, Gilbert L. Molecular evolution of the insect Halloween family of cytochrome P450s: phylogeny, gene organization and functional conservation. Insect Biochem Mol Biol. 2007;37:741–53.PubMedGoogle Scholar
- Huet F, Ruiz C, Richards G. Sequential gene activation by ecdysone in Drosophila melanogaster: the hierarchical equivalence of early and early late genes. Development. 1995;121:1195–204.Google Scholar
- Bialecki M, Shilton A, Fichtenberg C, Segraves WA, Thummel CS. Loss of the ecdysteroid-inducible E75A orphan nuclear receptor uncouples molting from metamorphosis in Drosophila. Dev Cell. 2002;3:209–20.Google Scholar
- Charles JP, Iwema T, Epa VC, Takaki K, Rynes J, Jindra M. Ligand-binding properties of a juvenile hormone receptor, methoprene-tolerant. Proc Natl Acad Sci U S A. 2011;108:21128–33.PubMedPubMed CentralGoogle Scholar
- Minakuchi C, Zhou X, Riddiford L. Kruppel homolog 1 (Kr-h1) mediates juvenile hormone action during metamorphosis of Drosophila melanogaster. Mech Dev. 2008;125:91–105.Google Scholar
- Minakuchi C, Namiki T, Shinoda T. Kruppel homolog 1, an early juvenile hormone-response gene downstream of methoprene-tolerant, mediates its anti-metamorphic action in the red flour beetle Tribolium castaneum. Dev Biol. 2009;352:341–50.Google Scholar
- DiBello PR, Withers DA, Bayer CA, Fristrom JW, Guild GM. The Drosophila Broad-Complex encodes a family of related proteins containing zinc fingers. Genetics. 1991;129:385–97.Google Scholar
- Karim F, Guild G, Thummel C. The Drosophila Broad-Complex plays a key role in controlling ecdysone-regulated gene expression at the onset of metamorphosis. Development. 1993;118:977–88.Google Scholar
- Erezyilmaz DF, Riddiford LM, Truman JW. The pupal specifier broad directs progressive morphogenesis in a direct-developing insect. Proc Natl Acad Sci U S A. 2006;103:6925–30.PubMedPubMed CentralGoogle Scholar
- Arakane Y, Hogenkamp DG, Zhu YC, Kramer KJ, Specht CA, Beeman RW, Kanost MR, Muthukrishnan S. Characterization of two chitin synthase genes of the red flour beetle, Tribolium castaneum, and alternate exon usage in one of the genes during development. Insect Biochem Mol Biol. 2004;34:291–304.PubMedGoogle Scholar
- True JR. Insect melanism: the molecules matter. Trends Ecol Evol. 2003;18:640–7.Google Scholar
- Zhan SA, Guo QH, Li MH, Li MW, Li JY, Miao XX, Huang YP. Disruption of an N-acetyltransferase gene in the silkworm reveals a novel role in pigmentation. Development. 2010;137:4083–90.PubMedGoogle Scholar
- Liu J, Lemonds TR, Popadic A. The genetic control of aposematic black pigmentation in hemimetabolous insects: insights from Oncopeltus fasciatus. Evol Dev. 2014;16:270–7.Google Scholar
- Liu J, Lemonds TR, Marden JH, Popadic A. A pathway analysis of melanin patterning in a hemimetabolous insect. Genetics. 2016;203:403–13.PubMedPubMed CentralGoogle Scholar
- Lawrence PA. Some new mutants of large milkweed bug Oncopeltus fasciatus Dall. Genet Res. 1970;15:347–50.Google Scholar
- Morgan ED. Biosynthesis in insects: advanced edition. London: Royal Society of Chemistry; 2010.Google Scholar
- McLean JR, Krishnakumar S, O’Donnell JM. Multiple mRNAs from the Punch locus of Drosophila melanogaster encode isoforms of GTP cyclohydrolase I with distinct N-terminal domains. J Biol Chem. 1993;268:27191–7.Google Scholar
- Wiederrecht GJ, Paton DR, Brown GM. Enzymatic conversion of Dihydroneopterin triphosphate to the pyrimidodiazepine intermediate involved in the biosynthesis of the Drosopterins in Drosophila melanogaster. J Biol Chem. 1984;259:2195–200.Google Scholar
- Newcombe D, Blount JD, Mitchell C, Moore AJ. Chemical egg defence in the large milkweed bug, Oncopeltus fasciatus, derives from maternal but not paternal diet. Entomologia Experimentalis et Applicata. 2013;149:197–205.Google Scholar
- Zhan S, Merlin C, Boore JL, Reppert SM. The monarch butterfly genome yields insights into long-distance migration. Cell. 2011;147:1171–85.PubMedPubMed CentralGoogle Scholar
- Robertson HM, Warr CG, Carlson JR. Molecular evolution of the insect chemoreceptor gene superfamily in Drosophila melanogaster. Proc Natl Acad Sci U S A. 2003;100(Suppl 2):14537–42.Google Scholar
- Joseph RM, Carlson JR. Drosophila chemoreceptors: a molecular interface between the chemical world and the brain. Trends Genet. 2015;31:683–95.Google Scholar
- Benton R. Multigene family evolution: perspectives from insect chemoreceptors. Trends Ecol Evol. 2015;30:590–600.PubMedGoogle Scholar
- Rytz R, Croset V, Benton R. Ionotropic receptors (IRs): chemosensory ionotropic glutamate receptors in Drosophila and beyond. Insect Biochem Mol Biol. 2013;43:888–97.Google Scholar
- Kirkness EF, Haas BJ, Sun W, Braig HR, Perotti MA, Clark JM, Lee SH, Robertson HM, Kennedy RC, Elhaik E, et al. Genome sequences of the human body louse and its primary endosymbiont provide insights into the permanent parasitic lifestyle. Proc Natl Acad Sci U S A. 2010;107:12168–73.PubMedPubMed CentralGoogle Scholar
- Smadja C, Shi P, Butlin RK, Robertson HM. Large gene family expansions and adaptive evolution for odorant and gustatory receptors in the pea aphid, Acyrthosiphon pisum. Mol Biol Evol. 2009;26:2073–86.Google Scholar
- Terrapon N, Li C, Robertson HM, Ji L, Meng X, Booth W, Chen Z, Childers CP, Glastad KM, Gokhale K, et al. Molecular traces of alternative social organization in a termite genome. Nat Commun. 2014;5:3636.PubMedGoogle Scholar
- Xu W, Papanicolaou A, Zhang HJ, Anderson A. Expansion of a bitter taste receptor family in a polyphagous insect herbivore. Sci Rep. 2016;6:23666.PubMedPubMed CentralGoogle Scholar
- Feir D. Oncopeltus fasciatus: a research animal. Annu Rev Entomol. 1974;19:81–96.Google Scholar
- Vellozo AF, Véron AS, Baa-Puyoulet P, Huerta-Cepas J, Cottret L, Febvay G, Calevro F, Rahbe Y, Douglas AE, Gabaldón T, et al. CycADS: an annotation database system to ease the development and update of BioCyc databases. Database. 2011;2011:bar008.PubMedPubMed CentralGoogle Scholar
- Baa-Puyoulet P, Parisot N, Febvay G, Huerta-Cepas J, Vellozo AF, Gabaldón T, Calevro F, Charles H, Colella S. ArthropodaCyc: a CycADS powered collection of BioCyc databases to analyse and compare metabolism of arthropods. Database (Oxford). 2016;2016:baw081.Google Scholar
- Hojilla-Evangelista MP, Evangelista RL. Characterization of milkweed (Asclepias spp.) seed proteins. Ind Crops Prod. 2009;29: 275–80.Google Scholar
- Dean CAE, Teets NM, Koštál V, Šimek P, Denlinger DL. Enhanced stress responses and metabolic adjustments linked to diapause and onset of migration in the large milkweed bug Oncopeltus fasciatus. Physiol Entomol. 2016;41:152-61.Google Scholar
- Rabatel A, Febvay G, Gaget K, Duport G, Baa-Puyoulet P, Sapountzis P, Bendridi N, Rey M, Rahbé Y, Charles H, et al. Tyrosine pathway regulation is host-mediated in the pea aphid symbiosis during late embryonic and early larval development. BMC Genomics. 2013;14:235.PubMedPubMed CentralGoogle Scholar
- Dobler S, Petschenka G, Wagschal V, Flacht L. Convergent adaptive evolution – how insects master the challenge of cardiac glycoside-containing host plants. Entomologia Experimentalis et Applicata. 2015;157:30–9.Google Scholar
- Grau-Bove X, Ruiz-Trillo I, Irimia M. Origin of exon skipping-rich transcriptomes in animals driven by evolution of gene architecture. Genome Biol. 2018;19:135.PubMedPubMed CentralGoogle Scholar
- Niehuis O, Gibson JD, Rosenberg MS, Pannebakker BA, Koevoets T, Judson AK, Desjardins CA, Kennedy K, Duggan D, Beukeboom LW, et al. Recombination and its impact on the genome of the haplodiploid parasitoid wasp Nasonia. PLoS One. 2010;5:e8597.Google Scholar
- Ferrero A, Torreblanca A, Garcera MD. Assessment of the effects of orally administered ferrous sulfate on Oncopeltus fasciatus (Heteroptera: Lygaeidae). Environ Sci Pollut Res Int. 2017;24:8551–61.Google Scholar
- Hare EE, Johnston JS. Genome size determination using flow cytometry of propidium iodide-stained nuclei. Methods Mol Biol. 2011;772:3–12.PubMedGoogle Scholar
- Marcais G, Kingsford C. A fast, lock-free approach for efficient parallel counting of occurrences of k-mers. Bioinformatics. 2011;27:764–70.PubMedPubMed CentralGoogle Scholar
- Bushnell B. BBMap short read aligner; 2016.Google Scholar
- Gnerre S, Maccallum I, Przybylski D, Ribeiro FJ, Burton JN, Walker BJ, Sharpe T, Hall G, Shea TP, Sykes S, et al. High-quality draft assemblies of mammalian genomes from massively parallel sequence data. Proc Natl Acad Sci U S A. 2011;108:1513–8.PubMedGoogle Scholar
- Holt C, Yandell M. MAKER2: an annotation pipeline and genome-database management tool for second-generation genome projects. BMC Bioinformatics. 2011;12:491.PubMedPubMed CentralGoogle Scholar
- Lee E, Helt G, Reese J, Munoz-Torres M, Childers C, Buels R, Stein L, Holmes I, Elsik C, Lewis S. Web Apollo: a web-based genomic annotation editing platform. Genome Biol. 2013;14:R93.Google Scholar
- Poelchau M, Childers C, Moore G, Tsavatapalli V, Evans J, Lee CY, Lin H, Lin JW, Hackett K. The i5k Workspace@NAL--enabling genomic data access, visualization and curation of arthropod genomes. Nucleic Acids Res. 2015;43:D714–9.PubMedGoogle Scholar
- Murali SC, The i5k genome assembly team (29 additional authors), Han Y, Richards S, Worley K, Muzny D, Gibbs R, Koelzer S, Panfilio KA: Oncopeltus fasciatus genome assembly 1.0. Ag Data Commons (Database) 2015:https://doi.org/10.15482/USDA.ADC/1173238.
- Hughes DST, Koelzer S, Panfilio KA, Richards S: Oncopeltus fasciatus genome annotations v0.5.3. Ag Data Commons (Database) 2015:https://doi.org/10.15482/USDA.ADC/1173237.
- Vargas Jentzsch IM, Hughes DST, Poelchau M, Robertson HM, Benoit JB, Rosendale AJ, Armisén D, Duncan EJ, Vreede BMI, Jacobs CGC, et al: Oncopeltus fasciatus Official Gene Set v1.1. Ag Data Commons (Database) 2015:https://doi.org/10.15482/USDA.ADC/1173142.
- Vargas Jentzsch IM, Kovacova V, Stueber K, Koelzer S, Panfilio KA: Oncopeltus fasciatus hybrid genome assembly 1.0. Ag Data Commons (Database) 2019:https://doi.org/10.15482/USDA.ADC/1503405.
- RepeatModeler Open-1.0.8 [http://www.repeatmasker.org]. Accessed 5 June 2015.
- Bao Z, Eddy SR. Automated de novo identification of repeat sequence families in sequenced genomes. Genome Res. 2002;12:1269–76.PubMedPubMed CentralGoogle Scholar
- Price AL, Jones NC, Pevzner PA. De novo identification of repeat families in large genomes. Bioinformatics. 2005;21(Suppl 1):i351–8.PubMedGoogle Scholar
- Benson G. Tandem repeats finder: a program to analyze DNA sequences. Nucleic Acids Res. 1999;27:573–80.PubMedPubMed CentralGoogle Scholar
- RepeatMasker Open-4.0. [http://www.repeatmasker.org]. Accessed 5 June 2015.
- Langmead B, Salzberg SL. Fast gapped-read alignment with Bowtie 2. Nat Methods. 2012;9:357–9.PubMedPubMed CentralGoogle Scholar
- Goff S, Vaughn M, McKay S, Lyons E, Stapleton A, Gessler D, Matasci N, Wang L, Hanlon M, Lenards A, et al. The iPlant Collaborative: cyberinfrastructure for plant biology. Front Plant Sci. 2011;2:34.Google Scholar
- Grabherr MG, Haas BJ, Yassour M, Levin JZ, Thompson DA, Amit I, Adiconis X, Fan L, Raychowdhury R, Zeng Q, et al. Full-length transcriptome assembly from RNA-Seq data without a reference genome. Nat Biotechnol. 2011;29:644–52.PubMedPubMed CentralGoogle Scholar
- Haas BJ, Papanicolaou A, Yassour M, Grabherr M, Blood PD, Bowden J, Couger MB, Eccles D, Li B, Lieber M, et al. De novo transcript sequence reconstruction from RNA-seq using the Trinity platform for reference generation and analysis. Nat Protoc. 2013;8:1494–512.Google Scholar
- Li B, Dewey CN. RSEM: accurate transcript quantification from RNA-Seq data with or without a reference genome. BMC Bioinformatics. 2011;12:323.PubMedPubMed CentralGoogle Scholar
- Schoville SD, Chen YH, Andersson MN, Benoit JB, Bhandari A, Bowsher JH, Brevik K, Cappelle K, Chen MM, Childers AK, et al. A model species for agricultural pest genomics: the genome of the Colorado potato beetle, Leptinotarsa decemlineata (Coleoptera: Chrysomelidae). Sci Rep. 2018;8:1931.PubMedPubMed CentralGoogle Scholar
- Scolari F, Benoit JB, Michalkova V, Aksoy E, Takac P, Abd-Alla AM, Malacrida AR, Aksoy S, Attardo GM. The spermatophore in Glossina morsitans morsitans: insights into male contributions to reproduction. Sci Rep. 2016;6:20334.PubMedPubMed CentralGoogle Scholar
- Simão FA, Waterhouse RM, Ioannidis P, Kriventseva EV, Zdobnov EM. BUSCO: assessing genome assembly and annotation completeness with single-copy orthologs. Bioinformatics. 2015;31:3210–2.PubMedGoogle Scholar
- Finn RD, Mistry J, Tate J, Coggill P, Heger A, Pollington JE, Gavin OL, Gunasekaran P, Ceric G, Forslund K, et al. The Pfam protein families database. Nucleic Acids Res. 2010;38:D211–22.PubMedGoogle Scholar
- Weirauch MT, Hughes TR. A catalogue of eukaryotic transcription factor types, their evolutionary origin, and species distribution. Subcell Biochem. 2011;52:25–73.PubMedGoogle Scholar
- Eddy SR. A new generation of homology search tools based on probabilistic inference. Genome Inform. 2009;23:205–11.Google Scholar
- Sievers F, Wilm A, Dineen D, Gibson TJ, Karplus K, Li W, Lopez R, McWilliam H, Remmert M, Soding J, et al. Fast, scalable generation of high-quality protein multiple sequence alignments using Clustal Omega. Mol Syst Biol. 2011;7:539.PubMedPubMed CentralGoogle Scholar
- Moriya Y, Itoh M, Okuda S, Yoshizawa AC, Kanehisa M. KAAS: an automatic genome annotation and pathway reconstruction server. Nucleic Acids Res. 2007;35:W182–5.PubMedPubMed CentralGoogle Scholar
- Claudel-Renard C, Chevalet C, Faraut T, Kahn D. Enzyme-specific profiles for genome annotation: PRIAM. Nucleic Acids Res. 2003;31:6633–9.PubMedPubMed CentralGoogle Scholar
- Conesa A, Götz S. Blast2GO: a comprehensive suite for functional analysis in plant genomics. Int J Plant Genomics. 2008;2008:619832.PubMedGoogle Scholar
- Conesa A, Götz S, García-Gómez JM, Terol J, Talón M, Robles M. Blast2GO: a universal tool for annotation, visualization and analysis in functional genomics research. Bioinformatics (Oxford, England). 2005;21:3674–6.Google Scholar
- Jones P, Binns D, Chang H-Y, Fraser M, Li W, McAnulla C, McWilliam H, Maslen J, Mitchell A, Nuka G, et al. InterProScan 5: genome-scale protein function classification. Bioinformatics. 2014;30:1236–40.PubMedPubMed CentralGoogle Scholar
- Karp PD, Ouzounis CA, Moore-Kochlacs C, Goldovsky L, Kaipa P, Ahrén D, Tsoka S, Darzentas N, Kunin V, López-Bigas N. Expansion of the BioCyc collection of pathway/genome databases to 160 genomes. Nucleic Acids Res. 2005;33:6083–9.PubMedPubMed CentralGoogle Scholar
- Karp PD, Paley SM, Krummenacker M, Latendresse M, Dale JM, Lee TJ, Kaipa P, Gilham F, Spaulding A, Popescu L, et al. Pathway Tools version 13.0: integrated software for pathway/genome informatics and systems biology. Brief Bioinform. 2010;11:40–79.PubMedGoogle Scholar
- Dereeper A, Guignon V, Blanc G, Audic S, Buffet S, Chevenet F, Dufayard JF, Guindon S, Lefort V, Lescot M, et al. Phylogeny.fr: robust phylogenetic analysis for the non-specialist. Nucleic Acids Res. 2008;36:W465–9.PubMedPubMed CentralGoogle Scholar
- Wang X, Fang X, Yang P, Jiang X, Jiang F, Zhao D, Li B, Cui F, Wei J, Ma C, et al. The locust genome provides insight into swarm formation and long-distance flight. Nat Commun. 2014;5:2957.PubMedPubMed CentralGoogle Scholar
- The International Silkworm Genome Consortium. The genome of a lepidopteran model insect, the silkworm Bombyx mori. Insect Biochem Mol Biol. 2008;38:1036–45.Google Scholar
- Elsik CG, Worley KC, Bennett AK, Beye M, Camara F, Childers CP, de Graaf DC, Debyser G, Deng J, Devreese B, et al. Finding the missing honey bee genes: lessons learned from a genome upgrade. BMC Genomics. 2014;15:86.PubMedPubMed CentralGoogle Scholar
- Honeybee Genome Sequencing Consortium. Insights into social insects from the genome of the honeybee Apis mellifera. Nature. 2006;443:931–49.Google Scholar
- Richards S, Gibbs RA, Weinstock GM, Brown SJ, Denell R, Beeman RW, Gibbs R, Bucher G, Friedrich M, Grimmelikhuijzen CJ, et al. The genome of the model beetle and pest Tribolium castaneum. Nature. 2008;452:949–55.Google Scholar
- Chen XG, Jiang X, Gu J, Xu M, Wu Y, Deng Y, Zhang C, Bonizzoni M, Dermauw W, Vontas J, et al. Genome sequence of the Asian tiger mosquito, Aedes albopictus, reveals insights into its biology, genetics, and evolution. Proc Natl Acad Sci U S A. 2015;112:E5907–15.PubMedPubMed CentralGoogle Scholar
- Armisen D, Rajakumar R, Friedrich M, Benoit JB, Robertson HM, Panfilio KA, Ahn S-J, Poelchau MF, Chao H, Dinh H, et al: The genome of the water strider Gerris buenoi reveals expansions of gene repertoires associated with adaptations to life on the water. BMC Genomics in press:acceptance e-mail 12 Oct. 2018.Google Scholar
- Papanicolaou A, Schetelig MF, Arensburger P, Atkinson PW, Benoit JB, Bourtzis K, Castañera P, Cavanaugh JP, Chao H, Childers C, et al. The whole genome sequence of the Mediterranean fruit fly, Ceratitis capitata (Wiedemann), reveals insights into the biology and adaptive evolution of a highly invasive pest species. Genome Biol. 2016;17:192.PubMedPubMed CentralGoogle Scholar
- Ellis LL, Huang W, Quinn AM, Ahuja A, Alfrejd B, Gomez FE, Hjelmen CE, Moore KL, Mackay TF, Johnston JS, Tarone AM. Intrapopulation genome size variation in D. melanogaster reflects life history variation and plasticity. PLoS Genet. 2014;10:e1004522.Google Scholar
- Kumar S, Stecher G, Tamura K. MEGA7: molecular evolutionary genetics analysis version 7.0 for bigger datasets. Mol Biol Evol. 2016;33:1870–4.PubMedPubMed CentralGoogle Scholar
- Croset V, Rytz R, Cummins SF, Budd A, Brawand D, Kaessmann H, Gibson TJ, Benton R. Ancient protostome origin of chemosensory ionotropic glutamate receptors and the evolution of insect taste and olfaction. PLoS Genet. 2010;6:e1001064.PubMedPubMed CentralGoogle Scholar