The promise of genomics in the study of plant-pollinator interactions
© BioMed Central Ltd 2013
Published: 24 June 2013
Flowers exist in exceedingly complex fitness landscapes, in which subtle variation in each trait can affect the pollinators, herbivores and pleiotropically linked traits in other plant tissues. A whole-genome approach to flower evolution will help our understanding of plant-pollinator interactions.
Plants are immobile, and therefore they cannot choose their own sex partners. To cope with this handicap, several lineages of plants have invented a trick to facilitate the directed transfer of gametes: harnessing animals as go-betweens. This is known in some gymnosperms and even mosses [1–4], but it is in the angiosperms where animal pollination is most commonly encountered and where, in many cases, it has evolved exquisite complexity. This means that the morphology of angiosperm sex organs (flowers), rather than fitting those of the opposite sex, must generate a lock-and-key fit with the animals that visit them. These visitors do not normally come for sex; instead they are paid for their services, typically by means of sugary nectar or surplus pollen. Plants advertise these rewards with showy displays to assist pollinators in finding them [5, 6].
However, there are peculiar challenges that come with such an unusual sex life. Many animals, such as bees, butterflies, flies, birds and bats, might be opportunistically interested in nectar carbohydrates, and plants cannot know or see which ones are in the vicinity, nor can they accept or reject a visitor. They can only offer their commodities to a diverse army of potential visitors from where they stand, and try their luck. Flower structures can be used to limit the type of visitor to some extent [7, 8], but such limitation is typically not absolute. In addition, flowers can attempt to obtain some pollinator specificity by using advertising signals that appeal only to certain pollinators and not others. Colors, patterns, scents and even acoustic or electrostatic cues are all known to affect the behavior of different pollinators in different ways [10–18]. The most important factor that determines a pollinator's preference is often individual experience rather than innate predisposition [5, 6]. Current genomic approaches will help verify this supposition, which, if true, should in some cases reveal relatively weak correlations between flower genotype and pollinator affinities. In terms of pollinators, a genomic approach can reveal the factors that enable flower visitors to be generalists, such as the number and diversity of olfactory receptor genes, or genes for learning and memory.
A further challenge for plants is that some floral traits that are attractive to pollinators can also be of interest to herbivores, and in some cases, flowering plants may have the dual problem of attracting pollinators while deterring the same species in larval stages. Finally, many flower traits are subject to extensive pleiotropies; for example, pigments that contribute to flower coloration are also used in other parts of the plant where they can have multiple important functions [22, 23]. Flowers thus exist in exceedingly complex fitness landscapes, in which a large number of traits might all affect individual success, as well as the dynamics of speciation and plant-pollinator coevolution [24, 25]. In this sense a genomic approach to understanding floral evolution has tremendous potential to move from the traditional question 'is this gene important?' (which carries the risk of generating a self-fulfilling prophecy) to a data-driven approach asking which genes and in which combinations are important and in what ways [26, 27].
Paramount among these latter approaches are genome-wide association studies (GWASs) and genomic selection (GS), which work to find correlations between phenotypic traits and genetic markers in linkage disequilibrium with them (such as single nucleotide polymorphisms (SNPs) and restriction-site-associated DNA (RAD) sequences). These approaches are being applied in both plant and animal breeding (for example, in maize [28, 29], oil seed rape and cows). A GWAS compared 107 distinct phenotypes of Arabidopsis thaliana, revealing alleles of major effect for follow-up research. Although A. thaliana is primarily self-pollinating, comparisons between self- and insect-pollinated plants using these techniques, particularly when the plants are closely related, may reveal the genomic architecture important to plant-animal interactions. Several approaches in use for assessing genomic variability do not require de novo genome assembly; for example, sequenced restriction-site-associated DNA (sRAD) uses next-generation sequencing (NGS) approaches to find markers (frequently SNPs) for genotyping [33, 34], and RNA-seq (alternatively called whole-transcriptome shotgun sequencing) can be used to assess transcription profiles across tissues or life stages. These approaches, whether applied to expressed gene regions or whole genomes, can accurately quantify genomic variation and help us to understand how much variation is available for selection to act on. The results may also have implications for conservation genetics, such as the question of which population is more variable and thus contributes most to the preservation of diversity.
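The core idea of a marker-trait association scan can be sketched in a few lines. The example below is a deliberately simplified illustration, not the method of any study cited here: it ranks markers by the squared correlation between allele counts and a trait, and it ignores the corrections for population structure and multiple testing that a real GWAS would need. The marker names, genotypes and trait values are made up.

```python
from statistics import mean


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0  # monomorphic marker or constant trait: no signal
    return sxy / (sxx * syy) ** 0.5


def association_scan(genotypes, phenotypes):
    """Return (marker, r squared) pairs, strongest association first.

    genotypes: dict of marker name -> allele counts (0, 1 or 2) per plant
    phenotypes: trait values, one per plant, in the same order
    """
    scores = [(m, pearson_r(g, phenotypes) ** 2) for m, g in genotypes.items()]
    return sorted(scores, key=lambda s: s[1], reverse=True)


# Toy, made-up data: six plants scored at three hypothetical SNP markers.
genotypes = {
    "snp_a": [0, 0, 1, 1, 2, 2],
    "snp_b": [0, 1, 0, 1, 0, 1],
    "snp_c": [1, 1, 1, 1, 1, 1],  # monomorphic, so uninformative
}
corolla_length = [10.0, 11.0, 14.0, 15.0, 19.0, 20.0]

ranked = association_scan(genotypes, corolla_length)
for marker, r2 in ranked:
    print(f"{marker}: r^2 = {r2:.3f}")
```

In this toy data set the hypothetical marker snp_a tracks the trait closely, snp_b barely at all, and the monomorphic snp_c carries no information.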
We are still at the early stages of studying the interactions between plants and their pollinators using genomics-based approaches. In plant biology, most resources were initially devoted to sequencing genomes of autogamous (self-fertilizing) model species (the first being A. thaliana, with a small genome size ) or commercially important crop plants, some of which are not insect-pollinated [36, 37]. Similarly, resources used to study insects were initially focused on species with pathogenic impacts on humans or crops, or those serving as model species. For the most part, species interactions between plants and animals have remained the domain of traditional genetics and quantitative trait mapping in combination with experiments in animal behavior. However, the genomes of several animal-pollinated plants, including some that have been models in evolutionary ecology, are now available or in progress, such as the bee-pollinated tomato Solanum sp. [38, 39], the morning glory Ipomoea sp. , Petunia sp. , and Mimulus sp. . These genera contain species with highly diverse ecology, and are pollinated by birds, bees and moths , which should therefore enable comprehensive insights into the genetic architecture underpinning floral traits that address different types of pollinators.
Today, the variety of tools available for genome-level analysis and the decreasing costs of NGS technologies open the door for the large-scale analysis of non-model organisms. The 5,000 insect genome project and the 1000 plant transcriptome project may have great potential for the understanding of pollinator-plant evolution. The scale of these projects reflects the orders of magnitude reduction in costs of sequencing, and they include the majority of model and non-model organisms of active interest around the world, including representatives from more than half (over 250) of all angiosperm plant families, and multiple representatives from all (approximately 30) major insect orders. These projects will enable questions to be addressed that cannot be answered in studies targeted at individual species, and they have the potential for unprecedented insights into the phylogeny, evolution and patterns of diversification of angiosperms and their pollinators. For example, what is the nature of genetic change in plant speciation? What is the relationship between prezygotic reproductive isolation via flower traits and associated pollinator behavior? Did bee pollinators co-evolve and co-diversify with angiosperm flowers, and did butterfly diversity emerge much later? In what ways do floral traits affect rates of plant species diversification, for example traits controlling bilateral floral symmetry and the morphology of nectar spurs? Is the independent emergence of certain floral traits in different lineages (such as in transitions from predominantly bee-pollinated ancestors to principally hummingbird-pollinated flowers) mediated typically by parallel evolution (variation in homologous genes or indeed homologous mutations) or by convergence using different molecular-genetic pathways with similar phenotypic outcome?
Genome-wide studies using NGS approaches, combined with data mining and new statistical tests, will provide new insights into these questions, insights that would have been impossible using more traditional genetic approaches.
The possibility of sequencing genomes of multiple individuals of a plant population allows the quantification of intraspecific genetic variation for a large number of floral traits, and thus the raw material on which selection can act [26, 27]. Moreover, comparing extant genome sequences with those from ancient plant material (for example, that conserved by permafrost ) will enable the monitoring of genomic changes across populations over time and the quantification of the relative contributions of new mutations and existing genetic variation to evolutionary change, and will help us to link these genomic changes to abiotic (for example, climate change) and biotic factors.
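One widely used summary of intraspecific variation is nucleotide diversity, the average number of pairwise differences per site among sequences sampled from a population. The sketch below assumes that the sequences are already aligned and of equal length; the two small 'populations' are invented solely to show how such a comparison might look.

```python
from itertools import combinations


def nucleotide_diversity(sequences):
    """Mean pairwise differences per site among aligned sequences."""
    if len(sequences) < 2:
        raise ValueError("need at least two sequences to compare")
    n_sites = len(sequences[0])
    if any(len(s) != n_sites for s in sequences):
        raise ValueError("sequences must be aligned to the same length")
    pairs = list(combinations(sequences, 2))
    # Count mismatching sites for every pair of sampled sequences.
    diffs = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs)
    return diffs / (len(pairs) * n_sites)


# Made-up alignments of 10 sites from two hypothetical populations.
population_1 = ["ACGTACGTAC", "ACGTACGTAC", "ACGTACGTAT"]
population_2 = ["ACGTACGTAC", "ATGTACGAAC", "ACGTTCGTAT"]

pi_1 = nucleotide_diversity(population_1)
pi_2 = nucleotide_diversity(population_2)
print(f"population 1: {pi_1:.4f}, population 2: {pi_2:.4f}")
```

Under these assumptions, the second population holds more standing variation at the sampled sites, which is the kind of comparison that bears on which population has more raw material for selection.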
Here, we provide a historical perspective and a future outlook on the molecular and genomic basis of plant-pollinator interactions. We review what has been learned from traditional genetic approaches and the genomic search for quantitative trait loci (QTLs), the potential for new genomic approaches to document plant-pollinator interactions, and the application of these studies to understand the evolutionary and ecological mechanisms governing these interactions.
It is important to emphasize, however, that studies on pollinator 'attraction' or 'preference' for certain floral phenotypes should not be conducted in field conditions, because the preferred type of flower will most likely just be the one that is most similar to those experienced by the pollinators before the start of the experiment. For this reason, it is necessary to experiment with laboratory-reared pollinators that have no previous exposure to natural flowers before a preference test begins . Using this approach on wild-type snapdragons (Antirrhinum majus) and various mutants that affect flower visual appearance and morphology, it has been possible to disentangle how visitation frequencies of various flower types emerge as a complex interplay between pollinator innate bias, flower detectability and learned preference [10, 64].
Although many floral traits are controlled by a single locus, many more are multi-locus traits. Some floral scent cues that influence pollinators may be affected by a relatively small number of genetic loci, detected by screening for QTLs. For example, some floral traits in Petunia species are controlled by several loci of relatively large effect. F2 plants from crosses between P. axillaris (which produces scent cues) and P. exserta (which has no scent) led to the identification of two QTLs, one on chromosome II and another on chromosome VII, that together explained 60% of the variation in the production of benzenoid volatiles between the two species. QTLs can, of course, correspond to large genetic intervals containing tens or even hundreds of genes, and so it is important to pinpoint the exact genes (and their number) to understand the ease with which an evolutionary transition between, for example, a scented and a scentless flower type can occur. To do that, the authors developed introgression lines for high-resolution mapping and localized one gene, ODORANT1, which encodes a MYB-type transcription factor and maps to the QTL on chromosome VII. The breeding lines were then used to demonstrate that hawkmoths prefer scented plants, particularly when given choices between plants spaced over short distances. QTLs have also been associated with many other aspects of pollination syndromes in P. axillaris (pollinated by nocturnal hawkmoths) and P. integrifolia (pollinated by diurnal bees), including corolla length, nectar volume, style/stamen arrangement and fragrance.
Many traits show continuous phenotypic distributions and environmental interactions, with examples in the shape and size of petals, corolla tubes, stamens and pistils, as well as placement and arrangement of anthers and stamens, nectar volume, and the number of ovules and pollen grains. Frequently, traits are controlled not by a single or a few genes, but by multiple genes across the genome, each with small effects. To detect the causes and consequences of these effects, the genome of Petunia will be very useful because molecular markers (for example, SNPs) can be mapped to the scaffold for GWASs. NGS approaches, analyzing many thousands of short sequences from across the genome or from the transcriptome, offer huge potential for GS, because they enable phenotype prediction by examining the combination of genetic markers that most strongly correlates with phenotype [28, 31]. The method requires finding associations between genome-wide markers and a trait of interest in a population of individuals with known phenotype and then applying the derived knowledge to predict phenotype in new individuals. The approach has already proved successful in enhancing milk production in cows, and lines have been adopted by the dairy industry worldwide, although the statistical approaches are currently limited to analyses of markers in populations of highly related individuals. For pollinator-floral evolution questions, the future will require analyses of more divergent populations, and for this the statistics will need to be extended.
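The two-step logic of GS (estimate marker effects in a training population with known phenotypes, then predict the phenotypes of new individuals from their genotypes alone) can be illustrated with ridge regression, which shrinks many small marker effects toward zero. Ridge regression is used here only as one common choice of model, assumed for illustration; the text above does not prescribe a particular statistical method. The genotypes, trait values and penalty are made up.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_marker_effects(genotypes, phenotypes, penalty=1.0):
    """Ridge regression: one shrunken additive effect per marker.

    genotypes: one row per individual, each a list of allele counts (0/1/2)
    Returns (effects, marker means, phenotype mean).
    """
    n, k = len(genotypes), len(genotypes[0])
    means = [sum(row[j] for row in genotypes) / n for j in range(k)]
    y_mean = sum(phenotypes) / n
    x = [[row[j] - means[j] for j in range(k)] for row in genotypes]
    y = [p - y_mean for p in phenotypes]
    # Normal equations: (X'X + penalty * I) effects = X'y
    xtx = [[sum(r[i] * r[j] for r in x) + (penalty if i == j else 0.0)
            for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(x, y)) for i in range(k)]
    return solve(xtx, xty), means, y_mean


def predict(row, effects, means, y_mean):
    """Predicted phenotype of one individual from its marker genotypes."""
    return y_mean + sum(e * (g - m) for e, g, m in zip(effects, row, means))


# Made-up training set: trait = 10 + 2 * marker1 - 1 * marker3, no noise.
train_genotypes = [[0, 0, 0], [1, 0, 2], [2, 1, 0],
                   [0, 2, 1], [1, 1, 1], [2, 2, 2]]
train_trait = [10.0, 10.0, 14.0, 9.0, 11.0, 12.0]

effects, means, y_mean = fit_marker_effects(train_genotypes, train_trait,
                                            penalty=0.01)
new_plant = [2, 0, 0]  # genotype only; its phenotype has not been measured
predicted = predict(new_plant, effects, means, y_mean)
print([round(e, 2) for e in effects], round(predicted, 2))
```

With real data the number of markers usually far exceeds the number of phenotyped individuals, which is why some form of shrinkage is needed; the tiny, noise-free example only shows the mechanics.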
In some cases, small genetic changes controlling floral traits not only generate a shift in the principal pollinators that visit the flowers, but also in the unwanted visitors, such as nectar-thieving ants . In addition to sugars, which constitute the principal reward in most floral nectar, many other substances are also found in nectar. Bitter substances, such as caffeine or nicotine, repel visitors at high concentrations, but at low concentrations can be attractive  and indeed have beneficial effects on pollinator learning and memory for floral traits . Different flower visitors can vary in their response thresholds for the repellent effect; for example, in Nicotiana attenuata flowers, moths responded more strongly to the presence of nicotine (repulsion) and benzyl acetone (attraction) in nectar than hummingbirds . RNA interference on only two loci can block the production of both benzyl acetone and nicotine, resulting in fewer visits and nectar removal by pollinators, which has fitness effects (for example, altering seed set), but it also alters damage by herbivores, and nectar theft by ants . This provides an example of how complex the interactions between selection and genetic associations can be in relationships between plants, pollinators and antagonists such as herbivores .
Beyond isolating floral mechanisms affecting pollinator behavior, genetic analysis provides a molecular phylogenetic context for the evaluation of correlated divergence between plants and their pollinators, as well as the demographic changes of populations. One of the earliest phylogeographic studies involved restriction site mapping and length polymorphism analysis of the honeybee Apis mellifera . Phylogenetic perspectives using both genetic and genomic information serve to clarify the origin of particular traits (for example, ) as well as the timing of adaptive radiation. It is these radiations that may allow inter- and intraspecific comparisons, particularly with respect to closely related out-groups with different phenotypes.
When considering the consequences of interactions between angiosperms and their pollinators, it is vital to consider two predominant processes influencing angiosperm diversity: interspecific hybridization and polyploidy (or whole-genome multiplication; Figure 2). At least 25% of species form interspecific hybrids, resulting in introgression of characters between species and, rarely, the formation of homoploid hybrid taxa. This process results in complex, reticulate patterns of evolution in many angiosperm groups, making it difficult or impossible to circumscribe species boundaries. In addition, polyploidy is widespread and is thought to have influenced all angiosperms, having occurred at the base of both the seed plants and angiosperms and within most lineages thereafter, sometimes surprisingly frequently. For example, the apparently diploid species A. thaliana, a model species because of its small genome, has undergone three rounds of polyploidization since the divergence of angiosperms [74, 75], and the existence of natural cultivars with double the expected number of chromosomes reveals that the process is ongoing. Indeed, an analysis of chromosome counts across multiple genera led Wood et al. to predict that polyploidy accounts for 15% of angiosperm speciation events. Furthermore, many polyploids form in association with interspecific hybridization (allopolyploidy [77, 78]).
A consequence of such a high prevalence of interspecific hybridization and polyploidy in the ancestry of angiosperms is that much diversity is generated. Recent advances in NGS approaches, including RNA-seq, reveal that these evolutionary processes can affect the genome and transcriptome, which in turn must govern changes to the metabolome and phenotype. At the genome level, there is much variation in the outcome of allopolyploidy, ranging from considerable genomic restructuring, even between progeny of the same cross, to little at all, depending on the example. At the level of the transcriptome, there can be large-scale changes both in the nature of the RNA expressed (which genes are expressed) and in the total amount of RNA expressed per cell, the significance of which is yet to be understood. Recurrent polyploidy in angiosperm evolution has also resulted in the evolution of genes in large multigene families. With the evolution of gene duplicates, there can be relaxed selection on one of the gene copies, enabling it to diverge and potentially to acquire new functions. This can lead to individual gene members of the family having different metabolic roles, as observed through their differing patterns of gene expression. For example, gene duplicates of anthocyanin-regulating transcription factors in Mimulus from Chile have evolved independently in separate tetraploid lineages to generate red floral pigmentation in flower lobes. Interspecific hybridization and allopolyploidy provide the opportunity for new, transgressive characters to evolve. Transgressive characters can be observed at all levels from the transcriptome to inflorescence morphology [79, 82] and may arise through novel cis-trans interactions in the genome or by 'mixing and matching' of different metabolic pathways found in the parents.
An example in which gene duplication has led to the evolution of a large plant gene family with key functions in plant-insect interactions is seen in the terpenoid synthase genes, encoding enzymes that synthesize volatile compounds called terpenoids that mediate interactions with mutualists and antagonists. Minor structural changes in terpenoid synthase genes are known to change the product of the encoded enzymes, allowing rapid evolutionary change in volatile compounds. Another example is the desaturase genes, which encode enzymes involved in producing key compounds for pollinator attraction in the highly specific sexual mimicry systems in orchids of the genus Ophrys [18, 86]. Ophrys species have large, variable families of desaturase genes with high allele diversity and species-specific expression patterns. Variation in structural genes, as well as changes in their expression levels, are likely mechanisms allowing rapid evolutionary responses to fluctuating pollinator communities. A major challenge for understanding the link between genotypic and phenotypic variation, however, is the functional characterization of gene copies and different alleles [18, 87] and their respective roles in determining key traits in plant-insect interactions. This approach will require the methods and statistics developed for GS studies to be extended to diverse natural situations, enabling the characterization of multiple genes with small effects, as already discussed.
The goal of these analyses is to accurately diagnose species-level interactions that cannot easily be documented in the wild. The most recent investigations have focused on predator-prey or herbivorous interactions rather than mutualistic interactions, in part because reference taxonomic databases of molecular information for animals have developed faster than those for plants. Interactions among species form the basis of ecosystem functioning and underlie evolutionary and ecological principles of conservation biology. However, directly measuring biological diversity is much simpler than characterizing the interactions between taxa . These relationships between species form the building blocks of food webs, and exploring the mechanisms structuring these interaction patterns is crucial in understanding their spatial-temporal variation and predicting their responses to disturbance. Knowing precisely which pollinators visit which plants in the wild is also vital to our measurements of selection, particularly when trying to clarify whether a particular pollinator is a true specialist or whether their behavior is flexible.
We expect there to be swift progress in this field since a plant barcoding community consortium  proposed that a combination of genomic regions (including the plastid regions rbcL and matK) should be used as the core barcode for land plants, to be supplemented by the plastid intergenic spacer trnH-psbA or the nuclear ribosomal internal transcribed spacers (ITS). The final selection of a standard plant barcode is leading to the rapid acquisition of databases of plant barcodes for taxonomic and biodiversity analysis (Box 1). The application of these databases to study mutualistic interactions between plants and animals is, however, in its infancy. In Hawaiian solitary bees, molecular identification of the pollen carried (using the ITS region) linked these grains to the local flora and revealed preferential foraging for pollen from native species . Similar extensions using trnL and other official plant barcode genes for the study of plant-animal interactions are under way by several research groups ([96, 97], and the authors) with the potential to rapidly diagnose thousands of individual pollination events simultaneously without the need for laborious morphological identification of pollen (Figure 3).
Complete genome sequences are accumulating quickly, providing larger numbers of genome scaffolds (reference genomes) for the assembly of related taxa, an approach that will be exploited for our understanding of non-model species. As an example, the genome of the honeybee A. mellifera has been sequenced and is available as version 4 (about 236 megabases with 7.5X coverage, N50 contig = 41 kb, N50 scaffolds = 362 kb). A survey of this genome revealed a higher than expected number of odor receptor genes and novel genes for nectar and pollen utilization, as well as more loci involved in learning and memory than have been observed in Drosophila. The number of olfactory receptor genes is more than twice as high as in other insects that have been examined, perhaps reflecting the importance of pheromones in the orchestration of social organization in honeybees but also underpinning the sophisticated olfactory learning abilities that bees show in floral visitation. Olfactory cues are also potentially important to bat pollinators. Olfactory receptors show substantial divergence in bats compared with other terrestrial mammals, although this is seen in both pollinating and non-pollinating species.
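Assembly statistics such as the N50 values quoted above summarize how contiguous an assembly is: the N50 is the length such that contigs (or scaffolds) of at least that length together contain at least half of the assembled bases. A minimal sketch of the calculation follows, using invented contig lengths rather than the honeybee data.

```python
def n50(lengths):
    """Length L such that sequences of length >= L hold at least half the bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    raise ValueError("no sequence lengths supplied")


# Made-up contig lengths in kilobases (total 300 kb).
contigs_kb = [80, 70, 50, 40, 30, 20, 10]
assembly_n50 = n50(contigs_kb)
print(assembly_n50)
```

Here the two longest contigs (80 kb and 70 kb) already account for half of the 300 kb total, so the N50 is 70 kb. A higher N50 means that fewer, longer pieces make up most of the assembly.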
Significant phylogenetic insights have also emerged from the honeybee genome and its comparison with other insect genomes. A survey of 185 genes from sequenced insect genomes suggests that the Hymenoptera share a common ancestor with all other holometabolous insects (beetles, moths and flies). These authors propose that previous phylogenetic hypotheses, in which this arrangement was not recovered, rested on poorly resolved analyses of single loci or small numbers of loci. The split between Hymenoptera and other holometabolous insects is estimated to have occurred nearly 300 million years ago. It is likely that no study of any single or small number of genes could reliably reveal this ancestral relationship. Only the analysis of dozens of protein alignments derived from genome-level scans provides sufficient support to resolve the relationship. Well-resolved phylogenies, particularly depicting ancient branching points, allow better characterization of evolutionary pressures involved in adaptive radiations, demographic shifts and major leaps in phenotypic evolution. This is particularly true when there is convincing evidence that diversification may be based, at least in part, on the selective pressure of two (or more) species involved in a mutualistic relationship.
The genomes of several insect-pollinated plant species have now been sequenced or are currently being sequenced (see [38–42] and [100–103]). An interesting model for studying the genomics of adaptive divergence is the genus Mimulus. Sequences of the whole genome and of expressed sequence tags (ESTs) of Mimulus guttatus (version 1.0 early release: about 322 Mb of 362 Mb, arranged in 2,216 scaffolds; about 301 Mb arranged in 17,831 contigs; N50 scaffold = 81, N50 contig = 1,770) are being assembled and are available for browsing on the Phytozome website. There are extensive genetic resources for Mimulus, including ESTs, polymorphic markers, linkage maps, bacterial artificial chromosome libraries, seed stock and centralized repositories of this information for public use (, reviewed in ). Mimulus is a model system for studying adaptive phenotypic traits by analysis of pollinator selection, as it contains extensive phenotypic and genomic variation explored in both laboratory and field contexts. Furthermore, the potential for hybridization between species forms a continuum from high compatibility to reproductive isolation (reviewed in ), enabling the examination of interactions between floral traits and reproductive barriers. Although model species allow an understanding of evolutionary mechanisms of one particular system in detail, important insights will be gained by comparative approaches spanning the diversity of different plant-pollinator systems. This promises insights into animal sensory systems, floral signals and morphology that may co-evolve or be shaped by pre-existing biases. The availability of powerful sequencing and bioinformatics tools will allow us to study non-model systems, including those with very large genomes, allowing insights through comparative approaches [66, 100–103].
One obvious next step is to ask questions about parallel phenotypic diversification. Genome studies have the potential to uncover whether parallel phenotypic changes are caused by similar underlying genomic architecture (for example, parallel phenotypes associated with homologous genes and mutations), or achieved through many alternative genetic mechanisms . The exploration of parallel cases of divergence increases statistical power because evolutionary trends can be examined in replicate  and genomic approaches will add more dimensions through the analysis of hundreds or thousands of candidate genes and co-adapted gene complexes. Floral phenotypes are sometimes characterized by phylogenetically convergent adaptations to the requirements and preferences of certain groups of pollinators (pollination syndromes); for example, bat-pollinated flowers from multiple lineages often contain certain sulfuric components in their scent, to which bats seem to be innately attracted . There are also differences in the color perception of different classes of pollinators, such as with respect to differences in red sensitivity between bees and hummingbirds (see above) [61, 107]. Convergence in regulatory mechanisms was discovered for switches in floral color from blue pigmentation in bee-pollinated flowers to red pigmentation in principally hummingbird-visited flowers .
We expect further genomics studies to yield insights into timing and sequence of convergent evolutionary processes, allowing us to address the long-standing question as to whether some species act as 'models' that drive selection for phenotypic similarities in other (mimic) species . As in the honeybee phylogenetic example , thousands of loci increase the chance that true phylogenetic signal will overcome stochastic noise from loci with complicated selective histories. The exploration of convergence will provide novel insights into phylogenetic tracking as pollinators and plants co-diversify. This is particularly powerful when comparisons are made with a close relative that does not share the trait of interest (an outgroup) that can provide a basis to study adaptive evolution. Better phylogenetic resolution of deep branching structure in combination with an understanding of the genetic architecture of phenotypic plasticity and divergence will be a powerful tool.
The existence of convergent solutions in floral evolution can involve considerable underlying genetic architectural variation (many routes to a solution), as suggested for replicate adaptive radiations . There can be a surprising amount of genomic variation in natural populations on which natural selection may act , and one principal goal must be to quantify such variation for more species and populations. Plant-pollinator interactions are a perfect example of cases in which rapid adaptive radiations in both plant phenotype and pollinator choice set up parallel selective regimes. These are fundamental questions that can be addressed only by the comparison of genomes from parallel radiations of interacting species, which represent evolutionary replicates.
DNA (or molecular) barcoding can broadly refer to any system whereby genomic data are used to assist in, or as a surrogate for, systematic identification of living organisms. Such methods were first applied in organismal groups that present great difficulties for traditional taxonomic approaches because of their microscopic size, such as Plasmodium parasites  or soil nematodes . These cases showed the possibility of using short DNA sequences to essentially substitute for morphological identification, as sequences may be clustered into molecular operational taxonomic units (MOTUs) and used as a fast and practical means to identify strains or assess diversity. DNA barcoding was later proposed as a more formalized system  for linking sequence data to species identifications using standardized protocols. Barcoding in this strict sense uses a small number of universal marker genes (mitochondrial cytochrome c oxidase subunit I (COI) for animals, ITS for fungi, rbcL, matK and trnH-psbA or ITS for plants) and a single reference data repository, BOLD , in which sequence data are associated with validated taxonomic identifications. Among the advantages of this system are that a DNA sequence from any unknown (obtainable from fragmentary tissue samples, pollen grains, seed fragments, hair, feces, and so on) can be readily assigned to a species or other level of identification by matching against this database.
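The matching step at the heart of barcoding can be sketched as a comparison of an unknown sequence against a reference library, with the best match accepted only above an identity threshold. The example below is a toy illustration under strong assumptions: the sequences are already aligned and of equal length, the reference library and taxon names are invented, and the 97% default threshold is an arbitrary placeholder. It does not represent the BOLD identification procedure or any published pipeline.

```python
def identity(a, b):
    """Fraction of matching positions between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def assign_barcode(query, reference, min_identity=0.97):
    """Return (taxon, identity) for the best match in the reference library.

    The taxon is None when even the best match falls below min_identity,
    which flags the query as unidentified rather than forcing a label.
    """
    best_taxon, best_score = None, 0.0
    for taxon, seq in reference.items():
        score = identity(query, seq)
        if score > best_score:
            best_taxon, best_score = taxon, score
    if best_score < min_identity:
        return None, best_score
    return best_taxon, best_score


# Hypothetical 20-base reference barcodes (not real rbcL or matK sequences).
reference_library = {
    "species_A": "ACGTACGTACGTACGTACGT",
    "species_B": "ACGTTCGAACGTACCTACGA",
}
pollen_read = "ACGTACGTACGTACGTACGA"  # differs from species_A at one site

relaxed = assign_barcode(pollen_read, reference_library, min_identity=0.90)
strict = assign_barcode(pollen_read, reference_library)
print(relaxed, strict)
```

The same read is assigned to the hypothetical species_A under the relaxed threshold but left unidentified under the stricter default, which shows how much the chosen cut-off and the completeness of the reference library shape the resulting picture of who visits whom.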
We thank Ian Baldwin, Toby Bradshaw, Teresa Crease, M Brock Fenton, Robin Floyd, Sian Lewis, Randy Mitchell and Nick Waser for comments and literature pointers.