Plant polycistronic precursors containing non-homologous microRNAs target transcripts encoding functionally related proteins

Functional homologous and non-homologous clusters of MIR genes that co-regulate target mRNA transcripts have been identified in plants


Background
MicroRNAs (miRNAs) are endogenous, approximately 21-nucleotide, single-stranded small RNAs derived from MIRNA precursors that are able to fold back into a stable secondary structure (stem-loop or hairpin). miRNAs act in many developmental processes as well as environmental and pathogenic responses [1-4] through the post-transcriptional regulation of target mRNAs. These targets carry a sequence-specific miRNA recognition site, leading to transcript cleavage and/or inhibition of mRNA translation [1,5,6]. Primary miRNA transcripts (pri-MIRNA) are transcribed by RNA polymerase II, and several ribonucleoprotein (RNP) complexes are involved in their maturation, a process that differs between animals and plants [1,6-11]. In animals, formation of an approximately 21-bp miRNA-miRNA* duplex successively involves two RNase III enzymatic complexes: the Drosha enzyme, which cleaves long pri-MIRNA in the nucleus to generate short (approximately 70- to 80-nucleotide) hairpins (so-called pre-MIRNA), and the Dicer enzyme, which produces the miRNA after cytoplasmic export of pre-MIRNAs through Exportin 5 [11]. In plants, however, both cleavages are likely nuclear localized and involve a single Dicer-like enzyme 1 (DCL1) complex [6,9,10]. The miRNA-miRNA* duplex is exported to the cytoplasm by HASTY, the plant ortholog of Exportin 5 [12,13]. Subsequently, these duplexes are converted into single-stranded miRNAs upon incorporation into an ARGONAUTE (AGO) ribonucleoprotein complex, referred to as the RNA-induced silencing complex (RISC). Within the RISC, the miRNAs guide sequence-specific cleavage and/or translational repression of target transcripts [6,9-11].
Recent deep sequencing of plant small RNA libraries has led to the identification of more than 1,300 miRNAs in various plants (miRBase, release 13.0, March 2009) [14]. Based on comparison of all available plant genomes (even partial ones; 16 genera referenced in miRBase), evolutionarily conserved and non-conserved miRNAs have been proposed. The nonconserved miRNAs have probably emerged in recent evolutionary time scales, and show a wide diversity compared to the restricted number of conserved miRNAs [15]. Indeed, only 5 miRNA families are found in more than 40 plant species whereas 25 exist in more than one plant genus [16]. The three higher plant models showing the most comprehensive description of their miRNome are rice (Oryza sativa; 377 MIRNAs), poplar (Populus trichocarpa; 234 MIRNAs) and Arabidopsis (Arabidopsis thaliana; 187 MIRNAs), with 22 families 'conserved' between them (indicated in bold in Additional data file 1 based on miRBase 13.0). The numerous nonconserved miRNAs are thus likely to play species-specific roles [15].
Plant and animal MIRNA genes differ in their genomic location and organization. Most plant miRNAs are encoded in intergenic loci, whereas animal miRNAs are also frequently encoded within introns of protein coding genes [17-19]. Plant miRNAs are mainly generated from independent transcriptional units, whereas in Drosophila, nematodes, zebrafish and mammals, around 40 to 50% of the predicted MIRNA genes are located within clusters that are often evolutionarily conserved [18-27]. A maximal distance of 3 kb between two consecutive miRNAs has been used as a stringent criterion to estimate cluster numbers [18]. Clusters in animal genomes usually encode two to three miRNAs but some encode up to eight. Even larger miRNA clusters were predicted in human and zebrafish, containing more than 40 MIRNA loci [18,25,26]. In these clusters, miRNAs are encoded either in independent hairpins or sometimes in both arms of the same hairpin [28]. In plants, even though no systematic analysis of miRNA clusters has been performed in the different available genomes, a few miRNA clusters have been reported [16,29-33].
Clustered miRNAs can be either simultaneously transcribed into a single polycistronic transcript or independently transcribed [1,28,34]. Short distances between consecutive MIRNA loci and coordinated expression of clustered miRNAs are hallmarks of polycistronic transcription [18,22,34]. Most of the few reported plant miRNA clusters contain several copies of the same conserved miRNA (miR156, miR166, miR169, miR395 or miR399), in contrast to animals where miRNAs with unrelated sequences are often included in the same clusters [18,19,25,35]. Interestingly, certain animal miRNA clusters showing co-regulated expression can simultaneously target transcripts encoding different functionally related proteins. It has been proposed that this may coordinate the fine tuning of the regulation of specific molecular processes [1,18,19,25]. Recently, functional analysis of two human miRNA clusters revealed that the different encoded miRNAs co-regulate related cyclin dependent kinase inhibitors and facilitate cell cycle progression [27]. In plants, beyond the identification of a few expressed sequence tags (ESTs) spanning miRNA clusters [16,29-33], few experimental data indicate that clustered miRNAs are transcribed simultaneously. In the model legume Medicago truncatula, a miR166 tandem was shown to be encoded in a single transcriptional unit [32]. However, as both miRNAs are nearly identical, it is difficult to definitively conclude that this pri-MIRNA generates more than one miRNA.
In this study, we demonstrate that approximately 20% of plant miRNAs are clustered, and generally contain conserved miRNAs of the same family. Synteny analysis suggested a common evolutionary origin for certain clusters. Strikingly, a few clusters encode tandem non-conserved miRNAs with unrelated sequences, whose predicted targets correspond to transcripts encoding related proteins. In Arabidopsis, four of these clusters were transcribed as polycistronic precursors and we show that at least one cluster is processed to form both mature miRNA species in a DCL1-dependent manner. Accumulation of the mature miRNAs affected the stability of their respective predicted target transcripts. Consequently, plant polycistronic MIRNA precursors can encode functional nonhomologous miRNAs. This genomic organization may serve to co-regulate different mRNA targets post-transcriptionally.

Results
In silico identification of miRNA clusters in Arabidopsis, rice and poplar genomes
A systematic search for consecutive MIRNA loci was carried out in three model plant genomes that have an exhaustive description of their miRNA species (miRBase 13.0 [14]). Initially, a 3-kb distance between consecutive MIRNA loci was used as a stringent criterion to define miRNA clusters, similar to previous studies in animals [18,26]. As a result, 16, 10 and 9 clusters were identified in rice, Arabidopsis, and poplar, respectively, which represented 13%, 11% and 8% of the total MIRNA loci (Table 1; Additional data file 2). Co-expression studies and ESTs available in animal genomes have indicated that some miRNA clusters can be very large; therefore, the 3-kb criterion, which is useful to avoid overestimation of miRNA clusters, is probably too stringent [18,25,35]. Using a less stringent 10-kb cluster size criterion, the number of plant miRNA clusters increased to 18 to 24 in these genomes, thus representing up to 22% of the total MIRNA loci (Table 1).
Independently of the size threshold used, most of the clusters (61%, 75% and 90% in Arabidopsis, rice and poplar, respectively) contained several copies of the same miRNA family, generally two to three and a maximum of eight (the latter is the rice Osa-MIR395m-s, x cluster spanning 497 bp; Table 1; Additional data file 2), and were therefore called homologous clusters. These clusters frequently contained conserved miRNAs (Additional data file 1), and represent 90%, 54% and 44% of the clustered miRNAs in poplar, rice, and Arabidopsis, respectively (Table 1). Homologous clusters were found for miR166, miR169 and miR395 families (based on the <10-kb threshold). This suggested a putative common origin of these clusters, involving successive gene duplications and losses as described for animal miRNA clusters [18,28]. Analysis of these miRNA clusters using VISTA Plot [36,37] (Figure 1) revealed that some were syntenic between monocot and dicot plants (Figure 1a, b, f; two rice miR395 clusters and the Ath-MIR169i-n locus) or only within monocot plants (Figure 1c; a third rice miR395 cluster). Surprisingly, no miR395 syntenic locus could be retrieved in the poplar genome (Figure 1a-d).
Other miRNA clusters were specific to one plant genome analyzed (Figure 1d, e; the fourth rice miR395 cluster and the Ath-MIR166c, d locus). These results suggest that certain ancestral miRNA clusters appeared before the divergence of monocot and dicot lineages and showed differential expansions in the various plant genomes analyzed. Furthermore, clustering of specific miRNAs (for example, miR395, miR169, miR166) is evolutionarily conserved.
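The distinction between clusters made of copies of one miRNA family and clusters mixing unrelated miRNAs can be illustrated with a minimal Python sketch. It assumes that the numeric part of a locus name (for example, 166 in Ath-MIR166c) is an adequate proxy for family membership; the helper names mirna_family and classify_cluster are hypothetical and are not part of the original analysis pipeline.

```python
import re


def mirna_family(locus_name):
    """Return the numeric family of a locus name such as 'Ath-MIR166c'.

    Assumption: the digits following 'MIR' identify the family.
    """
    match = re.search(r"MIR(\d+)", locus_name, flags=re.IGNORECASE)
    if match is None:
        raise ValueError(f"unrecognised locus name: {locus_name}")
    return match.group(1)


def classify_cluster(locus_names):
    """Label a cluster 'homologous' if all members share one family."""
    families = {mirna_family(name) for name in locus_names}
    return "homologous" if len(families) == 1 else "non-homologous"


# Locus names taken from the text; grouping them here is illustrative only.
print(classify_cluster(["Ath-MIR166c", "Ath-MIR166d"]))
print(classify_cluster(["Ath-MIR859", "Ath-MIR774"]))
```

Family identity by name is only an approximation of sequence relatedness; a sequence-based comparison would be needed to confirm that two miRNAs are truly unrelated.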

Non-homologous miRNAs are expressed as polycistronic pri-MIRNAs
When MIRNA loci at a distance of <1 kb on the same DNA strand were considered, 8, 9 and 14 miRNA clusters were identified in Arabidopsis, poplar, and rice, respectively (Table 1). In contrast, only four clusters encoding two miRNAs on opposite strands were found (Additional data file 2). These results indicate that short-range clustering (based on the 1-kb threshold) is strongly biased (χ2 test, P = 9.4E-13) towards 'same DNA-strand' clustering, suggesting possible co-transcription. As small clusters may correspond to polycistronic MIRNA precursors, we searched for pri-MIRNAs containing several tightly linked miRNAs (Additional data file 2). For example, homologous rice miR395 clusters show the highest number of miRNAs, each encoded in independent stem-loops that are probably generated by successive duplications.

Table 1 Summary of clustered miRNAs in Arabidopsis thaliana, rice (Oryza sativa) and poplar (Populus trichocarpa) genomes. Columns: Populus trichocarpa, Oryza sativa, Arabidopsis thaliana; rows give the number of clusters with consecutive miRNAs within each distance threshold (same-strand miRNAs).

In addition to clusters encoding homologous miRNAs, several clusters (two in rice, three in poplar and five in Arabidopsis) consisting of miRNAs with unrelated sequences were identified (Table 2). These non-homologous clusters mainly corresponded to non-conserved miRNAs, and their size ranged between 271 and 1,192 bp, with predicted hairpins separated by 81 to 670 bp (median of 114 bp; Additional data file 2), strongly suggesting that these clusters were polycistronic. Most of them were encoded in regions located between protein-coding genes, with the exception of two rice clusters that derive from intronic regions of different putative ribosomal protein encoding genes (ESTs listed in Table 2).

Figure 1 Microsynteny between homologous miRNA clusters conserved between A. thaliana, rice (O. sativa) and poplar (P. trichocarpa). VISTA plots [37,70] show the conservation of different clustered miRNAs in the three selected genomes (Table 1; Additional data file 2): (a-d) the four rice miR395 clusters; (e) the Ath-MIR166c, d cluster; (f) the Ath-MIR169i-n cluster. To analyze evolutionary conservation between monocots, the sorghum genome is indicated. In each graph, gene models (blue for protein coding gene exons, and red for miRNA clusters) are indicated above, and the percentage (50 to 100%) on the right side indicates the level of identity between target and reference genomes, visualized through pink and blue peaks for intronic/non-protein coding and exonic regions, respectively. The name of syntenic clusters in non-reference genomes is indicated, and syntenic clusters are delimited by black bars. Ath, Arabidopsis thaliana; Ptc, Populus trichocarpa; Osa, Oryza sativa; Sbi, Sorghum bicolor.
In most cases, non-homologous clusters encoded two miRNAs, with the exception of the Osa-MIR1876-1862d-1884b cluster. Similarly to the homologous miRNA clusters, this novel MIRNA precursor class contained several hairpins, each carrying a single annotated miRNA in the stem (Figure 2b; Figures S2 and S3 in Additional data file 3). A majority of 21- to 22-bp species was found for each predicted miRNA in Arabidopsis non-homologous clusters (Genome View browser, Arabidopsis Small RNA Project (ASRP) database) [38-40], and we systematically annotated miRNA* to identify bona fide miRNAs [41] (Figure S4 in Additional data file 3). However, for Ath-MIR859, Ath-MIR397b and Ath-MIR857 loci, no miRNA* could be identified in small RNA databases.
Currently, the Arabidopsis genomic regions corresponding to candidate polycistronic non-homologous MIRNA genes are represented as independent transcriptional units in the ASRP database (Figure S4 in Additional data file 3). Indeed, no EST comprising tandem miRNAs was available in Arabidopsis.
ESTs spanning the two MIRNA hairpins were reported only for the poplar Ptc-MIR482-1448 locus (Table 2). Overall, these results suggest that clusters comprising functional miRNAs with unrelated sequences exist in plants as single transcriptional units, defining a novel class of plant pri-MIRNA genes.

Polycistronic non-homologous miRNAs regulate related targets
In contrast to animals, in silico predictions revealed only a few targets for each plant miRNA based on strong sequence complementarity [42]. Strikingly, all predicted targets corresponding to different miRNAs from the same cluster encode proteins of the same family (Table 2; complete target list in Additional data file 4; based on the ASRP database for Arabidopsis, and on [43][44][45] for poplar). Indeed, the Ath-MIR397b-857 locus encodes two miRNAs that regulate laccases (three and one gene, respectively), the Ath-MIR842-846 locus encodes two miRNAs that regulate JR/MBP proteins (Jacalin repeat/Myrosinase binding protein; one and ten genes, respectively), and the Ath-MIR859-774 locus encodes two miRNAs that regulate F-box proteins (35 and 5 genes, respectively). More importantly, three of these F-box proteins are likely to be targeted by both Ath-miR859 and Ath-miR774 (Additional data file 4). Similarly, the Ptc-MIR482-1448 locus encodes miRNAs that regulate disease resistance proteins (12 and 2 genes, respectively, based on various gene models [46]), and one of them is probably co-regulated by the two miRNAs (Additional data file 4). Finally, the two Ptc-MIR1446-477 loci encode miRNAs that both target a single 'gibberellin response modulator-like protein' homologous to the Arabidopsis RGL1/RGL2 (Repressor of gibberellic acid requiring (GA1)-LIKE; DELLA transcription factors [45]). For the other non-homologous clustered miRNAs, targets were either predicted for only one miRNA of the tandem (for example, a single EIF2 encoding transcript -The Arabidopsis Information Resource database entry TAIR:At1g76810 -for Ath-miR771), or no target could be identified (Table 2). Target validation based on 5' RACE (5' rapid amplification of cDNA ends) PCR experiments was determined in Arabidop-  [43,44,46].
These analyses suggest that non-homologous miRNA polycistronic clusters are likely to target transcripts encoding proteins of the same family, suggesting that co-transcription of miRNAs may co-regulate their respective targets.
The polycistronic Ath-MIR859-774 pri-MIRNA is processed by a DCL1-dependent pathway
To determine the functionality of a non-homologous polycistronic pri-MIRNA in planta, the Ath-MIR859-774 locus was selected. Expression of several Ath-MIR859-774 predicted targets encoding F-box proteins was analyzed in different organs (Figure 3c). Among validated targets, the Ath-miR774 target TAIR:At3g19890 and the Ath-miR859 target TAIR:At3g49510 exhibited detectable expression in roots that also express Ath-MIR859-774 pri-MIRNA. This indicates that both partners of this post-transcriptional regulation are present in this organ. We then overexpressed this precursor to analyze the transcriptional regulation of these target genes (Figure 4). Independent lines accumulating the pri-MIRNA transcripts at high levels in wild-type (Col-0) Arabidopsis plants were obtained (Figure 4a). Northern blot analyses showed accumulation of both mature approximately 21-bp miRNAs corresponding to Ath-miR859 and Ath-miR774 in comparison to control plants (expressing an empty vector; Figure 4b). In addition, significant down-regulation of both the Ath-miR859 target TAIR:At3g49510 and the Ath-miR774 target TAIR:At3g19890 was observed in these transgenic lines (Figure 4c). These results indicate that both miRNAs from the Ath-MIR859-774 polycistronic pri-MIRNA can be processed and simultaneously co-regulate the expression of their respective predicted targets.

Discussion
A comparative genomic analysis of miRNA clustering in three model plants (a monocot, rice, a herbaceous dicot, Arabidopsis, and a dicot tree, poplar) led us to identify a novel class of polycistronic MIRNA precursors encoding miRNAs with unrelated sequences. These non-homologous miRNA clusters target transcripts encoding proteins of the same family, suggesting that this unusual genomic organization may allow co-regulation of different but related targets.

[Figure: Expression of Arabidopsis polycistronic non-homologous miRNA clusters and selected targets in different organs ... [15], and TAIR:At3g19890 for Ath-miR774 [48]. At3g18780 encoding an ACTIN isoform was used as RNA loading control.]
Most miRNA clusters encode several copies of conserved miRNAs from the same family, that is, miR166, miR169, or miR395. Previous analyses of miR395 clusters in rice and M. truncatula, as well as a miR156 cluster in rice, maize, sugarcane, sorghum and even a dicot (Ipomoea nil), have suggested conservation of homologous miRNA clusters in various plant genomes [16,29,30]. Our analysis revealed certain homologous miRNA clusters at syntenic genomic positions, implying a common evolutionary origin across monocot and dicot lineages. Specific miRNA families seem positively selected for expansion and clustering in several genomes. For cultivated species, it has been proposed that this spreading may contribute to advantageous agricultural traits [29,30]. In addition, homologous miRNAs or cluster duplication may lead to the emergence of new spatio-temporal expression patterns through the accommodation of alternative promoter regions [29,52,53].
A combination of tandem duplication of miRNAs as well as segmental duplications of whole clusters has been proposed to explain such genomic organization [29,52]. In animal genomes, miRNAs encoded in the miR17 cluster arose through a complex duplication and loss of individual members as well as duplications of entire clusters [28]. In plant genomes, miR156, miR160, miR162, miR167, miR169, miR171 and miR395 families experienced large expansions via tandem or segmental duplication events and loss of family members ( [29,30,52] and this study). This is in agreement with the detection of two to three miRNAs in most (that is, 70 to 80%) of the clusters in our analysis, similar to protein coding gene clusters [52]. These duplication events may therefore represent a major evolutionary route for birth and death of miRNAs in plants.
Folding of putative transcripts derived from homologous miRNA clusters revealed additional hairpins in the rice Osa-MIR395h-l and Osa-MIR395m-s clusters, which were not annotated in miRBase. In animal genomes, systematic folding of genomic regions encoding miRNA clusters has helped to identify additional miRNAs [18]. A recent analysis of rice miRNA clusters has revealed a different genomic organization of upstream sequences corresponding to their promoters [53]. Osa-MIR156b-c, Osa-MIR166k-h, Osa-MIR169n-o, Osa-MIR172b-806a, Osa-MIR395a-g, Osa-MIR395h-l, and Osa-MIR395m-s clusters may contain only one promoter and be transcribed as polycistronic units. Interestingly, we found that the Osa-MIR395t-w cluster was specific to the rice genome. This cluster has previously been reported as having no predicted promoter [53]; a transposable element identified in its vicinity [29] may be associated with the recent evolution of this MIRNA cluster [6,54].

[Figure: Ath-MIR859-774 polycistronic pri-MIRNA is processed by a DCL1-dependent pathway]
Our results indicate that short-range (<1 kb) clustering of 'same DNA strand' miRNAs is highly suggestive of co-transcription, as reported in animal genomes [18]. Accidental formation of hairpins followed by loss of miRNAs subsequent to duplication was indeed proposed as a general mechanism for the origin of polycistronic MIRNA transcripts in animals [28]. Although the clustered miRNAs characterized were always encoded in independent hairpins, a stem-loop encoding the rice miR159 was recently shown to produce additional approximately 21- to 24-nucleotide small RNAs from the 21 bp next to the miR159 sequence [33]. This unusual case is reminiscent of sequential DCL1-dependent processing of the Arabidopsis miR163, and of DCL4-dependent processing of tasiRNAs (trans-acting siRNAs) or young Arabidopsis miRNAs, which may correspond to 'proto-miRNAs' [6,55,56]. Our results show that maturation of the Ath-MIR859-774 polycistronic cluster is mediated by DCL1, but we cannot exclude that other DCL enzymatic complexes may contribute to the processing of polycistronic MIRNA precursors.
In each of the three plant genomes, we identified several clusters encoding distinct miRNAs, in addition to clusters containing homologous miRNAs. Their low abundance in plant genomes contrasts with animal genomes, where miRNA clusters frequently encode miRNAs from different, although evolutionarily related, families, for example, the miR17 gene cluster [25,28]. These clustered non-homologous MIRNA genes are proposed to simultaneously regulate multiple functionally related genes in animals. Indeed, a recent study has demonstrated that two human miRNA clusters regulate various cyclin dependent kinase inhibitors, leading to a coordinated regulation of cell cycle progression [27]. In contrast to animals where hundreds of translational targets are frequently predicted for a single miRNA, plant miRNAs target few transcripts, usually showing an extensive homology with the miRNA leading to its cleavage [42] (Table 2; Additional data file 4). Although the recent identification of translational regulation in plants may affect this view, all known translationally regulated targets presently contain binding sites highly homologous to miRNAs [5]. Interestingly, we show that all predicted targets of the different non-homologous miRNAs present in a single cluster always corresponded to proteins of the same family.
Ath-MIR859-774 and a representative target of each miRNA were mainly expressed in the roots. However, anti-correlation between the MIRNA precursor and target transcript levels was not identified in the different organs tested. This could be due to the fact that several plant miRNAs quantitatively regulate gene expression and a low level of variation in a specific organ could not be detected [6]. Indeed, both miRNA and targets were expressed at low levels in each organ tested (Figure 3; Figure S4 in Additional data file 3). Additionally, spatial expression domains of the miRNAs and their targets may vary in the different cell types constituting an organ, resulting in non-significant differences at the whole organ level or even positive correlations ( [6,57] and references therein). Furthermore, post-translational regulations may be superimposed upon post-transcriptional regulations, as in the case of another recently evolved plant miRNA, miR834, initially suspected to be inactive [5]. In the latter case, the absence or near absence of transcriptional anti-correlation between miRNA and target transcripts suggests that posttranslational regulation is predominant over mRNA regulation.
Ectopic expression of the Ath-MIR859-774 pri-MIRNA led to the simultaneous down-regulation of distinct F-box transcripts, which are likely to be independently regulated by each miRNA. F-box proteins co-regulated by Ath-MIR859-774 may participate in specific pathways involving proteasome-dependent degradation of signaling components [58]. Ptc-MIR1446-477 loci are predicted to target a DELLA-like transcription factor similar to the Arabidopsis RGL1/RGL2 proteins involved in gibberellin control of seed germination and floral development [45], and shoot and root development in poplar [59]. The Ath-MIR397b-857 targets transcripts encoding laccase copper proteins associated with lignin synthesis, metal nutrition and response to abiotic stresses [50,60]. Among the four laccase encoding transcripts targeted by these miRNAs, the knock-out mutant of TAIR:At2g29130 (AtLAC2) shows slightly reduced root elongation under osmotic stress. Finally, miRNAs derived from the Ath-MIR842-846 loci target transcripts encoding related JR/MBP, while Ptc-MIR482-1448 miRNAs target transcripts encoding disease resistance proteins. Both pathways may affect pathogen defense responses [44,61]. Co-transcription of similar or identical miRNAs has been proposed to have a dosage effect on target expression [29]. Co-expression of different miRNAs may serve to increase the efficiency of the regulatory process. Whereas different miRNAs have been shown to bind a single mRNA target in animal systems to cooperatively control its expression [19,62], only three Ath-MIR859-774 targets were predicted to be recognized by both miRNAs. This result might be biased due to the restrictive criteria used in plants to predict targets, in contrast to animal genomes [42].
MIRNA genes are proposed to originate from the duplication of a target gene [6,15,56,63]. In the case of polycistronic nonhomologous MIRNA precursors, we could hypothesize that the duplication of a single target locus may have led to the selection of two divergent 'proto-miRNA' regions targeting other members of the family. An alternative is the duplication of an overlapping region between two clustered target genes, leading to the selection of miRNAs that target both clustered ancestral genes. Indeed, predicted targets of tandem polycistronic non-homologous MIRNA precursors are often themselves clustered (Additional data file 4), notably the F-box proteins targeted by Ath-MIR859-774 and the laccases targeted by Ath-MIR397b-857 (37 clustered F-box proteins and 4 clustered laccases).
Our results show that plant genomes generally contain fewer clustered or polycistronic miRNAs than animal genomes. Indeed, approximately 20% of total plant miRNAs are clustered, whereas in animals they represent approximately 50% using a similar criterion (that is, cluster size up to 10 kb) [18]. In animals, the Drosha complex specifically catalyzes maturation of long pri-MIRNAs, including the numerous polycistronic clusters, into approximately 70-nucleotide pre-MIRNA hairpins [6]. In plants, however, a Drosha-like enzyme is lacking. We have shown that the processing of at least one Arabidopsis polycistronic MIRNA is DCL1-dependent, similar to most non-polycistronic MIRNA precursors. We can speculate that the occurrence of a single-step maturation process of polycistronic precursors in plants may not be functionally equivalent to the two-step process existing in animals.

Conclusions
In contrast to plants, clusters of miRNAs are frequently present in animal genomes. Our comparative genomic analysis in three model plants (rice, poplar and Arabidopsis), however, has demonstrated the presence of several clusters containing two to eight miRNA species. Certain ancestral miRNA clusters appeared before the divergence of monocot and dicot lineages, and showed differential expansions in plants. Specific miRNA clusters (such as those coding for miR395, miR169 and miR166) are highly conserved. Interestingly, other clusters comprise functional miRNAs with unrelated sequences (non-homologous miRNAs) and are expressed as single transcriptional units, defining a novel class of plant pri-MIRNA genes. These polycistronic nonhomologous miRNAs regulate related target genes and are processed by a DCL1-dependent pathway. This mechanism paves the way for using polycistronic MIRNA precursors as a new molecular tool in plants to simultaneously express artificial miRNAs [64] that control the expression of different genes.

Plant genotypes and growth conditions
The wild-type Columbia (Col-0) ecotype of A. thaliana was used, as well as a dcl1-9 mutant backcrossed five times to Col-0 [51,56]. All plants were grown in long day conditions (16-h light/8-h dark photoperiod) at 23°C. Inflorescences, stems and cauline leaves, or rosette leaves were collected from 3-week-old greenhouse-grown plants. Roots were collected from seedlings grown for 3 weeks in vitro on 1/2 Murashige and Skoog (MS) medium (Sigma, Lyon, France) supplemented with 1% sucrose (Sigma, Lyon, France).

Bioinformatic analysis
Arabidopsis, poplar and rice miRNA sequences (mature and precursor) were downloaded from the microRNA Registry version 13.0 [65]. miRNA coordinates, chromosome locations and DNA strand orientation were retrieved from the microRNA Registry.
MIRNA genes were sorted by their chromosome locations and coordinates to identify miRNA clusters. The distance between two consecutive MIRNA loci was calculated by subtracting the end coordinate of the upstream pre-MIRNA (that is, hairpin) from the start coordinate of the downstream pre-MIRNA. MIRNA loci separated by less than 1, 3 or 10 kb were considered to define the best candidates for polycistronic clusters, clusters under the stringent criterion, and clusters under the non-stringent criterion, respectively. The DNA strand containing the miRNA sequence was considered in these analyses.
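The clustering procedure described above can be sketched in a few lines of Python. This is a minimal illustration rather than the script used in the study: the Locus record, the find_clusters function, and the example coordinates are hypothetical, and the same_strand option is an assumption about how strand information might be applied.

```python
from collections import namedtuple

# Hypothetical record for one pre-MIRNA hairpin (coordinates in bp).
Locus = namedtuple("Locus", "name chrom start end strand")


def find_clusters(loci, max_gap, same_strand=False):
    """Group consecutive hairpins separated by less than max_gap bp.

    The gap is the start of the downstream hairpin minus the end of the
    upstream hairpin. Only groups of two or more loci are returned.
    """
    clusters = []
    current = []
    for locus in sorted(loci, key=lambda l: (l.chrom, l.start)):
        if current:
            prev = current[-1]
            linked = (
                locus.chrom == prev.chrom
                and locus.start - prev.end < max_gap
                and (not same_strand or locus.strand == prev.strand)
            )
            if not linked:
                if len(current) > 1:
                    clusters.append(current)
                current = []
        current.append(locus)
    if len(current) > 1:
        clusters.append(current)
    return clusters


# Invented coordinates, for illustration only.
example = [
    Locus("MIR-a", "chr1", 1000, 1100, "+"),
    Locus("MIR-b", "chr1", 1300, 1400, "+"),
    Locus("MIR-c", "chr1", 4000, 4100, "-"),
    Locus("MIR-d", "chr2", 1200, 1300, "+"),
]
for threshold in (1000, 3000, 10000):
    names = [[l.name for l in c] for c in find_clusters(example, threshold)]
    print(threshold, names)
```

Running the same function with the three thresholds mirrors the 1-, 3- and 10-kb criteria; passing same_strand=True restricts clusters to hairpins on one DNA strand.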
Secondary structures were predicted using the mfold program [66] with default parameters [67] and a window size between 1 and 2 kb depending on MIRNA precursors.

Conservation analysis of miRNA clusters between plant genomes
Conservation between selected clustered miRNAs in Arabidopsis, poplar, sorghum and rice genomes, as well as candidate orthologous regions, were determined using Genome VISTA [36,37]. The query sequence (1 to 1.5 kb depending on the cluster) was anchored on the reference genome by local alignment matches and then globally aligned to candidate regions in the different selected genomes using the AVID program [68,69]. Alignments were then displayed with the VISTA graphic server [70]. Identified syntenic regions were manually inspected to identify and annotate orthologous miRNA clusters.

Northern blot analysis of small RNA expression
Tissues were frozen in liquid nitrogen, ground to a fine powder with a mortar and pestle, and then homogenized in TRI-Reagent® (Sigma, Lyon, France) supplemented with β-mercaptoethanol. [71]. Blots were hybridized with [γ-32P]ATP end-labeled oligonucleotides (20 pmoles) complementary to miRNAs, and at the same time with an end-labeled oligonucleotide U6 RNA probe as loading control.

Analysis of gene expression by RT-PCR
Total RNAs were extracted using the total RNA Isolation kit (Macherey-Nagel, Düren, Germany). Real-time RT-PCR was performed on an Eppendorf Mastercycler® realplex real-time PCR system (Eppendorf, Hamburg, Germany) using FastStart Universal SYBR Green Master Mix (Rox) from Roche Applied Science (Meylan, France). Technical triplicates were done for each datapoint, and two independent biological replicates (per condition and/or transgenic line) were assayed. Normalization was done with averaged reference genes TAIR:At1g13320, TAIR:At4g26410, and TAIR:At5g15710 [72], which were systematically validated under our experimental conditions using geNorm software [73].
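Normalization against several averaged reference genes can be illustrated with a short Python sketch. The text does not state the exact formula used, so the comparative Ct (2^-ΔΔCt) calculation below, the assumption of 100% amplification efficiency, the function name relative_expression, and the Ct values are all illustrative assumptions.

```python
from statistics import mean


def relative_expression(ct_target, ct_refs, ct_target_ctrl, ct_refs_ctrl):
    """Fold change of a target in a sample relative to a control.

    Assumptions: comparative Ct method, amplification efficiency of 2,
    and reference genes combined by averaging their Ct values.
    """
    delta_sample = ct_target - mean(ct_refs)
    delta_control = ct_target_ctrl - mean(ct_refs_ctrl)
    return 2 ** -(delta_sample - delta_control)


# Invented Ct values for one target and three reference genes.
fold = relative_expression(
    ct_target=26.0, ct_refs=[20.0, 21.0, 22.0],
    ct_target_ctrl=25.0, ct_refs_ctrl=[20.0, 21.0, 22.0],
)
print(fold)  # 0.5, that is, a two-fold reduction relative to the control
```

Averaging Ct values is equivalent to taking the geometric mean of the reference quantities when the efficiency is 2, which is how multi-reference normalization is commonly expressed.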

Cloning and transgenic plants
First, the Ath-MIR859-774 pri-MIRNA was amplified by RT-PCR from seedling cDNA and cloned into the pCR8®/GW/TOPO® TA Cloning® vector (Invitrogen, Paisley, UK). The construct was then transferred to the destination vector pEarlyGate103 [74] using the LR recombination kit (Invitrogen, Paisley, UK). These constructs (based on the 35S-CaMV promoter) were used to transform A. thaliana plants by floral dipping [75]. Transgenic plants were selected in the T1 generation by spraying seedlings with Basta® solution (120 mg/L glufosinate ammonium; Bayer CropScience, Monheim am Rhein, Germany) successively at 12, 14, and 16 days after germination. Basta-resistant plantlets were then tested for transgene expression by real-time RT-PCR as described above. Since amplification across the successive hairpin regions of the Ath-MIR859-774 pri-MIRNA was not efficient and quantitative enough for real-time RT-PCR analyses, a GFP mRNA present 3' of the pEarlyGate103 vector cloning site, for which efficient and specific primers were available (Additional data file 5), was used as a 3' tag to analyze transgene expression.

Authors' contributions
FM carried out the molecular genetic studies. AB conceived the study and carried out the bioinformatic analyses. MC drafted the manuscript and participated in its coordination. FF conceived the study, designed experiments and wrote the manuscript. All authors read and approved the final manuscript.

Additional data files
The following additional data are available with the online version of this paper: a table listing the conserved and nonconserved miRNAs in Arabidopsis, rice and poplar genomes (Additional data file 1); a table providing a detailed list of clustered miRNAs in Arabidopsis, rice and poplar genomes (Additional data file 2); a PDF including Figures S1 to S4 (Additional data file 3); a table providing a detailed list of all targets predicted for Arabidopsis non-homologous polycistronic miRNA clusters (Additional data file 4); a table listing primers used in this study (Additional data file 5).
Additional data file 3, Figures S1 to S4. Figure S1: secondary structures of the rice osa-MIR395a-g and osa-MIR395h-l,y putative polycistronic homologous miRNA clusters. Figure S2: secondary structures of poplar and rice putative polycistronic non-homologous miRNA clusters. Figure S3: secondary structures of four Arabidopsis non-homologous polycistronic miRNA clusters: Ath-MIR397b-857, ath-MIR842-846, ath-MIR850-863, and ath-MIR851-771. Figure S4: