- Open Access
Identification of conserved gene structures and carboxy-terminal motifs in the Myb gene family of Arabidopsis and Oryza sativa L. ssp. indica
© Jiang et al.; licensee BioMed Central Ltd. 2004
- Received: 22 December 2003
- Accepted: 29 May 2004
- Published: 29 June 2004
Myb proteins contain a conserved DNA-binding domain composed of one to four repeat motifs (referred to as R0R1R2R3); each repeat is approximately 50 amino acids in length, with regularly spaced tryptophan residues. Although the Myb proteins comprise one of the largest families of transcription factors in plants, little is known about the functions of most Myb genes. Here we use computational techniques to classify Myb genes on the basis of sequence similarity and gene structure, and to identify possible functional relationships among subgroups of Myb genes from Arabidopsis and rice (Oryza sativa L. ssp. indica).
This study analyzed 130 Myb genes from Arabidopsis and 85 from rice. The collected Myb proteins were clustered into subgroups based on sequence similarity and phylogeny. Interestingly, the exon-intron structure differed between subgroups, but was conserved in the same subgroup. Moreover, the Myb domains contained a significant excess of phase 1 and 2 introns, as well as an excess of nonsymmetric exons. Conserved motifs were detected in carboxy-terminal coding regions of Myb genes within subgroups. In contrast, no common regulatory motifs were identified in the noncoding regions. Additionally, some Myb genes with similar functions were clustered in the same subgroups.
The distribution of introns in the phylogenetic tree suggests that Myb domains originally were compact in size; introns were inserted and the splicing sites conserved during evolution. Conserved motifs identified in the carboxy-terminal regions are specific for Myb genes, and the identified Myb gene subgroups may reflect functional conservation.
Regulation of gene expression at the level of transcription controls many important biological processes in a cell or organism. The process of transcription recruits a number of different transcription factors, which can be activators, repressors, or both. Genome-wide comparisons have revealed the diversity in the regulation of transcription during evolution. With the completion of Arabidopsis genome sequencing, 5% of its genome was found to encode more than 1,500 transcription factors. On the basis of sequence similarities, transcription factors have been classified into families. In plants, Myb factors comprise one of the largest of these families.
Myb proteins are defined by a highly conserved DNA-binding domain (termed the Myb domain) composed of one to four helix-turn-helix motifs, which exist as tandem repeats (referred to as R0R1R2R3) in a single Myb protein. Each repeat is about 50 amino acids long, with regularly spaced tryptophan residues, and forms three α-helices. The third α-helix has a recognition role during DNA binding. The three-dimensional structure of the Myb domain in the Protein Data Bank (PDB) shows that the DNA recognition α-helix interacts with the DNA major groove. Moreover, previous research indicated that five amino-acid residues in the helix-turn-helix motif bind directly to the major groove. It should be noted that sequences outside the Myb domain are highly divergent.
The first Myb gene found was the v-Myb oncogene from the avian myeloblastosis virus. Subsequently, members of the Myb gene family were identified in diverse plants and animals [6, 7]. Previous research showed that animal genomes encode relatively few Myb genes. In contrast, flowering plants contain large numbers of Myb genes with very diverse structures and functions. To date, the precise functions of most plant Myb genes are unknown, although some well studied examples suggest important roles for Myb genes in regulation of secondary metabolism, cellular morphogenesis, pathogen resistance, and responses to growth regulators and stress [6, 8].
With the completion of Arabidopsis and rice (Oryza sativa L. ssp. indica) genome sequencing, the entire complement of Myb genes can be identified and described. However, a great deal of experimental work is required to determine the specific biological function of each gene. In Arabidopsis, R2R3 Myb gene-expression levels were determined in more than 20 different growth conditions; the results indicated that Myb genes were specifically expressed in different tissues and physiological conditions. To obtain further functional information on Arabidopsis Myb genes, a process of reverse genetics was applied to isolate insertion mutants. In all, 47 insertion mutants were detected in 36 distinct Myb genes by screening a total of 73 genes. However, none of the insertions gave rise to morphological phenotypes visible in soil-grown plants. The redundancy of Myb genes may diminish the efficiency of the molecular approach by complementation of function. No similar research has been done on rice Myb genes. Here, we have used phylogenetic and computational methods to classify Myb genes in subgroups. The resulting subgroup classification and the identification of putative functional conserved motifs may be useful for research on agronomic traits in rice, which is the most important crop for human consumption, and an important model for other cereal grains.
Expansion of Myb genes in Arabidopsis and rice
The Myb gene family has broadly expanded in plants during evolution. The amplification of the Myb gene family occurred before the divergence of monocots and dicots. In our study, 130 Myb genes were found in the Arabidopsis genome and 85 in Oryza sativa L. ssp. indica. The large size of this gene family was also confirmed in Zea mays and sorghum. Although most plant Myb genes contain only two repeats, three-repeat Mybs have been reported in Arabidopsis, maize and other plants. To date, only three-repeat Myb genes have been detected in animals, and it has been proposed that two-repeat Myb genes died out in the animal lineage. The broad presence of three-repeat Myb genes in diverse species indicates the antiquity of these genes. Using a homology search in the GenBank non-redundant database, two three-repeat Myb proteins (accession numbers NP_913483 and BAC79618) were identified in Oryza sativa L. ssp. japonica. However, no three-repeat Mybs were detected in rice (indica) in our study. This could be due to the incompleteness of the Oryza indica dataset.
Topology of Myb gene phylogeny
Interestingly, AtMyb33, 65, 101, 104 and At3g60460 were complementary, with few mismatches, to Arabidopsis Myb microRNA (noncoding RNA) miR159. The sequence is 21 nucleotides long and located in the 3' untranslated region (3' UTR) of all five genes. MicroRNAs are proposed to act as regulators of gene expression through interactions with complementary mRNA sequences. Importantly, these five Arabidopsis Mybs are located in subgroup G12 (Figure 1). This clustering provides additional evidence for the reliability of the subgroup designations in our analysis.
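The idea of a site being "complementary, with few mismatches" to a microRNA can be made concrete with a small sketch. The function names and the 21-nucleotide toy site below are hypothetical and are not the miR159 sequence or any Myb 3' UTR; only strict Watson-Crick pairing is counted, so G:U wobble pairs are treated as mismatches, which is a simplifying assumption.

```python
# Watson-Crick partners for RNA bases.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement_rna(seq: str) -> str:
    """Return the RNA reverse complement; DNA input (T) is converted to U first."""
    rna = seq.upper().replace("T", "U")
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def count_mismatches(mirna: str, target_site: str) -> int:
    """Count positions where the miRNA fails to pair with the target site.

    The two strands pair antiparallel, so a perfectly complementary miRNA
    equals the reverse complement of the target site.
    """
    if len(mirna) != len(target_site):
        raise ValueError("miRNA and target site must have the same length")
    perfect_partner = reverse_complement_rna(target_site)
    mirna_rna = mirna.upper().replace("T", "U")
    return sum(a != b for a, b in zip(mirna_rna, perfect_partner))


if __name__ == "__main__":
    toy_site = "AUGGCUAGCUAGGCUAACGUA"  # hypothetical 21-nt site
    toy_mirna = reverse_complement_rna(toy_site)
    print(count_mismatches(toy_mirna, toy_site))
```

A candidate target would then be any 21-nucleotide window whose mismatch count falls at or below a chosen tolerance; the tolerance itself is not specified in the text.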
Conserved gene structure within each subgroup supports the subgroup designations
Interestingly, the Myb gene splicing patterns constitute four major blocks in the Myb gene phylogeny (Figure 1). Block A lacks both introns 1 and 2. There are three splicing patterns in block B: subgroup G15 lacks both introns; subgroup G17 lacks only intron 2; and the remaining genes have altered splicing sites when compared to subgroup G13. Myb genes in block C have the major splicing pattern (81.2%) typified by G13, with some individual genes lacking intron 1 (9.4%), intron 2 (4.7%) or both introns (1.9%), or having minor splicing patterns (2.8%). In contrast, 58.2% of Myb genes in block D retain the typical splicing sites, and the rest lack only intron 1 (G02, G05 and half of the genes in G06).
In addition to splice-site locations, we also examined the position of splicing with respect to the open reading frame (ORF), that is, the intron phase. Each intron is assigned one of three phases according to where it interrupts the reading frame: a phase 0 intron lies between two codons (splicing occurs after the third nucleotide of a codon); a phase 1 intron lies after the first nucleotide of a codon; and a phase 2 intron lies after the second nucleotide of a codon. Figure 2a shows not only the conserved locations in the Myb-domain protein sequences but also the conserved phases of introns within the same subgroup. Moreover, there is a significant excess of phase 1 and 2 introns as well as an excess of nonsymmetric exons in Myb genes. Symmetric exons are exons that are flanked by introns of the same phase.
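As a minimal illustration of these definitions, the phase of an intron equals the number of coding nucleotides upstream of it modulo 3, and an internal exon is symmetric when the introns on either side share a phase. The function names and exon lengths in this sketch are hypothetical and do not come from any Myb gene model.

```python
from itertools import accumulate
from typing import List


def intron_phases(exon_cds_lengths: List[int]) -> List[int]:
    """Phase of the intron after each exon except the last.

    Phase = (coding nucleotides upstream of the intron) mod 3:
    0 means the intron lies between codons, 1 after the first
    nucleotide of a codon, 2 after the second.
    """
    cumulative = list(accumulate(exon_cds_lengths))
    return [total % 3 for total in cumulative[:-1]]


def symmetric_internal_exons(exon_cds_lengths: List[int]) -> List[bool]:
    """For each internal exon, report whether its flanking introns share a phase."""
    phases = intron_phases(exon_cds_lengths)
    return [phases[i - 1] == phases[i] for i in range(1, len(exon_cds_lengths) - 1)]


if __name__ == "__main__":
    toy_exons = [133, 130, 300]  # hypothetical coding lengths of three exons
    print(intron_phases(toy_exons))             # phases of introns 1 and 2
    print(symmetric_internal_exons(toy_exons))  # is the middle exon symmetric?
```

An internal exon is symmetric exactly when its coding length is a multiple of three, which is why symmetric exons can be inserted or removed without shifting the downstream reading frame.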
According to the intron-early theory, an excess of phase 0 introns and symmetric exons may facilitate exon shuffling by avoiding interruptions of the ORF, and thus could accelerate the rate of recombinational fusion and exchange of protein domains. Our results suggest that ancient Myb genes had a compact size without introns. During evolution, under some unknown mechanisms, introns were inserted into Myb domains and resulted in the observed splicing patterns. One splicing pattern remained unchanged in the subsequent gene amplification, resulting in the major splicing pattern typified by G13. Consistent with this, transposition of introns occurs very infrequently during evolution. This intron-gain model is consistent with previous results showing that numerous introns have been inserted into plants and retained in the genome. A similar approach to gene classification using intron/exon structure has been applied in the kinesin family and the bHLH family, and the results support a similar evolutionary pattern.
Although the splicing sites are conserved, the sizes of both introns vary greatly for different Myb genes. Approximately 85% of introns 1 and 2 of Myb genes are shorter than 300 bp in Arabidopsis and rice. Detailed information about the distribution of intron sizes of Myb genes is available in Additional data file 2. It is worth noting that intron 2 of the maize p1 and p2 orthologs is very large, around 5 kb. This intron-size information may be helpful for aligning expressed sequence tags (ESTs) with genomic sequences.
Strikingly, a 743-base fragment was found in intron 2 of maize P1-rr and P1-wr alleles, but not in P1-rw and p2 alleles. A 10-base direct repeat (5'-TGATTTTGAC-3') flanks this fragment. Interestingly, no Ac elements were found inserted in its adjacent 3.2-kb intronic region, but frequent Ac insertion occurred in other regions. This could be due to a particular chromatin structure refractory to Ac insertion in this region. BLAST search detected this fragment (94% identity over 723 base-pairs (bp)) at one other locus in the maize genome, but with a new flanking direct repeat (5'-GGATATCCA-3'). The GenBank accession number is AF466202 (located 84795..85689, 12 March 2002 version). These results are consistent with a previous proposal that some transposable elements could insert into the genome as intronic sequences, a mechanism that has been proposed for the insertion of nuclear introns.
Topology of Myb gene phylogeny may reflect functional conservation
Most plant Myb genes are thought to encode transcription factors that activate or repress target gene expression either independently or together with cofactors. The topology of Myb phylogeny (Figure 1) indicates that some Myb genes in the same subgroup have the same function and that some Myb genes with similar functions are located in the same subgroup. For example, two Myb orthologs, snapdragon PHAN gene and maize rs2 gene, are located in subgroup G18, and both are involved in organ development: PHAN has been shown to regulate the development of the proximo-distal axis and dorso-ventral asymmetry of lateral organs such as leaves, bracts and petal lobes, while the rs2 gene controls the development of maize lateral organ primordia by repressing expression of knox (knotted1-like homeobox) genes that are required for the normal initiation and development of lateral organs. In another example, the Arabidopsis genes GL1 and WER located in subgroup G07 are both involved in epidermal cell development: GL1 activates the GLABRA2 homeobox gene for trichome (hair cell) development in some parts of the leaf and in the stem [25, 26], while WER controls the formation of the root epidermis by regulating expression of the GLABRA2 gene.
Similar results are observed for Myb genes involved in the phenylpropanoid biosynthetic pathway (Figure 1, subgroups N08, N09, N14): C1, Pl, TT2, AN2, p1, p2, FaMyb1, PAP1 and PAP2. Each of these genes encodes a transcription factor that activates genes encoding enzymes for phenylpropanoid synthesis, except that the FaMyb1 transcription factor suppresses anthocyanin and flavonol accumulation. In addition, the functional conservation among some Myb genes during evolution could be observed in the cell-cycle protein CDC5 (Figure 1, G19). The CDC5 protein performs an essential function in cell-cycle control at G2/M, and also participates in pre-mRNA splicing.
The extent of the Myb R1, R2 and R3 repeats is based on similarity to the previously published consensus Myb repeat sequences. We used computational methods to identify additional conserved sequences downstream of the Myb repeats. A total of 18 motifs were identified in the carboxy-terminal regions, with each motif ranging in size from 9 to 32 amino acids (Figure 1). An exceptionally large domain (91 residues) was found in subgroup G19, gene CDC5, which is a conserved Myb paralog that originated prior to Myb-family amplification. In addition, Myb genes maize C1, Pl and AtMyb123 (TT2) in subgroup N08 have a nine-amino-acid motif previously reported [1, 37]. This motif has an e-value (4.5e-008) above our cutoff of 1e-10, so it was excluded from our analysis. Three other motifs identified by Stracke et al. were excluded from our analysis because of their high e-values (see Additional data file 3).
Consensus sequences of carboxy-terminal motifs
The average dN values of carboxy-terminal motifs and their flanking regions
Specificity of motifs to Myb genes
We wished to determine whether the detected motifs are specifically present in Myb genes, but are absent in non-Myb genes. Therefore, we used protein motif sequences as query sequences and performed Blastp searches in the Swiss-Prot database. For the motifs with size equal to or less than 15 amino acids, the homologous hits with 85% of the query motif length are all Myb domain containing proteins. For the motifs with size greater than 15 amino acids, the corresponding homologous hits with 70% of the query motif length contain Myb domains. The search result is described in Additional data file 4.
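The length-dependent criterion above can be written as a simple predicate. This sketch assumes that "85% of the query motif length" means the aligned portion of a hit spans at least that fraction of the motif; the function names are hypothetical, and integer arithmetic is used to avoid floating-point edge cases at the threshold.

```python
def required_coverage_percent(motif_length: int) -> int:
    """Minimum percentage of the motif a hit must span.

    Motifs of 15 residues or fewer need 85% coverage; longer motifs need 70%.
    """
    return 85 if motif_length <= 15 else 70


def hit_passes(motif_length: int, aligned_length: int) -> bool:
    """True when the aligned length meets the coverage threshold for this motif."""
    if motif_length <= 0:
        raise ValueError("motif_length must be positive")
    return aligned_length * 100 >= required_coverage_percent(motif_length) * motif_length


if __name__ == "__main__":
    print(hit_passes(12, 11))  # short motif, 11 of 12 residues aligned
    print(hit_passes(20, 13))  # long motif, 13 of 20 residues aligned
```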
We obtained similar results in EST searches. When translated into proteins, all 14 ESTs detected from an extension motif search also contain a Myb domain. Interestingly, we detected more ESTs from E1 than from the other extension motifs. Most probably this is because E1 is present in more Myb genes than the other motifs (Figure 1). The search result is described in Additional data file 5.
After comparing these two search results, we found that not all carboxy-terminal motifs detected homologous ESTs. This could be due to the low levels of expression of some Myb genes so that their EST sequences are not yet available. In some cases, these ESTs did not contain Myb domains; however, because the carboxy-terminal motifs are located downstream some distance from the Myb domains, the returned ESTs are probably too short to reach the Myb domains. However, alignment of ESTs with known Myb genes showed high identity not only in the motif sequence but also for considerable lengths in the flanking regions. This suggests that such ESTs are very likely from Myb genes.
We then checked each carboxy-terminal motif against the 258 Myb proteins and found that the motifs are subgroup specific. For example, motif 1 from subgroup G03 was not detected in other subgroups.
Identification of regulatory elements in noncoding regions
In addition to the carboxy-terminal motifs detected in the predicted Myb proteins, we wanted to test whether any conserved DNA sequence motifs could be identified among the Myb gene subgroups. We applied motif-searching tools to detect conserved regulatory elements in the promoter region plus 5' UTR of the Myb genes and in intron regions. In contrast to the carboxy-terminal coding regions, no conserved DNA sequence motifs were identified in the Myb gene noncoding regions. This could be because the Myb genes clustered in each subgroup are probably not orthologs or paralogs. In contrast, within the subgroup N14 (Figure 1) containing the maize p1 and p2 genes, and orthologs/paralogs from sorghum and rice, a highly conserved arrangement of TATA-box, transcription start site sequences, and 5' UTR CA-box was found (data not shown). Otherwise, no significant regulatory elements were detected in noncoding regions of other Myb genes. However, it should be noted that segments of intron sequence closer to flanking exons are significantly more conserved than interior intron sequence. It has been reported that this level of intron sequence conservation may have a functional role in gene regulation. Our results suggest that it will be difficult to directly identify regulatory motifs in noncoding regions using only existing computational techniques. The chance of identifying regulatory elements will be higher among orthologs/paralogs. Possibly, the identification of co-regulated genes using microarray analysis will assist in the identification of common regulatory elements.
The expansion of Myb genes in plants makes it one of the largest families of transcription factors known to date. However, the specific roles of Myb genes in regulating plant traits are still unclear. Here, we used overall sequence similarity to cluster Myb genes from Arabidopsis and rice into 42 subgroups. The subgroup designations were well supported by sequence similarity and exon-intron structure. In one subgroup, significant complementarity to a specific miRNA was also observed. Furthermore, we found that the splicing sites and the phase of introns are conserved in Myb domains within the same subgroup, but differ between subgroups. The phylogenetic topology of splicing patterns suggested that Myb domains may originally have been compact in size, and that introns were inserted and remained in place during evolution. Computational searches were used to identify conserved carboxy-terminal motifs present in the different subgroups. These motifs appear to be specific characteristics of the Myb subgroups. In contrast to the carboxy-terminal motifs specifically present in Myb genes, no conserved regulatory elements were identified in the noncoding regions.
Myb proteins used in the analysis
At the initiation of our project, the international rice genome sequencing project was not finished. Two finished rice genomes - one by Monsanto, the other by Syngenta (Oryza sativa L. ssp. japonica) - were not available to the public. Only the rice (indica) genome sequenced by Beijing Genomics Institute was publicly available. The sequence-quality assessments through sequence-tagged site (STS) markers, UniGene clusters and nonredundant cDNAs showed that 92% of the functional sequences that encode genes and their immediate regulatory elements were present in the assembled sequences. Therefore, we chose subspecies indica rather than japonica for this study.
Rice genome sequences (scaffold dataset) were obtained from Beijing Genomics Institute. FGeneSH has been used successfully to predict genes in rice, and GenScan was used together with it to predict genes by taking rice genomic sequences as input. The two prediction results were combined as the complement of rice proteins. We performed Blastp and HMMER searches to identify Myb genes from this rice protein dataset. For Blastp, we used a set of Myb R2R3 domains as query sequences. For HMMER, we used the Myb profile from Pfam. We parsed and combined the results of both searches, and obtained the final complement of rice Myb proteins with manual inspection of each sequence to confirm the identification of bona fide Myb genes. In the end, 85 typical Myb genes with complete R2R3 domains (one R0R1R2R3 and 84 R2R3) and 28 partial Myb genes were detected in the rice genome. Partial Myb genes contain a segment similar to a single Myb repeat or to part of one. The sequences of rice Myb genes are listed in Additional data file 6.
The complement of Arabidopsis proteins from GenBank was used to identify Myb proteins with complete R2R3 domains. The same methods as above were applied. We obtained 130 typical Myb proteins containing complete R2R3 domains (one R0R1R2R3 Myb, five R1R2R3 proteins and 124 R2R3 proteins) and 11 partial Myb proteins. The results are consistent with previous findings.
To collect reference information on Myb gene functions, we used Blastp search against the nonredundant dataset in GenBank. The search yielded 43 plant Myb proteins with complete R2R3 domains; for most of these, some experimental information regarding functions or expression patterns was deposited by individual researchers.
Construction of phylogeny and subgroup designations
For phylogenetic analysis, the above 258 Myb proteins (130 Arabidopsis, 85 rice and 43 from various other plants) with complete R2R3 domains were included. The sequences were aligned by ClustalX (version 1.81). The phylogenetic tree was constructed by the neighbor-joining method using MEGA version 2.0, with the setting of pairwise gap deletion and Poisson distance. Bootstrapping (1,000 replicates) was performed to evaluate the degree of support for a particular grouping in the neighbor-joining analysis. To enable the identification of motifs in the carboxy-terminal regions within each subgroup, we did not employ complete gap deletion, as this may tend to exclude the contribution of carboxy-terminal residues because of their high divergence. The p-distance represents the simplest sort of genetic distance calculation and can be highly biased, so it was not used. In addition, attempts to construct the phylogeny using only the carboxy-terminal regions were unsuccessful because of their high divergence. Therefore, we used the complete Myb proteins in clustering.
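The difference between the p-distance and the Poisson distance can be sketched as follows. The p-distance is the proportion of differing sites; the Poisson correction, d = -ln(1 - p), accounts for multiple substitutions at the same site. With pairwise gap deletion, only columns where neither of the two sequences has a gap are compared. The function names and toy aligned sequences are hypothetical; this is a sketch of the standard formulas, not of MEGA's implementation.

```python
import math


def p_distance(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Proportion of differing sites, using pairwise gap deletion."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns where neither sequence of this pair has a gap.
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != gap and b != gap]
    if not pairs:
        raise ValueError("no comparable sites after gap deletion")
    return sum(a != b for a, b in pairs) / len(pairs)


def poisson_distance(seq_a: str, seq_b: str) -> float:
    """Poisson-corrected distance: d = -ln(1 - p)."""
    p = p_distance(seq_a, seq_b)
    if p >= 1.0:
        raise ValueError("sequences are saturated; distance is undefined")
    return -math.log(1.0 - p)


if __name__ == "__main__":
    a, b = "MKW-LT", "MRWALS"  # hypothetical aligned fragments
    print(p_distance(a, b), poisson_distance(a, b))
```

For any pair of sequences that differ, the corrected distance exceeds the p-distance, and the gap widens as sequences diverge, which is why the uncorrected measure understates distances between highly divergent proteins.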
Three trees were constructed with the above settings, then taxa were classified into subgroups based on the topology of the phylogeny. Tree I used the 43 landmark Myb proteins with 130 Arabidopsis Myb proteins. The clustering result is consistent with the previous report; that is, the taxa that were clustered as subgroups in Stracke et al.'s findings are located within a subgroup in tree I. Tree II replaced the 130 Arabidopsis Myb proteins in tree I with 85 rice Myb proteins. Tree III used the total 258 Myb proteins. We found that the clustering result of Myb proteins from Arabidopsis, rice and the landmark Myb proteins was consistent among all three trees. Therefore, we used tree III as the representative in this study.
Within each subgroup, motifs were detected using MEME with the following parameter settings: the distribution of motifs: zero or one per sequence; maximum number of motifs to find: 16; minimum width of motif: 6; maximum width of motif: 117, in order to identify long R2R3 domains; minimum number of sites for each motif: the number of sequences, i.e., the motif must be present in all members within the same subgroup. Other options used the default values. Only motifs with e-value <= 1e-10 were kept for further analysis.
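These settings could be expressed as a command line roughly as sketched below. The flag names (`-protein`, `-mod zoops`, `-nmotifs`, `-minw`, `-maxw`, `-minsites`, `-oc`) are those of the current MEME Suite command-line tool and are an assumption here: the exact invocation and MEME version used in this study are not given, and the helper names, file name and output directory are hypothetical. The sketch only builds the argument list and does not run MEME.

```python
from typing import List

E_VALUE_CUTOFF = 1e-10  # motifs above this e-value are discarded


def meme_command(fasta_path: str, n_sequences: int, out_dir: str = "meme_out") -> List[str]:
    """Build a MEME argument list mirroring the settings described in the text.

    Assumes current MEME Suite flag names; n_sequences is the number of
    proteins in the subgroup, so every member must contain each motif.
    """
    return [
        "meme", fasta_path,
        "-protein",                      # protein sequence alphabet
        "-oc", out_dir,                  # output directory (overwritten if present)
        "-mod", "zoops",                 # zero or one occurrence per sequence
        "-nmotifs", "16",                # maximum number of motifs to find
        "-minw", "6",                    # minimum motif width
        "-maxw", "117",                  # maximum motif width
        "-minsites", str(n_sequences),   # motif must occur in all members
    ]


def keep_motif(e_value: float) -> bool:
    """Apply the e-value cutoff used for further analysis."""
    return e_value <= E_VALUE_CUTOFF


if __name__ == "__main__":
    print(" ".join(meme_command("subgroup_G03.fasta", 12)))
```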
Motif analysis: similarity scores and nonsynonymous (dN) substitution
To confirm the reliability of the 38 motif candidates identified by MEME, we used PlotSimilarity from the GCG package (Genetics Computer Group, Inc.) to calculate the similarity score of each motif plus its 10-residue flanking fragments (protein sequences). There were 33 motifs with values above the average score in the motif region and below the average score in the flanking regions, and these were tested further using the dN values. The program YN00 from the PAML package was applied to analyze the conservation of each motif plus its flanking regions (coding DNA sequences). The frequency of synonymous substitution is too high to detect the conservation. Therefore, the nonsynonymous substitution value was calculated. Low dN values indicate conservation whereas high dN values indicate divergence. We detected 18 motifs with dN < 0.5 in the motif region and dN > 1 in flanking regions; these 18 motifs included 3 extension and 15 carboxy-terminal motifs.
Originally, MEME identified 38 motif candidates with e-value <= 1e-10. Then five motifs were removed in the similarity score test. Later, 15 motifs were discarded in the nonsynonymous substitution test. This result suggests that the similarity score test is not sufficiently powerful to determine the reliability of motif candidates, and may be safely ignored in the future.
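The three successive filters (e-value, similarity score, dN) can be summarized as a small cascade. The record fields, function names and toy values below are hypothetical; only the thresholds (e-value <= 1e-10, motif score above and flank score below the average, dN < 0.5 in the motif and > 1 in the flanks) are taken from the text.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class MotifCandidate:
    name: str
    e_value: float            # MEME e-value
    motif_similarity: float   # similarity score inside the motif
    flank_similarity: float   # similarity score in the flanking regions
    motif_dn: float           # average dN inside the motif
    flank_dn: float           # average dN in the flanking regions


def passes_evalue(c: MotifCandidate) -> bool:
    return c.e_value <= 1e-10


def passes_similarity(c: MotifCandidate, average_score: float) -> bool:
    # Motif must score above the average, flanks below it.
    return c.motif_similarity > average_score and c.flank_similarity < average_score


def passes_dn(c: MotifCandidate) -> bool:
    # Conserved motif (low dN) embedded in divergent flanks (high dN).
    return c.motif_dn < 0.5 and c.flank_dn > 1.0


def filter_candidates(candidates: List[MotifCandidate], average_score: float) -> List[str]:
    """Return names of candidates that survive all three tests, in input order."""
    return [
        c.name
        for c in candidates
        if passes_evalue(c) and passes_similarity(c, average_score) and passes_dn(c)
    ]


if __name__ == "__main__":
    toy = [
        MotifCandidate("kept", 1e-20, 0.9, 0.3, 0.2, 1.4),
        MotifCandidate("fails_dn", 1e-20, 0.9, 0.3, 0.8, 1.4),
    ]
    print(filter_candidates(toy, average_score=0.5))
```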
Specificity of motifs
To test whether the carboxy-terminal motifs are specific to Myb genes, motif sequences were used to perform homology searches in the Swiss-Prot database and the EST data set from GenBank. The latter can also provide information on the expression pattern of Myb genes. Low-complexity filtering was turned off in both homology searches for optimal short-sequence searching. In addition, in the EST search, for motifs shorter than 15 residues, the 10 residues immediately downstream were appended and this elongated sequence was used as the query. The corresponding Myb R2 repeats were used in a tblastn EST search as an internal positive control.
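The query-extension rule for short motifs can be sketched as below. The function name, the zero-based coordinate convention and the toy protein are hypothetical; the only values taken from the text are the 15-residue threshold and the 10-residue downstream extension. Truncating the extension at the end of the protein is an added assumption.

```python
def est_query(protein: str, motif_start: int, motif_length: int,
              min_length: int = 15, extension: int = 10) -> str:
    """Return the EST query sequence for a motif.

    Motifs shorter than min_length residues are extended by up to
    `extension` downstream residues (fewer if the protein ends first).
    motif_start is a zero-based index into the protein sequence.
    """
    if motif_start < 0 or motif_start + motif_length > len(protein):
        raise ValueError("motif lies outside the protein sequence")
    end = motif_start + motif_length
    if motif_length < min_length:
        end = min(len(protein), end + extension)
    return protein[motif_start:end]


if __name__ == "__main__":
    toy_protein = "ACDEFGHIKLMNPQRSTVWYACDEFGHIKL"  # hypothetical 30-residue sequence
    print(est_query(toy_protein, 2, 9))   # short motif, extended downstream
    print(est_query(toy_protein, 0, 16))  # long motif, used as is
```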
The following additional files are available: Additional data file 1 gives the mapping relations of subgroups; Additional data file 2 gives the distribution of intron sizes in Myb genes; Additional data file 3 gives the previously identified carboxy-terminal motifs not included in this study; Additional data file 4 gives the Blastp search results for homologous Myb genes; Additional data file 5 gives the homologous EST search results; Additional data file 6 (a .FAS file) gives the sequences of all rice Myb genes; Additional data file 7 is a tree of the relationship of 130 Arabidopsis, 85 rice Myb proteins and 43 'landmark' Myb proteins; Additional data file 8 gives the intron-exon structure of all R2R3 domains; Additional data file 9 gives similarity scores and average dN values for all motifs; Additional data file 10 gives the distribution of pairwise dN values of motifs and flanking regions.
- Stracke R, Werber M, Weisshaar B: The R2R3-MYB gene family in Arabidopsis thaliana. Curr Opin Plant Biol. 2001, 4: 447-456. 10.1016/S1369-5266(00)00199-0.PubMedView ArticleGoogle Scholar
- Riechmann JL, Heard J, Martin G, Reuber L, Jiang C, Keddie J, Adam L, Pineda O, Ratcliffe OJ, Samaha RR, et al: Arabidopsis transcription factors: genome-wide comparative analysis among eukaryotes. Science. 2000, 290: 2105-2110. 10.1126/science.290.5499.2105.PubMedView ArticleGoogle Scholar
- Rabinowicz PD, Braun EL, Wolfe AD, Bowen B, Grotewold E: Maize R2R3 Myb genes: sequence analysis reveals amplification in the higher plants. Genetics. 1999, 153: 427-444.PubMedPubMed CentralGoogle Scholar
- Ogata K, Morikawa S, Nakamura H, Sekikawa A, Inoue T, Kanai H, Sarai A, Ishii S, Nishimura Y: Solution structure of a specific DNA complex of the Myb DNA-binding domain with cooperative recognition helices. Cell. 1994, 79: 639-648.PubMedView ArticleGoogle Scholar
- Klempnauer KH, Gonda TJ, Bishop JM: Nucleotide sequence of the retroviral leukemia gene v-myb and its cellular progenitor c-myb: the architecture of a transduced oncogene. Cell. 1982, 31: 453-463. 10.1016/0092-8674(82)90138-6.PubMedView ArticleGoogle Scholar
- Martin C, Paz-Ares J: MYB transcription factors in plants. Trends Genet. 1997, 13: 67-73. 10.1016/S0168-9525(96)10049-4.PubMedView ArticleGoogle Scholar
- Rosinski JA, Atchley WR: Molecular evolution of the Myb family of transcription factors: evidence for polyphyletic origin. J Mol Evol. 1998, 46: 74-83.PubMedView ArticleGoogle Scholar
- Petroni K, Tonelli C, Paz-Ares J: The MYB transcription factor family: from maize to Arabidopsis. Maydica. 2002, 47: 213-232.Google Scholar
- Yu J, Hu S, Wang J, Wong GK, Li S, Liu B, Deng Y, Dai L, Zhou Y, Zhang X, et al: A draft sequence of the rice genome (Oryza sativa L. ssp. indica). Science. 2002, 296: 79-92. 10.1126/science.1068037.PubMedView ArticleGoogle Scholar
- Kranz HD, Denekamp M, Greco R, Jin H, Leyva A, Meissner RC, Petroni K, Urzainqui A, Bevan M, Martin C, et al: Towards functional characterisation of the members of the R2R3-MYB gene family from Arabidopsis thaliana. Plant J. 1998, 16: 263-276. 10.1046/j.1365-313x.1998.00278.x.PubMedView ArticleGoogle Scholar
- Meissner RC, Jin H, Cominelli E, Denekamp M, Fuertes A, Greco R, Kranz HD, Penfield S, Petroni K, Urzainqui A, et al: Function search in a large transcription factor gene family in Arabidopsis: assessing the potential of reverse genetics to identify insertional mutations in R2R3 MYB genes. Plant Cell. 1999, 11: 1827-1840. 10.1105/tpc.11.10.1827.PubMedPubMed CentralView ArticleGoogle Scholar
- Jiang C, Gu J, Chopra S, Gu X, Peterson T: Ordered origin of the typical two- and three-repeated Myb genes. Gene. 2004, 326: 13-22. 10.1016/j.gene.2003.09.049.PubMedView ArticleGoogle Scholar
- Braun EL, Grotewold E: Newly discovered plant c-myb-like genes rewrite the evolution of the plant myb gene family. Plant Physiol. 1999, 121: 21-24. 10.1104/pp.121.1.21.PubMedPubMed CentralView ArticleGoogle Scholar
- Supplemental materials: index of /~czjiang/Myb. [http://www.public.iastate.edu/~czjiang/Myb]
- Rhoades MW, Reinhart BJ, Lim LP, Burge CB, Bartel B, Bartel DP: Prediction of plant microRNA targets. Cell. 2002, 110: 513-520. 10.1016/S0092-8674(02)00863-2.PubMedView ArticleGoogle Scholar
- Gilbert W: The exon theory of genes. Cold Spring Harb Symp Quant Biol. 1987, 52: 901-905.PubMedView ArticleGoogle Scholar
- Fedorova L, Fedorov A: Introns in gene evolution. Genetica. 2003, 118: 123-131. 10.1023/A:1024145407467.PubMedView ArticleGoogle Scholar
- Rogozin IB, Wolf YI, Sorokin AV, Mirkin BG, Koonin EV: Remarkable interkingdom conservation of intron positions and massive, lineage-specific intron loss and gain in eukaryotic evolution. Curr Biol. 2003, 13: 1512-1517. 10.1016/S0960-9822(03)00558-X.PubMedView ArticleGoogle Scholar
- Lawrence CJ, Malmberg RL, Muszynski MG, Dawe RK: Maximum likelihood methods reveal conservation of function among closely related kinesin families. J Mol Evol. 2002, 54: 42-53.PubMedView ArticleGoogle Scholar
- Toledo-Ortiz G, Huq E, Quail PH: The Arabidopsis basic/helix-loop-helix transcription factor family. Plant Cell. 2003, 15: 1749-1770. 10.1105/tpc.013839.PubMedPubMed CentralView ArticleGoogle Scholar
- Athma P, Grotewold E, Peterson T: Insertional mutagenesis of the maize P gene by intragenic transposition of Ac. Genetics. 1992, 131: 199-209.PubMedPubMed CentralGoogle Scholar
- Menssen A, Hohmann S, Martin W, Schnable PS, Peterson PA, Saedler H, Gierl A: The En/Spm transposable element of Zea mays contains splice sites at the termini generating a novel intron from a dSpm element in the A2 gene. EMBO J. 1990, 9: 3051-3057.PubMedPubMed CentralGoogle Scholar
- Waites R, Selvadurai HR, Oliver IR, Hudson A: The PHANTASTICA gene encodes a MYB transcription factor involved in growth and dorsoventrality of lateral organs in Antirrhinum. Cell. 1998, 93: 779-789. 10.1016/S0092-8674(00)81439-7.
- Tsiantis M, Schneeberger R, Golz JF, Freeling M, Langdale JA: The maize rough sheath2 gene and leaf development programs in monocot and dicot plants. Science. 1999, 284: 154-156. 10.1126/science.284.5411.154.
- Oppenheimer DG, Herman PL, Sivakumaran S, Esch J, Marks MD: A myb gene required for leaf trichome differentiation in Arabidopsis is expressed in stipules. Cell. 1991, 67: 483-493.
- Noda K, Glover BJ, Linstead P, Martin C: Flower colour intensity depends on specialized cell shape controlled by a Myb-related transcription factor. Nature. 1994, 369: 661-664. 10.1038/369661a0.
- Lee MM, Schiefelbein J: WEREWOLF, a MYB-related protein in Arabidopsis, is a position-dependent regulator of epidermal cell patterning. Cell. 1999, 99: 473-483. 10.1016/S0092-8674(00)81536-6.
- Paz-Ares J, Ghosal D, Wienand U, Peterson PA, Saedler H: The regulatory c1 locus of Zea mays encodes a protein with homology to myb proto-oncogene products and with structural similarities to transcriptional activators. EMBO J. 1987, 6: 3553-3558.
- Nesi N, Jond C, Debeaujon I, Caboche M, Lepiniec L: The Arabidopsis TT2 gene encodes an R2R3 MYB domain protein that acts as a key determinant for proanthocyanidin accumulation in developing seed. Plant Cell. 2001, 13: 2099-2114. 10.1105/tpc.13.9.2099.
- Quattrocchio F, Wing J, van der Woude K, Souer E, de Vetten N, Mol J, Koes R: Molecular analysis of the anthocyanin2 gene of petunia and its role in the evolution of flower color. Plant Cell. 1999, 11: 1433-1444. 10.1105/tpc.11.8.1433.
- Grotewold E, Athma P, Peterson T: Alternatively spliced products of the maize P gene encode proteins with homology to the DNA-binding domain of myb-like transcription factors. Proc Natl Acad Sci USA. 1991, 88: 4587-4591.
- Zhang P, Chopra S, Peterson T: A segmental gene duplication generated differentially expressed myb-homologous genes in maize. Plant Cell. 2000, 12: 2311-2322. 10.1105/tpc.12.12.2311.
- Aharoni A, De Vos CH, Wein M, Sun Z, Greco R, Kroon A, Mol JN, O'Connell AP: The strawberry FaMYB1 transcription factor suppresses anthocyanin and flavonol accumulation in transgenic tobacco. Plant J. 2001, 28: 319-332. 10.1046/j.1365-313X.2001.01154.x.
- Borevitz JO, Xia Y, Blount J, Dixon RA, Lamb C: Activation tagging identifies a conserved MYB regulator of phenylpropanoid biosynthesis. Plant Cell. 2000, 12: 2383-2394. 10.1105/tpc.12.12.2383.
- Burns CG, Ohi R, Krainer AR, Gould KL: Evidence that Myb-related CDC5 proteins are required for pre-mRNA splicing. Proc Natl Acad Sci USA. 1999, 96: 13789-13794. 10.1073/pnas.96.24.13789.
- Ogata K, Hojo H, Aimoto S, Nakai T, Nakamura H, Sarai A, Ishii S, Nishimura Y: Solution structure of a DNA-binding unit of Myb: a helix-turn-helix-related motif with conserved tryptophans forming a hydrophobic core. Proc Natl Acad Sci USA. 1992, 89: 6428-6432.
- Dias AP, Braun EL, McMullen MD, Grotewold E: Recently duplicated maize R2R3 Myb genes provide evidence for distinct mechanisms of evolutionary divergence after duplication. Plant Physiol. 2003, 131: 610-620. 10.1104/pp.012047.
- Hare MP, Palumbi SR: High intron sequence conservation across three mammalian orders suggests functional constraints. Mol Biol Evol. 2003, 20: 969-978. 10.1093/molbev/msg111.
- Rice GD. [http://btn.genomics.org.cn:8080/rice]
- Eddy SR: Hidden Markov models. Curr Opin Struct Biol. 1996, 6: 361-365. 10.1016/S0959-440X(96)80056-X.
- Kumar S, Tamura K, Jakobsen IB, Nei M: MEGA2: molecular evolutionary genetics analysis software. Bioinformatics. 2001, 17: 1244-1245. 10.1093/bioinformatics/17.12.1244.
- Bailey TL, Elkan C: Fitting a mixture model by expectation maximization to discover motifs in biopolymers. Proc Int Conf Intell Syst Mol Biol. 1994, 2: 28-36.
- Yang Z, Nielsen R: Estimating synonymous and nonsynonymous substitution rates under realistic evolutionary models. Mol Biol Evol. 2000, 17: 32-43.
- Joshi CP, Zhou H, Huang X, Chiang VL: Context sequences of translation initiation codon in plants. Plant Mol Biol. 1997, 35: 993-1001. 10.1023/A:1005816823636.
This article is published under license to BioMed Central Ltd. This is an Open Access article: verbatim copying and redistribution of this article are permitted in all media for any purpose, provided this notice is preserved along with the article's original URL.