Sequence and structure of Brassica rapa chromosome A3
- Jeong-Hwan Mun†1,
- Soo-Jin Kwon†1,
- Young-Joo Seol1,
- Jin A Kim1,
- Mina Jin1,
- Jung Sun Kim1,
- Myung-Ho Lim1,
- Soo-In Lee1,
- Joon Ki Hong1,
- Tae-Ho Park1,
- Sang-Choon Lee1,
- Beom-Jin Kim1,
- Mi-Suk Seo1,
- Seunghoon Baek1,
- Min-Jee Lee1,
- Ja Young Shin1,
- Jang-Ho Hahn1,
- Yoon-Jung Hwang2,
- Ki-Byung Lim2,
- Jee Young Park3,
- Jonghoon Lee3,
- Tae-Jin Yang3,
- Hee-Ju Yu4,
- Ik-Young Choi5,
- Beom-Soon Choi5,
- Su Ryun Choi6,
- Nirala Ramchiary6,
- Yong Pyo Lim6,
- Fiona Fraser7,
- Nizar Drou7,
- Eleni Soumpourou7,
- Martin Trick7,
- Ian Bancroft7,
- Andrew G Sharpe8,
- Isobel AP Parkin9,
- Jacqueline Batley10,
- Dave Edwards11 and
- Beom-Seok Park1
© Mun et al.; licensee BioMed Central Ltd. 2010
Received: 4 June 2010
Accepted: 27 September 2010
Published: 27 September 2010
The species Brassica rapa includes important vegetable and oil crops. It also serves as an excellent model system to study polyploidy-related genome evolution because of its paleohexaploid ancestry and its close evolutionary relationships with Arabidopsis thaliana and other Brassica species with larger genomes. Therefore, its genome sequence will be used to accelerate both basic research on genome evolution and applied research across the cultivated Brassica species.
We have determined and analyzed the sequence of B. rapa chromosome A3. We obtained 31.9 Mb of sequences, organized into nine contigs, which incorporated 348 overlapping BAC clones. Annotation revealed 7,058 protein-coding genes, with an average gene density of 4.6 kb per gene. Analysis of chromosome collinearity with the A. thaliana genome identified conserved synteny blocks encompassing the whole of the B. rapa chromosome A3 and sections of four A. thaliana chromosomes. The frequency of tandem duplication of genes differed between the conserved genome segments in B. rapa and A. thaliana, indicating differential rates of occurrence/retention of such duplicate copies of genes. Analysis of 'ancestral karyotype' genome building blocks enabled the development of a hypothetical model for the derivation of the B. rapa chromosome A3.
We report the near-complete chromosome sequence from a dicotyledonous crop species. This provides an example of the complexity of genome evolution following polyploidy. The high degree of contiguity afforded by the clone-by-clone approach provides a benchmark for the performance of whole genome shotgun approaches presently being applied in B. rapa and other species with complex genomes.
The Brassicaceae family includes approximately 3,700 species in 338 genera. The species, which include the widely studied Arabidopsis thaliana, have diverse characteristics and many are of agronomic importance as vegetables, condiments, fodder, and oil crops. Economically, Brassica species contribute approximately 10% of the world's vegetable crop produce and approximately 12% of the worldwide edible oil supplies. The tribe Brassiceae, which is one of 25 tribes in the Brassicaceae, consists of approximately 240 species and contains the genus Brassica. The cultivated Brassica species are B. rapa (which contains the Brassica A genome) and B. oleracea (C genome), which are grown mostly as vegetable cole crops, B. nigra (B genome) as a source of mustard condiment, and oil crops, mainly B. napus (a recently formed allotetraploid containing both A and C genomes), B. juncea (A and B genomes), and B. carinata (B and C genomes) as sources of canola oil. These genome relationships between the three diploid species and their pairwise allopolyploid derivative species have long been known, and are described by 'U's triangle'.
B. rapa is a major vegetable or oil crop in Asia and Europe, and has recently become a widely used model for the study of polyploid genome structure and evolution because it has the smallest genome (529 Mb) of the Brassica genus and, like all members of the tribe Brassiceae, has evolved from a hexaploid ancestor [4–6]. Our previous comparative genomic study revealed conserved linkage arrangements and collinear chromosome segments between B. rapa and A. thaliana, which diverged from a common ancestor approximately 13 to 17 million years ago. The B. rapa genome contains triplicated homoeologous counterparts of the corresponding segments of the A. thaliana genome due to triplication of the entire genome (whole genome triplication), which occurred approximately 11 to 12 million years ago. Furthermore, studies in B. napus, which was generated in the last 10,000 years, have demonstrated that overall genome structure is highly conserved compared to its progenitor species, B. rapa and B. oleracea, which diverged approximately 8 million years ago, but significantly diverged relative to A. thaliana at the sequence level [7, 8]. Thus, investigation of the B. rapa genome provides substantial opportunities to study the divergence of gene function and genome evolution associated with polyploidy, extensive duplication, and hybridization. In addition, access to a complete and high-resolution B. rapa genome will facilitate research on other Brassica crops with partially sequenced or larger genomes.
Despite the importance of Brassica crops in plant biology and world agriculture, none of the Brassica species have had their genomes fully sequenced. Cytogenetic analyses have shown that the B. rapa genome is organized into ten chromosomes, with genes concentrated in the euchromatic space and centromeric repeat sequences and rDNAs arranged as tandem arrays primarily in the heterochromatin [9, 10]. The individual mitotic metaphase chromosome size ranges from 2.1 to 5.6 μm, with a total chromosome length of 32.5 μm. An alternative cytogenetic map based on a pachytene DAPI (4',6-diamidino-2-phenylindole dihydrochloride) and fluorescent in situ hybridization (FISH) karyogram showed that the mean lengths of ten pachytene chromosomes ranged from 23.7 to 51.3 μm, with a total chromosome length of 385.3 μm. Thus, chromosomes in the meiotic prophase stage are approximately 12 times longer than those in the mitotic metaphase, and display a well-differentiated pattern of bright fluorescent heterochromatin segments. Sequencing of selected BAC clones has confirmed that the gene density in B. rapa is similar to that of A. thaliana, on the order of 1 gene per 3 to 4 kb. Each of the gene-rich BAC clones examined so far by FISH (> 100 BACs) was found to be localized to the visible euchromatic region of the genome. Concurrently, a whole-genome shotgun pilot sequencing of B. oleracea with 0.44-fold genome coverage generated sequences enriched in transposable elements [12, 13]. Taken together, these data strongly point to a tractable genome organization where the majority of the B. rapa euchromatic space (gene space) can be sequenced in a highly efficient manner by a clone-by-clone strategy. Based on these results, the multinational Brassica rapa Genome Sequencing Project (BrGSP) was launched, with the aim of sequencing the euchromatic arms of all ten chromosomes.
The project aimed initially to produce 'phase 2' sequence (fully oriented and ordered sequence with some small gaps and low-quality regions) with accessible trace files by shotgun sequencing of clones, so that researchers who require complete sequence from a specific region can finish it.
To support genome sequencing, five large-insert BAC libraries of B. rapa ssp. pekinensis cv. Chiifu were constructed, providing approximately 53-fold genome coverage overall. These libraries were constructed using several different restriction endonucleases to cleave genomic DNA (EcoRI, BamHI, HindIII, and Sau3AI). Using these BAC libraries, a total of 260,637 BAC-end sequences (BESs) have been generated from 146,688 BAC clones (approximately 203 Mb) as a collaborative outcome of the multinational BrGSP community. The strategy for clone-by-clone sequencing was to start from defined and genetically/cytogenetically mapped seed BACs and build outward. Initially, a comparative tiling method of mapping BESs onto the A. thaliana genome, combined with fingerprint-based physical mapping and existing genetic anchoring data, provided the basis for selecting seed BAC clones and for creating a draft tiling path [6, 16, 17]. As a result, 589 BAC clones were sequenced and provided to the BrGSP as 'seed' BACs for chromosome sequencing. Integration of seed BACs with the physical map provided 'gene-rich' contigs spanning approximately 160 Mb. These 'gene-rich' contigs enabled the selection of clones to extend the initial sequence contigs. Here, as the first report of the BrGSP, we describe a detailed analysis of B. rapa chromosome A3, the largest of the ten B. rapa chromosomes, as assessed by both cytogenetic analysis and linkage mapping (length estimated as 140.7 cM). The A3 linkage group also contains numerous collinearity discontinuities (CDs) compared with A. thaliana; a recent study of these CDs revealed greater complexity than originally described for the segmental collinearity of Brassica and Arabidopsis genomes [19, 20]. In accordance with the agreed standards of the BrGSP, we aimed to generate phase 2 contiguous sequences for B. rapa chromosome A3.
We annotated these sequences for genes and other characteristics, and used the data to analyze genome composition and examine consequential features of polyploidy, such as genome rearrangement.
Results and discussion
General features of chromosome A3
A total of 348 BAC clones were sequenced from the lower arm of chromosome A3 to produce 31.9 Mb of sequences of phase 2 or phase 3 (finished sequences) standard. These were assembled into nine contigs that span 140.7 cM of the genetic map (Figures 1c, d; Figure S1 in Additional file 1). The lower arm sequence starts at the proximal clone KBrH044B01 and terminates at the distal clone KBrF203I22 (Table S1 in Additional file 2). Excluding the gaps at the centromere and telomere, the pachytene spread FISH indicated that eight physical gaps, totaling approximately 2.3 Mb, remain on the pseudochromosome sequence. Despite extensive efforts, no BACs could be identified in those regions. The total length of the lower arm, from centromere to telomere, was therefore calculated to be 34.2 Mb. Thus, the 31.9 Mb of sequences we obtained represents 93% of the lower arm of the chromosome. The sequence and annotation of B. rapa chromosome A3 can be found in GenBank (see Materials and methods).
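The arm length and coverage figures in the paragraph above follow from simple arithmetic on the two quoted quantities. The short Python sketch below reproduces that calculation; the variable and function names are illustrative and not part of the original analysis.

```python
# Figures quoted in the text, in megabases (Mb).
SEQUENCED_MB = 31.9  # sequence obtained from the 348 BACs (nine contigs)
GAPS_MB = 2.3        # eight physical gaps estimated by pachytene FISH


def arm_length_and_coverage(sequenced_mb, gaps_mb):
    """Return (total arm length in Mb, fraction of the arm that is sequenced)."""
    total_mb = sequenced_mb + gaps_mb
    return total_mb, sequenced_mb / total_mb


total_mb, fraction = arm_length_and_coverage(SEQUENCED_MB, GAPS_MB)
print(f"lower arm length: {total_mb:.1f} Mb")
print(f"sequenced fraction: {fraction:.0%}")
```

Nine contigs separated by eight gaps is also consistent with the contig count reported above.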
Characterization of the sequences
Table: Statistics of B. rapa chromosome A3
Columns: B. rapa chromosome A3; A. thaliana whole genome
Rows: Total number of BACs; Approximate chromosome length (Mb); Total non-overlapping sequence (Mb); G/C content (%); Number of protein coding genes; Number of exons per gene; Intron size (bp); Exon size (bp); Average gene size (bp); Average gene density (bp/gene); Alternatively spliced genes; Average known gene size (bp); Average unknown gene size (bp); Average hypothetical gene size (bp)
Potential alternative splicing variants, based upon a minimum requirement of three EST matches, were identified for only 2.3% of the gene models. This finding suggests that alternative splicing may be rarer in B. rapa than in A. thaliana, where it occurs at a frequency of 16.9%. Additional EST data will enable more precise identification of alternatively spliced variants in the B. rapa genome.
The chromosome contains 164 tRNAs and 3 small nuclear RNAs. The tRNAs are evenly distributed along chromosome A3 except for one region where they cluster. This cluster, at 23.9 Mb, contains 12 tandem tRNAPro genes, which are the most abundant tRNA genes on the chromosome (Figure S3 in Additional file 1). A tRNAPro cluster was also detected previously on A. thaliana chromosome 1. A computational search using reported mature microRNA (miRNA) sequences, coupled with prediction of secondary structure, identified 26 miRNA genes, which outnumber the B. rapa miRNAs (17) recorded in miRBase (release 15.0; April 2010; Table S3 in Additional file 2). Abundant miRNAs on chromosome A3 included miR2111 and miR399. These have been implicated in regulating nutritional balance in B. rapa based upon observation of their induction during phosphate limitation in A. thaliana and rapeseed [25, 26].
A sequence similarity search showed that 2.5% of the genes identified on chromosome A3 are of mitochondrial (98 genes) or chloroplast (78 genes) origin. The widespread distribution observed for organellar insertions across the chromosome indicates that mitochondrial and chloroplast gene transfer occurred independently.
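As a quick check of the proportion quoted above, the sketch below recomputes the organellar share from the two gene counts. It assumes the denominator is the 7,058 protein-coding gene models reported for the chromosome; the text does not state the denominator explicitly, and the names used are illustrative.

```python
MITOCHONDRIAL_GENES = 98
CHLOROPLAST_GENES = 78
# Assumption: the percentage is relative to the 7,058 annotated gene models.
PROTEIN_CODING_GENES = 7058


def organellar_share(mito, chloro, total):
    """Fraction of annotated genes with an organellar origin."""
    return (mito + chloro) / total


share = organellar_share(MITOCHONDRIAL_GENES, CHLOROPLAST_GENES, PROTEIN_CODING_GENES)
print(f"{MITOCHONDRIAL_GENES + CHLOROPLAST_GENES} of {PROTEIN_CODING_GENES} genes = {share:.1%}")
```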
Synteny between chromosome A3 and the A. thaliana genome
Recombination and evolution of chromosome A3
Polyploid ancestry greatly complicates efforts to sequence genomes because of the presence of related sequences. Nevertheless, we have successfully sequenced, almost in its entirety, the largest chromosome of B. rapa, A3, using a clone-by-clone strategy. Annotation of the 31.9 Mb of sequences representing the gene space of chromosome A3 resulted in the development of models for 7,058 protein-coding genes and revealed the gene density to be only slightly lower than that observed for the related species A. thaliana, which is considered to have an exceptionally compact genome. Comparative analysis of collinear genome segments with A. thaliana revealed extensive chromosome-wide interspersed gene loss from B. rapa since divergence of the Brassica and Arabidopsis lineages, as described previously only for small genomic regions [5, 27, 28]. The alignment of genome segments that the whole chromosome sequence permitted, relative to both the A. thaliana genome and the inferred ancestral karyotype (AK) of a common progenitor of Brassica and Arabidopsis, enabled the development of a model for the derivation of chromosome A3. The results confirm that the complete genome sequence of B. rapa, provided that it is of an appropriate standard, will have a major impact on comparative genomics and gene discovery in Brassica species.
Materials and methods
The B. rapa chromosome A3 was sequenced using a clone-by-clone sequencing strategy with a BAC-based physical map framework that was genetically anchored to the B. rapa genome. We sequenced chromosome A3 of B. rapa ssp. pekinensis cultivar Chiifu from 348 overlapping BAC clones. Initially, we isolated seed BAC clones using a comparative BES tiling method and sequenced them by shotgun sequencing. Seed BAC clones were then extended in both directions by searching for sequence identity in the BES database, and the candidate extensions were cross-examined against a physical map constructed using the KBrH, KBrB, and KBrS1 BAC libraries. We also used the KBrE and KBrS2 BAC libraries for additional extension and, in particular, for gap filling. We carried out shotgun sequencing of the BAC clones to generate sequence data with eight- to ten-fold coverage of each clone using the ABI 3730xl sequencer (Applied Biosystems, Foster City, CA, USA). According to the BrGSP, the minimal sequence goal was five phase 2 contigs. Individual BACs were assembled from the shotgun sequences using the PHRED/PHRAP [31, 32] and Consed programs. The sequence contig assembly was created based on overlapping sequences using the Sequencher program (Gene Codes, Ann Arbor, MI, USA). To evaluate the accuracy of the assembly, we performed alignment of EST unigenes, PCR amplification of the assembled sequences, and sequence comparison with fosmid clone links. Contigs were ordered using sequence-tagged site markers mapping to the long arm of the chromosome on the VCS and Jangwon linkage maps, followed by estimation of non-overlapping gaps between contigs based on the results of FISH experiments. Pseudochromosome sequences were created by connecting sequence contigs with filler sequences sized according to the estimated gap: 10 kb of filler for gap sizes < 100 kb or 100 kb of filler for gap sizes > 100 kb.
All the sequence information has been deposited at the National Center for Biotechnology Information (NCBI) under accession numbers [NCBI:AC189184] to [NCBI:AC241201] (Table S1 in Additional file 2).
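The pseudochromosome construction rule described above (ordered contigs joined by filler whose length depends on the estimated gap size) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names are hypothetical, the use of "N" as the filler character is an assumption, and the text does not say how a gap of exactly 100 kb was treated.

```python
SMALL_FILLER_BP = 10_000    # filler for estimated gaps below 100 kb
LARGE_FILLER_BP = 100_000   # filler for estimated gaps above 100 kb
GAP_THRESHOLD_BP = 100_000


def filler_length(gap_bp):
    """Filler size for one estimated gap.

    Assumption: a gap of exactly 100 kb falls in the larger class,
    since the text only specifies '< 100 kb' and '> 100 kb'.
    """
    return SMALL_FILLER_BP if gap_bp < GAP_THRESHOLD_BP else LARGE_FILLER_BP


def build_pseudochromosome(contigs, gap_sizes_bp, filler_char="N"):
    """Join ordered contigs, inserting filler between neighbours.

    contigs      -- contig sequences in chromosomal order
    gap_sizes_bp -- estimated gap size between each adjacent pair
    filler_char  -- assumed placeholder base for the filler sequence
    """
    if not contigs:
        raise ValueError("at least one contig is required")
    if len(gap_sizes_bp) != len(contigs) - 1:
        raise ValueError("need exactly one gap estimate per adjacent contig pair")
    parts = [contigs[0]]
    for contig, gap_bp in zip(contigs[1:], gap_sizes_bp):
        parts.append(filler_char * filler_length(gap_bp))
        parts.append(contig)
    return "".join(parts)


# Toy example: three short contigs, one small gap and one large gap.
pseudo = build_pseudochromosome(["ACGT", "GGCC", "TTAA"], [50_000, 300_000])
print(len(pseudo))
```

Because the filler is a fixed 10 kb or 100 kb rather than the estimated gap length, coordinates on a pseudochromosome built this way are approximate across gaps.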
We carried out gene prediction using our in-house automated gene prediction system. The assembled sequences were masked using RepeatMasker based on a dataset combining the plant repeat element database of The Institute for Genomic Research, Munich Information Center for Protein Sequences, and our specialized database of B. rapa repetitive sequences. Gene model prediction was performed using EVidenceModeler. Putative exons and open reading frames (ORFs) were predicted ab initio using the FGENESH, AUGUSTUS, GlimmerHMM, and SNAP programs with the parameters trained using the B. rapa matrix. Putative gene splits predicted on the unfinished gaps were removed. To predict consensus gene structures, 152,253 B. rapa ESTs plus full-length cDNAs that we have generated, A. thaliana coding sequences (release TAIR9), plant transcripts, and plant protein sequences were aligned to the predicted genes using the PASA and AAT packages. The predicted genes and evidence sequences were then assembled according to the weight of each evidence type using EVidenceModeler. The highest scoring set of connected exons, introns, and noncoding regions was selected as a consensus gene model. Proteins encoded by gene models were searched against the Pfam database and automatically assigned a putative name based on conserved domain hits or similarity with previously identified proteins. Annotated gene models were also searched against a database of plant transposon-encoded proteins. Predicted proteins with a top match to transposon-encoded proteins were excluded from the annotation and gene counts. Transfer RNAs were identified using tRNAscan-SE. To scan for miRNA genes, the nonredundant miRNA sequences in miRBase v15 were mapped using BLASTN (up to two mismatches). A search of potential precursor structures was performed by extracting the genomic context (400 bp upstream and downstream) surrounding the position of each predicted miRNA sequence and by analyzing those regions with the Vienna RNA package.
Only putative miRNA precursors with a folding energy lower than -20 kcal/mol were selected. Organellar insertions were determined using BLASTN against the A. thaliana mitochondrial and B. rapa chloroplast genome sequences, using a cutoff of 95% identity plus 90% coverage.
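The miRNA precursor screen described above has two mechanical steps that are easy to illustrate: extracting a window of 400 bp on either side of a mapped mature miRNA, and keeping only candidates whose predicted folding energy is below -20 kcal/mol. The sketch below covers just those two steps under stated assumptions: coordinates are 0-based and half-open, the folding energy is supplied by an external structure-prediction tool (no Vienna RNA call is made here), and all function names are hypothetical.

```python
FLANK_BP = 400                    # context taken upstream and downstream of the hit
MAX_FOLDING_ENERGY_KCAL = -20.0   # candidates must fold with energy below this


def precursor_window(hit_start, hit_end, chrom_length, flank=FLANK_BP):
    """Return (start, end) of the hit plus flanks, clipped to the chromosome.

    Coordinates are 0-based, half-open (an assumption of this sketch).
    """
    start = max(0, hit_start - flank)
    end = min(chrom_length, hit_end + flank)
    return start, end


def passes_energy_filter(energy_kcal_per_mol, threshold=MAX_FOLDING_ENERGY_KCAL):
    """True if the predicted folding energy is strictly lower than the threshold."""
    return energy_kcal_per_mol < threshold


# Toy example: a 21-nt hit at position 1000 on a 50 kb sequence.
window = precursor_window(1000, 1021, 50_000)
print(window, passes_energy_filter(-35.2))
```

Clipping at the sequence ends matters for hits that lie within 400 bp of a contig boundary, where the full flank is not available.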
Comparative genome analysis
Syntenic regions between chromosome A3 of B. rapa and the A. thaliana genome were identified by a proteome comparison based on BLASTP analysis. The entire proteomes of the two genomes were compared, and only the top reciprocal BLASTP matches per chromosome pair were selected (minimum of 50% alignment coverage at an E-value cutoff of < 1E-20). Chromosome-scale synteny blocks were inferred by visual inspection of dot-plots using DiagHunter with parameters as described in previous reports [6, 49]. At least four genes with the same respective orientations in both genomes were required to establish a primary candidate synteny block. To distinguish highly homologous real synteny blocks from false positives due to multiple rounds of polyploidy followed by genome rearrangement, we manually evaluated the degree of gene conservation in all the primary candidate blocks and selected real syntenic regions showing a gene conservation index of greater than 50% (the number of conserved matches divided by the total number of genes in the blocks). Self-comparison of chromosome A3 with other chromosomes of the B. rapa genome was also conducted using seed BAC sequences.
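The numerical thresholds in the synteny procedure above can be written as small predicates. The sketch below is only an illustration of those cutoffs: it does not model the reciprocal-best-hit selection, the gene orientation check, or the DiagHunter dot-plot inspection, and the function and constant names are hypothetical.

```python
MAX_EVALUE = 1e-20            # BLASTP E-value must be below this
MIN_ALIGNMENT_COVERAGE = 0.5  # at least 50% alignment coverage
MIN_COLLINEAR_GENES = 4       # genes needed for a primary candidate block
MIN_CONSERVATION_INDEX = 0.5  # index must be greater than 50%


def passes_hit_filter(evalue, alignment_coverage):
    """True if a BLASTP match meets the E-value and coverage cutoffs."""
    return evalue < MAX_EVALUE and alignment_coverage >= MIN_ALIGNMENT_COVERAGE


def conservation_index(conserved_matches, total_genes):
    """Number of conserved matches divided by total genes in the block."""
    if total_genes <= 0:
        raise ValueError("total_genes must be positive")
    return conserved_matches / total_genes


def is_accepted_block(conserved_matches, total_genes):
    """True if a candidate block has enough genes and a high enough index."""
    if conserved_matches < MIN_COLLINEAR_GENES:
        return False
    return conservation_index(conserved_matches, total_genes) > MIN_CONSERVATION_INDEX


# Toy examples: 12 conserved matches among 20 genes, then 10 among 20.
print(is_accepted_block(12, 20), is_accepted_block(10, 20))
```

Note that a block sitting exactly at 50% is rejected here, because the text requires an index greater than 50%.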
BAC: bacterial artificial chromosome
BrGSP: Brassica rapa Genome Sequencing Project
EST: expressed sequence tag
FISH: fluorescent in situ hybridization
NOR: nucleolar organizer region.
We thank the many participants in the Korean Brassica rapa Genome Project and Dr Xiaowu Wang of IVF, China for discussion. This work was supported by the National Academy of Agricultural Science (05-1-12-2-1 and PJ006759) and the BioGreen 21 Program (20050301034438), Rural Development Administration, Korea, the UK Biotechnology and Biological Sciences Research Council (BB/E017363), and the Australian Research Council (Projects LP0882095 and LP0883462).
- Beilstein MA, Al-Shehbaz IA, Kellogg EA: Brassicaceae phylogeny and trichome evolution. Am J Bot. 2006, 93: 607-619. 10.3732/ajb.93.4.607.
- Selected vegetable production in leading countries and the world, 1998-2008. [http://www.ers.usda.gov/Publications/VGS/Tables/World.pdf]
- U N: Genome analysis in Brassica with special reference to the experimental formation of B. napus and peculiar mode of fertilization. Jpn J Bot. 1935, 7: 389-452.
- Johnston JS, Pepper AE, Hall AE, Chen ZJ, Hodnett G, Drabek J, Lopez R, Price HJ: Evolution of genome size in Brassicaceae. Ann Bot. 2005, 95: 229-235. 10.1093/aob/mci016.
- Yang TJ, Kim JS, Kwon SJ, Lim KB, Choi BS, Kim JA, Jin M, Park JY, Lim MH, Kim HI, Lim YP, Kang JJ, Hong JH, Kim CB, Bhak J, Bancroft I, Park BS: Sequence-level analysis of the diploidization process in the triplicated FLOWERING LOCUS C region of Brassica rapa. Plant Cell. 2006, 18: 1339-1347. 10.1105/tpc.105.040535.
- Mun JH, Kwon SJ, Yang TJ, Seol YJ, Jin M, Kim JA, Lim MH, Kim JS, Lee SI, Baek S, Choi BS, Kim DS, Kim N, Yu HJ, Lim KB, Lim YP, Bancroft I, Hahn JH, Park BS: Genome-wide comparative analysis of the Brassica rapa gene space reveals genome shrinkage and differential loss of duplicated genes after whole genome triplication. Genome Biol. 2009, 10: R111. 10.1186/gb-2009-10-10-r111.
- Rana D, van den Boogaart T, O'Neill CM, Hynes L, Bent E, Macpherson L, Park JY, Lim YP, Bancroft I: Conservation of the microstructure of genome segments in Brassica napus and its diploid relatives. Plant J. 2004, 40: 725-733. 10.1111/j.1365-313X.2004.02244.x.
- Cheung F, Trick M, Drou N, Lim YP, Park JY, Kwon SJ, Kim JA, Scott R, Pires JC, Paterson AH, Town C, Bancroft I: Comparative analysis between homoeologous genome segments of Brassica napus and its progenitor species reveals extensive sequence-level divergence. Plant Cell. 2009, 21: 1912-1928. 10.1105/tpc.108.060376.
- Lim KB, de Jong H, Yang TJ, Park JY, Kwon SJ, Kim JS, Lim MH, Kim JA, Jin M, Jin YM, Kim SH, Lim YP, Bang JW, Kim HI, Park BS: Characterization of rDNAs and tandem repeats in the heterochromatin of Brassica rapa. Mol Cells. 2005, 19: 436-444.
- Lim KB, Yang TJ, Hwang YJ, Kim JS, Park JY, Kwon SJ, Kim J, Choi BS, Lim MH, Jin M, Kim HI, de Jong H, Bancroft I, Lim YP, Park BS: Characterization of the centromere and peri-centromere retrotransposons in Brassica rapa and their distribution in related Brassica species. Plant J. 2007, 49: 173-183. 10.1111/j.1365-313X.2006.02952.x.
- Koo DH, Plaha P, Lim YP, Hur Y, Bang JW: A high-resolution karyotype of Brassica rapa ssp. pekinensis revealed by pachytene analysis and multicolor fluorescence in situ hybridization. Theor Appl Genet. 2004, 109: 1346-1352. 10.1007/s00122-004-1771-0.
- Ayele M, Haas BJ, Kumar N, Wu H, Xiao Y, Van Aken S, Utterback TR, Wortman JR, White OR, Town CD: Whole genome shotgun sequencing of Brassica oleracea and its application to gene discovery and annotation in Arabidopsis. Genome Res. 2005, 15: 487-495. 10.1101/gr.3176505.
- Zhang X, Wessler SR: Genome-wide comparative analysis of the transposable elements in the related species Arabidopsis thaliana and Brassica oleracea. Proc Natl Acad Sci USA. 2004, 101: 5589-5594. 10.1073/pnas.0401243101.
- Brassica info. [http://www.brassica.info/info/about-mbgp.php]
- The Korea Brassica rapa Genome Project. [http://www.brassica-rapa.org/BRGP/index.jsp]
- Mun JH, Kwon SJ, Yang TJ, Kim HS, Choi BS, Baek S, Kim JS, Jin M, Kim JA, Lim MH, Lee SI, Kim HI, Kim H, Lim YP, Park BS: The first generation of a BAC-based physical map of Brassica rapa. BMC Genomics. 2008, 9: 280. 10.1186/1471-2164-9-280.
- Kim JS, Chung TY, King GJ, Jin M, Yang TJ, Jin YM, Kim HI, Park BS: A sequence-tagged linkage map of Brassica rapa. Genetics. 2006, 174: 29-39. 10.1534/genetics.106.060152.
- Trick M, Kwon SJ, Choi SR, Fraser F, Soumpourou E, Drou N, Wang Z, Lee SY, Yang TJ, Mun JH, Paterson AH, Town CD, Pires JC, Lim YP, Park BS, Bancroft I: Complexity of genome evolution by segmental rearrangement in Brassica rapa revealed by sequence-level analysis. BMC Genomics. 2009, 10: 539. 10.1186/1471-2164-10-539.
- Parkin IA, Gulden SM, Sharpe AG, Lukens L, Trick M, Osborn TC, Lydiate DJ: Segmental structure of the Brassica napus genome based on comparative analysis with Arabidopsis thaliana. Genetics. 2005, 171: 765-781. 10.1534/genetics.105.042093.
- Schranz ME, Lysak MA, Mitchell-Olds T: The ABC's of comparative genomics in the Brassicaceae: building blocks of crucifer genomes. Trends Plant Sci. 2006, 11: 535-542. 10.1016/j.tplants.2006.09.002.
- Gupta V, Jagannathan V, Lakshmikumaran MS: A novel AT-rich tandem repeat of Brassica nigra. Plant Sci. 1990, 68: 223-229. 10.1016/0168-9452(90)90228-G.
- The Arabidopsis Genome Initiative: Analysis of the genome sequence of the flowering plant Arabidopsis thaliana. Nature. 2000, 408: 796-815. 10.1038/35048692.
- The Arabidopsis Information Resource. [http://www.arabidopsis.org/portals/genAnnotation]
- Theologis A, Ecker JR, Palm CJ, Federspiel NA, Kaul S, White O, Alonso J, Altafi H, Araujo R, Bowman CL, Brooks SY, Buehler E, Chan A, Chao Q, Chen H, Cheuk RF, Chin CW, Chung MK, Conn L, Conway AB, Conway AR, Creasy TH, Dewar K, Dunn P, Etgu P, Feldblyum TV, Feng J, Fong B, Fujii CY, Gill JE, et al: Sequence and analysis of chromosome 1 of the plant Arabidopsis thaliana. Nature. 2000, 408: 816-820. 10.1038/35048500.
- Hsieh LC, Lin SI, Shih AC, Chen JW, Lin WY, Tseng CY, Li WH, Chiou TJ: Uncovering small RNA-mediated responses to phosphate deficiency in Arabidopsis by deep sequencing. Plant Physiol. 2009, 151: 2120-2132. 10.1104/pp.109.147280.
- Pant BD, Musialak-Lange M, Nuc P, May P, Buhtz A, Kehr J, Walther D, Scheible WR: Identification of nutrient-responsive Arabidopsis and rapeseed microRNAs by comprehensive real-time polymerase chain reaction profiling and small RNA sequencing. Plant Physiol. 2009, 150: 1541-1555. 10.1104/pp.109.139139.
- O'Neill CM, Bancroft I: Comparative physical mapping of segments of the genome of Brassica oleracea var. alboglabra that are homoeologous to sequenced regions of chromosomes 4 and 5 of Arabidopsis thaliana. Plant J. 2000, 23: 233-243. 10.1046/j.1365-313x.2000.00781.x.
- Town CD, Cheung F, Maiti R, Crabtree J, Haas BJ, Wortman JR, Hine EE, Althoff R, Arbogast TS, Tallon LJ, Vigouroux M, Trick M, Bancroft I: Comparative genomics of Brassica oleracea and Arabidopsis thaliana reveal gene loss, fragmentation, and dispersal after polyploidy. Plant Cell. 2006, 18: 1348-1359. 10.1105/tpc.106.041665.
- Lysak MA, Berr A, Pecinka A, Schmidt R, McBreen K, Schubert I: Mechanisms of chromosome number reduction in Arabidopsis thaliana and related Brassicaceae species. Proc Natl Acad Sci USA. 2006, 103: 5224-5229. 10.1073/pnas.0510791103.
- The Brassica Genome Project. [http://brassica.bbsrc.ac.uk/brassica_genome_sequencing_concept.htm]
- Ewing B, Hillier L, Wendl M, Green P: Basecalling of automated sequencer traces using phred. I. Accuracy assessment. Genome Res. 1998, 8: 175-185.
- Ewing B, Green P: Basecalling of automated sequencer traces using phred. II. Error probabilities. Genome Res. 1998, 8: 186-194.
- Gordon D, Abajian C, Green P: Consed: a graphical tool for sequence finishing. Genome Res. 1998, 8: 195-202.
- RepeatMasker. [http://www.repeatmasker.org/]
- Ouyang S, Buell CR: The TIGR Plant Repeat Databases: a collective resource for the identification of repetitive sequences in plants. Nucleic Acids Res. 2004, 32: D360-366. 10.1093/nar/gkh099.
- Munich Information Center for Protein Sequences. [http://www.helmholtz-muenchen.de/en/ibis]
- Haas BJ, Salzberg SL, Zhu W, Pertea M, Allen JE, Orvis J, White O, Buell CR, Wortman JR: Automated eukaryotic gene structure annotation using EVidenceModeler and the Program to Assemble Spliced Alignments. Genome Biol. 2008, 9: R7. 10.1186/gb-2008-9-1-r7.
- FGENESH. [http://www.softberry.com]
- Stanke M, Morgenstern B: AUGUSTUS: a web server for gene prediction in eukaryotes that allows user-defined constraints. Nucleic Acids Res. 2005, 33: W465-W467. 10.1093/nar/gki458.
- Majoros WH, Pertea M, Salzberg SL: TigrScan and GlimmerHMM: two open-source ab initio eukaryotic gene-finders. Bioinformatics. 2004, 20: 2878-2879. 10.1093/bioinformatics/bth315.
- Korf I: Gene finding in novel genomes. BMC Bioinformatics. 2004, 5: 59. 10.1186/1471-2105-5-59.
- Haas BJ, Delcher AL, Mount SM, Wortman JR, Smith RK, Hannick LI, Maiti R, Ronning CM, Rusch DB, Town CD, Salzberg SL, White O: Improving the Arabidopsis genome annotation using maximal transcript alignment assemblies. Nucleic Acids Res. 2003, 31: 5654-5666. 10.1093/nar/gkg770.
- Huang X, Adams MD, Zhou H, Kerlavage AR: A tool for analyzing and annotating genomic sequences. Genomics. 1997, 46: 37-45. 10.1006/geno.1997.4984.
- Bateman A, Birney E, Cerruti L, Durbin R, Etwiller L, Eddy SR, Griffiths-Jones S, Howe KL, Marshall M, Sonnhammer EL: The Pfam protein families database. Nucleic Acids Res. 2002, 30: 276-280. 10.1093/nar/30.1.276.
- Plant Transposon-encoded Protein Database. [ftp://ftp.tigr.org/pub/data/TransposableElements/transposon_db.pep]
- Lowe TM, Eddy SR: tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res. 1997, 25: 955-964. 10.1093/nar/25.5.955.
- Altschul SF, Madden TL, Schäffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ: Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 1997, 25: 3389-3402. 10.1093/nar/25.17.3389.
- Gruber AR, Lorenz R, Bernhart SH, Neuböck R, Hofacker IL: The Vienna RNA websuite. Nucleic Acids Res. 2008, 36: W70-74. 10.1093/nar/gkn188.
- Cannon SB, Kozik A, Chan B, Michelmore R, Young ND: DiagHunter and GenoPix2D: programs for genomic comparisons, large-scale homology discovery and visualization. Genome Biol. 2003, 4: R68. 10.1186/gb-2003-4-10-r68.
This article is published under license to BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.