Rice genome revealed II: the japonica sequence
- Jean-Nicolas Volff
© BioMed Central Ltd 2002
Received: 26 April 2002
Published: 28 June 2002
Together with the draft sequence of the indica rice genome, the draft sequence of the japonica rice genome should advance understanding of this important crop plant.
Significance and context
A team from the Swiss agricultural company Syngenta and the US biotechnology company Myriad Genetics has completed the draft sequence of the genome of Oryza sativa L. ssp. japonica Nipponbare, a rice cultivar very popular in Japan. The general background to the sequencing of the genome of rice (Oryza sativa) can be found in the Paper Report on the draft sequence of O. sativa ssp. indica by a Chinese team (Yu et al.) that was published in the same issue of Science (see related report - Genome Biology 3(7):reports0035).
Using the whole-genome shotgun strategy, Goff et al. determined 390 million bases (Mb) of high-quality genomic sequence disrupted by about 42,000 gaps. The estimated number of genes was between 32,000 and 50,000. Goff et al. make a distinction between predicted genes (minimal length 300 bp) with high (34.3%), medium (27.0%) and low (38.7%) confidence scores on the basis of similarity to known genes, protein motifs and predicted genes from other species.
As observed in Arabidopsis thaliana, numerous paralogous genes (duplicates) are present in the rice genome. About 22% of genes with high and medium confidence scores were present as duplicates on the same chromosome (local duplications). There is also evidence of larger chromosome/genome duplications having arisen after the separation of the Arabidopsis and rice lineages.
Of Arabidopsis genes, 85% have at least one homolog in rice. Putative 'plant-specific' genes - that is, those present in both rice and Arabidopsis but apparently absent from the genomes of animals and microorganisms - were identified in Arabidopsis (8,000, or approximately 30% of predicted genes) and O. sativa ssp. japonica (13,000, approximately 20% of predicted genes with a minimal length of 300 bp). A substantial number of rice genes were not present in Arabidopsis or other organisms. The fact that most of these were classified as 'hypothetical' or 'unknown', or presented low confidence scores in rice, suggests that they may have been inaccurately predicted or may correspond to genes specific to particular plant lineages.
Several classes of genes encoding families of proteins identified in other sequenced genomes (for example, nuclear steroid receptors, and the JAK and STAT signaling molecules) were not found in either the rice or Arabidopsis genomes. This study confirmed that gene content and order are highly conserved between rice and other cereals, but that conservation is much more limited between rice and Arabidopsis.
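The percentages quoted above can be cross-checked with a few lines of arithmetic. The following Python sketch is only a back-of-envelope illustration: it uses the rounded figures given in this report, the function and variable names are hypothetical, and the implied totals it derives are not values reported by Goff et al. In particular, the rice total refers to the set of predictions of at least 300 bp, which is not necessarily the same quantity as the estimated gene number of 32,000 to 50,000.

```python
# Shares (in percent) of predicted genes (minimum length 300 bp) by
# confidence class, as quoted in this report.
confidence_shares = {"high": 34.3, "medium": 27.0, "low": 38.7}


def implied_total(count, percent):
    """Total implied by `count` being stated as `percent` of that total.

    Assumption: both inputs are rounded figures, so the result is approximate.
    """
    return count * 100 / percent


def summarize():
    """Return (sum of shares, high+medium share, implied totals for each genome)."""
    total_share = round(sum(confidence_shares.values()), 1)
    high_medium_share = round(
        confidence_shares["high"] + confidence_shares["medium"], 1
    )
    # 'Plant-specific' genes: 8,000 (about 30%) in Arabidopsis and
    # 13,000 (about 20%) in rice predictions of at least 300 bp.
    arabidopsis_total = round(implied_total(8_000, 30))
    rice_total = round(implied_total(13_000, 20))
    return total_share, high_medium_share, arabidopsis_total, rice_total


if __name__ == "__main__":
    print(summarize())
```

The three confidence classes account for all predictions, and the high and medium classes together make up a little over 60% of them; this is the subset to which the 22% local-duplication figure applies.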
The research papers, news and commentaries about the sequencing of the rice genomes are available from the rice genome index page at the Science website. The draft japonica rice sequence reported by Goff et al. is available to academic researchers under certain conditions from the rice genome page of the Torrey Mesa Research Institute.
The restrictions on public access to Syngenta's data represent an unusual concession by the journal Science. As analysis of the two rice genomes proceeds, it will be interesting to see what differences there are between them.