How many genes does it take to make a fly?
- Gilean McVean
© BioMed Central Ltd 2000
Received: 17 December 1999
Published: 17 March 2000
The largest contiguous stretch of DNA yet sequenced in Drosophila suggests that fruit-flies have only half as many genes as Caenorhabditis elegans.
Significance and context
In the 20th century, the fruit-fly Drosophila melanogaster has come to dominate the field of eukaryote genetics. For no other multicellular organism do we have as good an understanding of how genes interact to produce the whole organism. Long before the advent of molecular genetics, large numbers of genes had been mapped and characterized by the analysis of mutant phenotypes. It is perhaps ironic that in the era of genome projects, the sequencing of the Drosophila genome has lagged behind that of many other organisms. But with the publication of this paper - an analysis of about 2% of the total Drosophila genome - the complete sequence is on the horizon.
A 2.9 Mb region around the Adh gene in the middle of the left arm of chromosome 2 is described here. The region is well characterized by classical genetic analysis, and 73 loci had previously been identified from mutant alleles and overlapping deletions. In contrast, gene prediction programs identified a total of 218 genes, a density of about one gene every 13 kb. This means that only about one in three genes in Drosophila has a visible mutant phenotype (and about one in four is vital). Because Drosophila has been studied so intensively, this proportion is probably a reasonable estimate of the true fraction. A table of known and inferred gene functions is provided as Supplementary data to Genetics 153:179-219. Gene density in the Adh region is comparable to estimates from other locations, and predicts a total of about 9,000 genes for the entire genome. Remarkably, this is considerably fewer than the total of 19,090 now estimated for Caenorhabditis elegans. As in C. elegans, evolutionarily conserved genes that have homologs in distantly related species are more likely to have observable mutant phenotypes, and are expressed at higher levels (estimated by the proportion having matching expressed sequence tags, ESTs).
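The arithmetic behind these figures can be checked with a short sketch. The region size, predicted gene count and number of known loci are taken from the paragraph above; the genome size used for the extrapolation (about 120 Mb of gene-bearing euchromatin) is an assumption introduced here for illustration, not a figure given in this report, and the function names are hypothetical.

```python
# Figures quoted in the report.
REGION_KB = 2900        # the 2.9 Mb Adh region, in kb
PREDICTED_GENES = 218   # genes identified by gene prediction programs
KNOWN_LOCI = 73         # loci previously identified from mutant alleles and deletions

# ASSUMPTION (not stated in the report): roughly 120 Mb of gene-bearing euchromatin.
ASSUMED_EUCHROMATIN_KB = 120_000


def kb_per_gene(region_kb, genes):
    """Average spacing between predicted genes, in kb per gene."""
    return region_kb / genes


def phenotype_fraction(known_loci, genes):
    """Fraction of predicted genes already known from mutant phenotypes."""
    return known_loci / genes


def extrapolated_gene_count(genome_kb, spacing_kb):
    """Genome-wide gene count if the regional density held everywhere."""
    return round(genome_kb / spacing_kb)


spacing = kb_per_gene(REGION_KB, PREDICTED_GENES)
fraction = phenotype_fraction(KNOWN_LOCI, PREDICTED_GENES)
total = extrapolated_gene_count(ASSUMED_EUCHROMATIN_KB, spacing)

print(f"{spacing:.1f} kb per gene")
print(f"{fraction:.0%} of predicted genes have a known mutant phenotype")
print(f"about {total} genes genome-wide under the assumed genome size")
```

Under the assumed genome size the extrapolation lands close to the 9,000 genes quoted above; a different size estimate would shift the total proportionally.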
The main sources of information on the Drosophila genome project (DGP) are the Berkeley Drosophila Genome Project (BDGP), the European Drosophila Genome Project (EDGP) and FlyBase. The integration of Celera's shotgun sequencing approach with the maps and sequence from BDGP/EDGP is expected to be complete as early as February 2000.
That a century of research into the genetics of Drosophila has detected mutant phenotypes for only about a third of all genes presents a thrilling challenge for biology. From an evolutionary point of view, it seems to suggest that all significant change results from the evolution of a minority of genes. It could also point to an enormous level of functional redundancy and evolutionary flexibility. As the Drosophila melanogaster genome project nears completion, what we will need in order to address this question is the complete sequence of one of the 2,000 other species of Drosophila.