Two different genomes that produce the same result
© BioMed Central Ltd. 2010
Published: 27 April 2010
Despite considerable differences in genomic sequence, the developmental program of gene expression between two similar Dictyostelium species is remarkably similar.
Have you ever wondered who determined the first DNA sequence? Or how hard it was? Well, I can't say it was the very first, but nearly 45 years ago, George Streisinger and his colleagues mutated the lysozyme gene of phage T4 with acridine, which they knew caused frameshifts, and then obtained a second-site suppressor (a mutation at a second site that suppressed the effect of the first) with another round of mutagenesis, restoring lysozyme activity. The amino acids encoded by the DNA between the two mutations should, in theory, have been changed - and they were. Knowing the changed amino acids and the genetic code, the group determined the actual DNA sequence. It was 23 nucleotides long and the complete study must have taken five people a year. There was an extra prize, however. The work confirmed that there were no 'commas' between codons. Reading the paper as a graduate student, I thought it was wonderful. And it was.
The two species are very similar in appearance and behavior, and the chemoattractant aggregation signal for both species is cyclic AMP (cAMP). D. purpureum makes the stalk of the fruiting body a little differently and the spore mass is purple (that of D. discoideum is light yellow), but that is about the extent of the obvious morphological differences. And yet the genome sequences are different - as different, according to Parikh et al., as those of humans and bony fishes, despite the fact that D. discoideum and D. purpureum group within the same clade among the many species of social amoebae, according to phylogenies constructed from ribosomal RNA gene (rrnA) sequences. The overall sequence homology of the orthologs is 61.8%. Parikh et al. find that the two genomes retain certain gross similarities - both are remarkably AT-rich - but the coding and intergenic sequences have diverged. The questions they then ask are: Do the two species retain the same programs of development despite the differences in genomes? Do the genes necessary to make spores or stalk cells turn on at the same time in each species? How many genes are orthologs; that is, similar by virtue of direct descent from the same ancestral gene? And how many genes are transcribed, and which genes are transcribed the most or the least?
To analyze and compare the transcriptomes of the two species, Parikh et al. have abandoned the difficulties of microarray analysis in favor of RNA-sequencing (RNA-seq). The latter method has a greater dynamic range, and cross-hybridization is not the problem in RNA-seq that it is in microarray analysis. Transcripts were collected at 4-hour intervals during the synchronous development of the fruiting body of each species and converted into cDNAs. Fragments of the cDNAs were sequenced in reads of 35 base pairs, and the reads were mapped onto the genomes of D. discoideum or D. purpureum. Any read that did not map to a unique sequence was not counted, which eliminates repetitive elements. This means that actin genes, of which there are a number, would not be counted, nor would the transcripts coding for the mysterious poly-asparagine tracts found in thousands of Dictyostelium proteins.
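The unique-mapping rule described above can be illustrated with a small sketch. It is not the pipeline Parikh et al. used: it assumes exact, forward-strand string matching against a made-up set of gene sequences, with 8-base reads standing in for the 35-base reads, and the names `genes`, `reads` and `count_unique_reads` are hypothetical.

```python
from collections import Counter


def count_unique_reads(genes, reads):
    """Count reads per gene, keeping only reads that match exactly one location.

    Assumption: exact substring matching on one strand. Real read mappers
    tolerate mismatches and search both strands.
    """
    counts = Counter()
    for read in reads:
        hits = []
        for name, seq in genes.items():
            # Collect every (possibly overlapping) occurrence of the read.
            start = seq.find(read)
            while start != -1:
                hits.append(name)
                start = seq.find(read, start + 1)
        # Reads with zero hits or more than one hit are discarded.
        if len(hits) == 1:
            counts[hits[0]] += 1
    return counts


# Hypothetical toy sequences: geneA and geneB share the stretch AAGGTTCC
# (standing in for a multi-copy family such as actin), and polyN is a
# simple repeat (standing in for a poly-asparagine tract).
genes = {
    "geneA": "ATGCCGTAAGGTTCCA",
    "geneB": "ATGTTTGAAGGTTCCG",
    "polyN": "AATAATAATAATAAT",
}
reads = ["ATGCCGTA", "AAGGTTCC", "ATGTTTGA", "AATAATAA", "ATGCCGTA"]

counts = count_unique_reads(genes, reads)
print(dict(counts))
```

In this toy case the read shared by geneA and geneB and the read from the repeat tract are both dropped, so only the gene-specific reads contribute to the counts. The consequence is the one noted in the text: expression of repeated or multi-copy sequences is undercounted or missed entirely.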
There are interesting data in the transcriptome analysis, and the authors provide a nice tool, DictyExpress, to explore them, even for those not well versed in computational biology. The important finding is that among the transcripts that are mapped back to the two genomes, there are many orthologs - 7,619 to be exact (out of a predicted total of 12,410 genes for D. purpureum and 13,992 for D. discoideum) - and to a great extent they are transcribed in the same groups and in the same temporal order during development in the two species. Almost all genes are regulated during development, either up or down. The synchrony of development and the improved quantitation of RNA-seq (compared with microarrays) make these comparisons possible. Despite the differences in genome sequence, the regulation of developmental gene expression is maintained. The induction of transcripts during development tracks the slight differences in timing - D. purpureum takes 4 hours longer than D. discoideum to reach a particular developmental stage, and the appearance of the relevant transcripts is delayed as well. Many previously characterized genes are regulated almost identically in the two species.
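One way to picture a conserved but delayed expression program is to correlate the time courses of an ortholog pair at different offsets. The sketch below is only an illustration, not the analysis of Parikh et al.: the expression values are invented, samples are assumed to be 4 hours apart so that a lag of one sample corresponds to a 4-hour delay, and the function names are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def lagged_correlation(reference, delayed, lag):
    """Correlate reference[t] with delayed[t + lag], lag in sampling steps."""
    if lag == 0:
        return pearson(reference, delayed)
    return pearson(reference[:-lag], delayed[lag:])


# Invented abundances for one ortholog pair, sampled every 4 hours.
# The second profile is the first one shifted by a single time point.
discoideum = [1, 2, 8, 20, 9, 3, 1]
purpureum = [1, 1, 2, 8, 20, 9, 3]

# Try lags of 0, 1 and 2 samples (0, 4 and 8 hours) and keep the best one.
best_lag = max(range(3), key=lambda k: lagged_correlation(discoideum, purpureum, k))
print("best lag (hours):", best_lag * 4)
```

With these made-up profiles the correlation peaks at a lag of one sample, which is the pattern one would expect if the same gene follows the same trajectory in both species with a 4-hour delay in D. purpureum.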
What is the value of this molecular comparative anatomy? Some essential detail is perhaps lost in the statement of Parikh et al. likening the difference between D. discoideum and D. purpureum genomes to the differences between the genomes of bony fish and humans. The differences in sequence between the two slime molds will surely not be spread evenly over the genomes. In structural genes, important functional elements of the protein sequence tend to be conserved, leaving other sequences to diverge. Occasionally, a lack of conservation can be telling - the cell-cell recognition proteins of different species, for example, might be expected to be species-specific and vary in discrete regions. Amazingly, the amoebae of these two species will co-aggregate because of their mutual chemotaxis towards higher levels of cAMP, but they subsequently sort out before forming a fruiting body, as Raper and Thom showed long ago.
But there is a long-standing problem with Dictyostelium development, and that concerns the responsible transcription factors - or rather their paucity. It has been known for years that development in Dictyostelium is accompanied by shifts in the expression patterns of many genes. In fact, it seems as if the cells switch from expressing one set of genes to expressing another, exactly at the time they switch from being unicellular to being multicellular. Parikh et al. now show that the cells alter the abundance of almost every mRNA in the transcriptome during development, so one might expect that transcription factors would be central to the regulation of Dictyostelium development, as they are in Drosophila, for example. But this may not be the case - Dictyostelium researchers have looked for developmental mutants by mutagenesis screens with restriction-enzyme-mediated mutagenesis (REMI), a form of insertional mutagenesis, for the past 18 years, but only a handful of the hundreds of mutants found are in canonical transcription factors. Of such transcription factors, two Mybs, one GATA, two bZIPs, CRTF and a STAT have been found, but a close correlation of any of these with any developmental program or coordinated gene expression in Dictyostelium has been elusive (see  for the roles of these factors and the phenotypes of their mutants). One exception is srfA, a transcriptional regulator similar in sequence to mammalian serum-response factor, whose loss by mutation results in the depression of transcripts involved in spore formation. D. discoideum and D. purpureum have the lowest known number of transcription factors relative to their genome size.
There are a number of possible explanations for these findings. One is that transcription factor genes have been mutated and associated developmental defects have been observed, but the gene products were not recognized as gene regulatory proteins because they had no homology with known transcription factors. A mutation in the D. discoideum G-box binding factor (GBF), for example, blocks post-aggregation development, but it is a non-canonical transcription factor. Another possibility is that the extraordinarily conserved temporal expression of many orthologous transcripts in prestalk and prespore cells in the two species could be controlled by some means in addition to traditional transcription factors and recognition sites.
The exceptional AT-richness of promoter regions - 95% in most cases - invites comparison with another organism with a similarly sized AT-rich genome, Plasmodium falciparum. In this case too, transcriptional regulation has been difficult to study in detail, although recently a family of AP2 (Apicomplexan apetala2) transcription factors has been shown to be linked to sporozoite-specific genes. These have weak homology with plant AP2 factors and, like GBF, bind sequences that have some GC content. Perhaps, with the exception of GBF and a few others, we are just not seeing the Dictyostelium transcription factors.
The extraordinary synchrony of development of Dictyostelium species and the quantitative advantages of RNA-seq are powerful partners, but similar comparisons could also be imagined for developing lineages within a particular species, such as different breeds of domesticated animals. How do the neural crest cells that make the snout of a greyhound differ from those of a bulldog? Is it just a few sequences that differ? Or a matter of transcript number? Is the transcript repertory the same but in one case there are more progenitors? These methods might be applied to find out. I am not suggesting sacrificing puppies (perhaps fish would be better subjects), but it is the kind of thing that Darwin would have liked to know.