Densities of different homopolymer amino acid repeats in D. purpureum and D. discoideum. (a) The density of each kind of amino acid repeat was calculated by summing the lengths of non-random repeats of that amino acid (Table S1 in Additional file 1) over the protein sequences of all genes from D. purpureum and D. discoideum, dividing by the total length of coding sequence, and multiplying by 1,000. Letters indicate which amino acid each point represents. Pearson's correlation coefficient between the repeat densities of the two species is 0.997 (P < 0.001). (b) Mean non-synonymous substitution rates (dN) of genes with and without amino acid repeats. The rates were calculated between orthologs of D. purpureum and D. discoideum, excluding repeat sequences. Orthologs without amino acid repeats have significantly lower dN than orthologs with repeats, whether the repeats are in D. discoideum or in D. purpureum (Student's t-test, P < 0.0001 for both tests). Error bars show standard errors of the means.
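The density calculation in panel (a) can be illustrated with a minimal sketch. It assumes that the non-random repeats have already been identified and are supplied as (amino acid, repeat length) pairs, and that the total sequence length is given in the same units as the repeat lengths. The function names `repeat_density` and `pearson_r` and the input layout are hypothetical and are not taken from the original analysis.

```python
import math
from collections import defaultdict


def repeat_density(repeats, total_length):
    """Repeat density per amino acid, per 1,000 units of sequence.

    repeats: iterable of (amino_acid, repeat_length) pairs (assumed input format).
    total_length: total sequence length, in the same units as repeat_length.
    """
    if total_length <= 0:
        raise ValueError("total_length must be positive")
    summed = defaultdict(int)
    for amino_acid, length in repeats:
        summed[amino_acid] += length  # sum repeat lengths for each amino acid
    return {aa: 1000.0 * n / total_length for aa, n in summed.items()}


def pearson_r(xs, ys):
    """Pearson's correlation coefficient for two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)
```

Applying `repeat_density` once per species and passing the two sets of densities, matched by amino acid, to `pearson_r` would give a correlation of the kind reported in the caption. The significance test is not covered by this sketch.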