Distribution of introns as a function of their length and insertion frame. (a) Introns are represented according to the three possible frames of the CDS. Phase 0 indicates that the intron is located between two codons, phase 1 that it is located after the first nucleotide of a codon, and phase 2 that it is located after the second nucleotide of a codon. 'All introns' corresponds to the 1,083 introns, 'first introns' to the first intron of the 951 intron-containing genes, and 'other introns' to the 131 second, third, fourth and fifth introns of genes. Differences between insertion phases were statistically significant for all introns (χ² = 64.68, P = 8.98e-15) and for first introns (χ² = 60.68, P = 6.63e-14), but not for introns other than the first (χ² = 5.50, P = 0.063), probably because of their limited number. (b) The proportions of each of the four bases are represented for each position of the codons of the 6,449 protein-coding genes. Differences in nucleotide distribution were statistically significant for each position within the codon (χ² test, P << 1e-100). Stop codons were not considered. (c) Introns are represented according to length category: a multiple of 3 nucleotides (3n), a multiple of 3 plus 1 nucleotide (3n + 1), or a multiple of 3 plus 2 nucleotides (3n + 2). There were 204 introns ≤60 nucleotides in length. The underrepresentation of 3n introns was statistically significant for all introns (χ² = 7.35, P = 0.025), for first introns (χ² = 10.90, P = 0.004) and for introns no longer than 60 nucleotides (χ² = 6.70, P = 0.034). (d) Stop-free introns are represented according to their insertion frame and length category.
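
The phase and length categories used in panels (a) and (c), and the form of the χ² tests, can be illustrated with a minimal Python sketch. It is not the authors' analysis code. The function names are hypothetical, and two things are assumed rather than stated in the caption: that the null hypothesis is an equal expected count in each of the three categories, and that each test therefore has 2 degrees of freedom. Under that second assumption the upper-tail P value is exactly exp(-χ²/2), which gives values close to those reported for panels (a) and (c).

```python
import math

def intron_phase(cds_bases_upstream: int) -> int:
    """Insertion phase from the number of coding nucleotides upstream of the intron.

    0: between two codons; 1: after the first nucleotide of a codon;
    2: after the second nucleotide of a codon.
    """
    return cds_bases_upstream % 3

def length_class(intron_length: int) -> str:
    """Length category of an intron: 3n, 3n + 1 or 3n + 2."""
    return ("3n", "3n + 1", "3n + 2")[intron_length % 3]

def chi2_uniform(counts):
    """Pearson chi-square statistic against equal expected counts (assumed null)."""
    expected = sum(counts) / len(counts)
    return sum((c - expected) ** 2 / expected for c in counts)

def chi2_pvalue_df2(chi2: float) -> float:
    """Upper-tail P value of a chi-square statistic with 2 degrees of freedom."""
    return math.exp(-chi2 / 2.0)

if __name__ == "__main__":
    # Statistics quoted in the caption; the P values are recomputed, not copied.
    for label, chi2 in [("all introns, phase", 64.68),
                        ("first introns, phase", 60.68),
                        ("other introns, phase", 5.50)]:
        print(f"{label}: chi2 = {chi2}, P = {chi2_pvalue_df2(chi2):.3g}")
```

The closed form for 2 degrees of freedom avoids any dependency on a statistics library; for other degrees of freedom a general chi-square survival function would be needed.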