Table 2 Amino acids significantly preferred (-) or avoided (+) at 5' ends of exons across species

From: Finding exonic islands in a sea of non-coding sequence: splicing related constraints on protein composition and evolution are common in intron-rich genomes

Amino acids*†  
A C D E F G H I K L4 L2 M N P Q R4 R2 S4 S2 T V W Y Species (number of exons)
+2    -4 -5   +7 -3 -1   -2 -8 -6 +1 +4 +3 -7   +5 +6     Human (178,438)
+2    -4 -5   +7 -3 -1   -2 -7 -6 +1 +4 +3    +5 +6     Mouse (126,268)
         -2   -1    +2 +3 +1    +5 +4 -3    D. rerio (41,264)
  -3   +2   +4    +1       +5 -1 +3 -2   -4    -5 C. elegans (79,958)
  -5   +4      +3 -2 +2     +5 -1 +1 -3   -4   -6   C. briggsae (74,178)
           -1              A. gambiae (7,930)
+1       +3   -1   -3     +2    -2     -4   D. melanogaster (48,933)
+1   -3 -2    +4   -1    -4     +3   +2    -5 -6   A. mellifera (45,426)
                  +1 +3 +2     A. thaliana (109,900)
                        S. pombe (2,403)
                        S. cerevisiae (417)
*Indices signify rank order of slope coefficients, separately for negative and positive trends. L2, R2, S2 and L4, R4, S4 signify the two-fold and four-fold degenerate blocks of leucine, arginine, and serine, respectively. S. cerevisiae terminal exons were retained given the small number of genes with more than one intron (eight).
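
The rank indices described in the footnote can be illustrated with a minimal sketch. It assumes that only significant slope coefficients are passed in and that rank 1 goes to the slope of largest magnitude within each sign group; the source does not state the direction of the ranking, so that choice is an assumption. The function name `rank_signed_slopes` and the example values are hypothetical and are not data from the table.

```python
def rank_signed_slopes(slopes):
    """Map each amino acid to a signed rank label such as '+1' or '-3'.

    `slopes` maps amino acid -> slope coefficient and is assumed to hold
    only significant trends. Positive and negative slopes are ranked
    separately. ASSUMPTION: rank 1 is the largest-magnitude slope in
    its sign group.
    """
    labels = {}
    for sign, keep in (("+", lambda s: s > 0), ("-", lambda s: s < 0)):
        # Collect the slopes of one sign, strongest trend first.
        group = [(aa, s) for aa, s in slopes.items() if keep(s)]
        group.sort(key=lambda item: abs(item[1]), reverse=True)
        for rank, (aa, _) in enumerate(group, start=1):
            labels[aa] = f"{sign}{rank}"
    return labels


# Made-up slope values, for illustration only.
example = {"A": 0.012, "K": -0.030, "E": -0.021, "P": 0.018}
print(rank_signed_slopes(example))
```

Ranking the two sign groups independently is why a single row of the table can contain both a +1 and a -1.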