
Table 2 Amino acids significantly preferred (-) or avoided (+) at 5' ends of exons across species

From: Finding exonic islands in a sea of non-coding sequence: splicing related constraints on protein composition and evolution are common in intron-rich genomes

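Amino acids*† are given in columns; empty cells indicate no significant trend.

| Species (number of exons)‡ | A | C | D | E | F | G | H | I | K | L4 | L2 | M | N | P | Q | R4 | R2 | S4 | S2 | T | V | W | Y |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Human (178,438) | +2 | | | -4 | -5 | | +7 | -3 | -1 | | -2 | -8 | -6 | +1 | +4 | +3 | -7 | | +5 | +6 | | | |
| Mouse (126,268) | +2 | | | -4 | -5 | | +7 | -3 | -1 | | -2 | -7 | -6 | +1 | +4 | +3 | | | +5 | +6 | | | |
| D. rerio (41,264) | | | | | | | | | -2 | | -1 | | | +2 | +3 | +1 | | | +5 | +4 | -3 | | |
| C. elegans (79,958) | | -3 | | +2 | | +4 | | | +1 | | | | | | +5 | -1 | +3 | -2 | | -4 | | | -5 |
| C. briggsae (74,178) | | -5 | | +4 | | | | | +3 | -2 | +2 | | | | +5 | -1 | +1 | -3 | | -4 | | -6 | |
| A. gambiae (7,930) | | | | | | | | | | | -1 | | | | | | | | | | | | |
| D. melanogaster (48,933) | +1 | | | | | | +3 | | -1 | | -3 | | | | +2 | | | -2 | | | | -4 | |
| A. mellifera (45,426) | +1 | | -3 | -2 | | | +4 | | -1 | | | -4 | | | | +3 | | +2 | | | -5 | -6 | |
| A. thaliana (109,900) | | | | | | | | | | | | | | | | | | +1 | +3 | +2 | | | |
| S. pombe (2,403) | | | | | | | | | | | | | | | | | | | | | | | |
| S. cerevisiae (417) | | | | | | | | | | | | | | | | | | | | | | | |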

*Indices signify rank order of slope coefficients, separately for negative and positive trends.

†L2, R2, S2 and L4, R4, S4 signify the two-fold and four-fold degenerate blocks of leucine, arginine, and serine, respectively.

‡S. cerevisiae terminal exons were retained given the small number of genes with more than one intron (eight).
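As a rough illustration of how signed rank indices of this kind could be derived, the sketch below ranks slope coefficients separately for negative and positive trends. The function name `signed_rank_indices` and the slope values are hypothetical, invented for illustration. The choice that rank 1 marks the slope of largest magnitude within each sign group is an assumption; the table notes do not state the direction of the ranking.

```python
def signed_rank_indices(slopes):
    """Map {label: slope} to {label: '+k' or '-k'}.

    Assumption (not stated in the table notes): within each sign group,
    rank 1 goes to the slope with the largest absolute value.
    Labels with a slope of exactly zero receive no index.
    """
    # Positive trends: largest slope first.
    positive = sorted((k for k, v in slopes.items() if v > 0),
                      key=lambda k: slopes[k], reverse=True)
    # Negative trends: most negative slope first.
    negative = sorted((k for k, v in slopes.items() if v < 0),
                      key=lambda k: slopes[k])

    indices = {}
    for rank, label in enumerate(positive, start=1):
        indices[label] = f"+{rank}"
    for rank, label in enumerate(negative, start=1):
        indices[label] = f"-{rank}"
    return indices


# Hypothetical slope coefficients, invented purely for illustration;
# they are not taken from the study.
example_slopes = {"K": -0.030, "L2": -0.020, "P": 0.025, "R4": 0.010, "G": 0.0}
example_indices = signed_rank_indices(example_slopes)
```

In this sketch only labels with a non-zero slope receive an index. In the table, by contrast, a cell is filled only where the trend was significant, so a real pipeline would filter on significance before ranking.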