Table 1 Abridged table of mouse-human genome-wide codon conservation frequencies, as a function of each of the 20 × 20 pairs of aligned amino acids

From: COMIT: identification of noncoding motifs under selection in coding sequences

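| AA1 | AA2 | # | 000 | 001 | 010 | 011 | 100 | 101 | 110 | 111 |
|-----|-----|---------|-------|-------|-------|-------|-------|-------|-------|-------|
| F | F | 308,260 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.202 | 0.798 |
| F | S | 3,951 | 0.028 | 0.042 | 0.000 | 0.000 | 0.337 | 0.593 | 0.000 | 0.000 |
| F | T | 716 | 0.457 | 0.543 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 |
| F | N | 220 | 0.377 | 0.623 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 |
| F | K | 160 | 1.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| S | F | 3,660 | 0.022 | 0.050 | 0.000 | 0.000 | 0.322 | 0.607 | 0.000 | 0.000 |
| S | S | 616,045 | 0.004 | 0.003 | 0.000 | 0.000 | 0.000 | 0.000 | 0.302 | 0.691 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| G | R | 7,924 | 0.000 | 0.000 | 0.393 | 0.607 | 0.000 | 0.000 | 0.000 | 0.000 |
| G | G | 521,714 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.327 | 0.673 |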

  1. For a given pair of amino acids, there are eight possible conservation patterns for the underlying nucleotides (000, 001, 010, 011, 100, 101, 110, 111), where 1 means a conserved base and 0 means a non-conserved base. These frequencies provide a null model for the expected conservation patterns at the nucleotide level, given the amino acid sequence. Here '#' indicates the number of instances in which amino acid 1 (AA1) is aligned to AA2 in the complete set of coding alignments between mouse and human.
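
To make the footnote concrete, here is a minimal sketch of how such a table could be tallied from aligned codon pairs. It is an illustration, not the authors' implementation: the function names `conservation_pattern` and `pattern_frequencies`, the toy `CODON_TO_AA` lookup (only a few codons of the standard genetic code), and the example input are all assumptions made for this sketch.

```python
from collections import Counter, defaultdict

# Toy subset of the standard genetic code (assumption: enough for the demo only).
CODON_TO_AA = {
    "TTT": "F", "TTC": "F",
    "TCT": "S", "TCC": "S",
    "AAA": "K", "AAG": "K",
}


def conservation_pattern(codon_a: str, codon_b: str) -> str:
    """Return a 3-character pattern: '1' where the bases match, '0' otherwise."""
    if len(codon_a) != 3 or len(codon_b) != 3:
        raise ValueError("codons must be exactly three bases long")
    return "".join(
        "1" if a == b else "0"
        for a, b in zip(codon_a.upper(), codon_b.upper())
    )


def pattern_frequencies(aligned_codons):
    """Tally pattern frequencies for each (AA1, AA2) pair.

    Returns {(aa1, aa2): (count, {pattern: frequency})}, where count
    plays the role of the '#' column in the table.
    """
    counts = defaultdict(Counter)
    for codon_a, codon_b in aligned_codons:
        pair = (CODON_TO_AA[codon_a.upper()], CODON_TO_AA[codon_b.upper()])
        counts[pair][conservation_pattern(codon_a, codon_b)] += 1

    result = {}
    for pair, pattern_counts in counts.items():
        total = sum(pattern_counts.values())
        result[pair] = (total, {p: n / total for p, n in pattern_counts.items()})
    return result


# Hypothetical aligned codon pairs (first sequence, second sequence).
example = [("TTT", "TTT"), ("TTT", "TTC"), ("TTC", "TTC"), ("TTT", "TCT")]
table = pattern_frequencies(example)
```

In this toy input, the F-F pairs can only produce patterns 110 and 111 because both phenylalanine codons share their first two bases, which mirrors the F-F row of the table.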