
Table 1 Interkingdom domain fusions and their probable origins

From: Interkingdom gene fusions

IKF gene (GI number and gene name) and origin of domains

Best 'native' hit (E-value, amino acid residue range, species)/domain function

Best 'alien' hit (E-value, amino acid residue range, species)/domain function

Protein function

Stand-alone paralog of the alien domain

Comment

Archaea

     

   Aeropyrum pernix

     

IKF gene: 5106104_APE2400; Archaeal-bacterial
Best 'native' hit: 2621953_Mth; 5e-27; 282-445; uncharacterized domain conserved among archaea (homolog of the amino-terminal domain of sialic acid synthase)
Best 'alien' hit: 2633525_Bs; 4e-54; 16-272; hydroxymethylpyrimidine phosphate kinase
Protein function: Hydroxymethylpyrimidine phosphate kinase involved in thiamine biosynthesis (additional function?)
Stand-alone paralog of the alien domain: None
Comment: Pyrococci encode proteins with the same domain organization and closest similarity to A. pernix; M. jannaschii encodes a protein with the same domain organization but low similarity; Mt encodes a HMP-kinase with moderate similarity

   Methanococcus jannaschii

     

IKF gene: 1591138_MJ0434; Archaeal-bacterial-eukaryotic
Best 'native' hit: 2128140_Mj; 1e-19; 2-94; uncharacterized domain
Best 'alien' hit: 7270033_At; 0.003; 120-222; AIG2-like stress-related protein
Protein function: Unknown; possible role in stress response
Stand-alone paralog of the alien domain: None
Comment: The amino-terminal domain is present in several stand-alone copies in M. jannaschii, but otherwise, is seen mostly in bacteria; the possibility of acquisition of a bacterial gene by the Methanococcus lineage is conceivable

   Methanobacterium thermoautotrophicum

     

IKF gene: 2621249_MTH204; Archaeal-eukaryotic/bacterial
Best 'native' hit: 5103547_Ap; 1e-34; 137-326; 5-formyltetrahydrofolate cyclo-ligase
Best 'alien' hit: 1651798_Ssp; 0.002; 8-139; uncharacterized membrane-associated domain
Protein function: Membrane-associated 5-formyltetrahydrofolate cyclo-ligase(?); exact function unknown
Stand-alone paralog of the alien domain: None
Comment: In Ssp, the amino-terminal domain is fused to another uncharacterized domain. An ortholog with conserved domain organization is seen in Mycobacterium, but many other bacteria encode stand-alone versions of this domain, which could be the actual sources of horizontal gene transfer

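IKF gene: 2621673_MTH594; Archaeal-bacterial
Best 'native' hit: 3256572_Ph; 3e-10; 5-137; inactivated RecA domain
Best 'alien' hit: 2984130_Aa; 6e-19; 233-390; GTPase
Protein function: GTPase, possible role in signal transduction
Stand-alone paralog of the alien domain: 2621855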

IKF gene: 2622642_MTH1523; Archaeal-bacterial
Best 'native' hit: 5105992_Ap; 3e-36; 5-226; glucose-1-phosphate thymidylyl transferase
Best 'alien' hit: 2569943_Axy; 2e-05; 226-334; mannose-6-phosphate isomerase
Protein function: Glucose-1-phosphate thymidylyl transferase/glucose-6-phosphate isomerase
Stand-alone paralog of the alien domain: None

Bacteria

   Aquifex aeolicus

     

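IKF gene: 2983622_aq_1151; Bacterial-archaeal
Best 'native' hit: 2633696_Bs; 5e-65; 325-795; c-di-GMP phosphodiesterase
Best 'alien' hit: 2650176_Af; 0.005; 116-279; PAS/PAC domain
Protein function: Signal transduction c-di-GMP phosphodiesterase
Stand-alone paralog of the alien domain: None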

IKF gene: 2984285_aq_2060; Bacterial-archaeal
Best 'native' hit: 586875_Bs; 4e-63; 1-252; PHP superfamily hydrolase
Best 'alien' hit: 3915955_Mj; 3e-09; 270-441; pyruvate formate-lyase activating enzyme (Fe-S cluster oxidoreductase)
Protein function: Molybdenum cofactor biosynthesis enzyme(?)
Stand-alone paralog of the alien domain: None

   Bacillus subtilis

     

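IKF gene: 2632283_yaaH, 1945087_ydhD; Bacterial-eukaryotic
Best 'native' hit: 4980914_Tm; 1e-06; 2-92; LysM repeat domain
Best 'alien' hit: 399377_Rn; 2e-11; 221-402; chitinase
Protein function: Chitinase
Stand-alone paralog of the alien domain: 2635915
Comment: B. subtilis encodes two paralogous proteins with the same domain architecture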

IKF gene: 2633242_yhcR; Bacterial-archaeal
Best 'native' hits: (1) 645819_Dr; 1e-64; 584-1068; 5'-nucleotidase; (2) 1175987_ECR100; 2e-09; 377-521; thermonuclease
Best 'alien' hit: 2622704_Mth; 0.008; 151-257; nucleic acid-binding domain (OB-fold)
Protein function: Nuclease-nucleotidase (probable repair enzyme)
Stand-alone paralog of the alien domain: None

IKF gene: 2632325_yabN; Bacterial-eukaryotic
Best 'native' hit: 4981449_Tm; 2e-62; 223-483; MazG (predicted pyrophosphatase)
Best 'alien' hit: 3873806_Ce; 0.003; 7-125; SAM-dependent methyltransferase
Protein function: Methyltransferase/pyrophosphatase (metabolic enzyme of an unknown pathway?)
Stand-alone paralog of the alien domain: None
Comment: Other than in chlamydiae, the SWI domain is seen in eukaryotic chromatin-associated proteins, leading to the suggestion that chlamydial topoisomerase is involved in chromosome condensation

   Chlamydophila pneumoniae

     

IKF gene: 4377077_CPn0769; Bacterial-eukaryotic
Best 'native' hit: 730965_Bs; e-148; 1-727; DNA topoisomerase I
Best 'alien' hit: 3581917_Sp; 3e-10; 792-866; SWI domain
Protein function: DNA topoisomerase I, possibly involved in chromatin condensation
Stand-alone paralog of the alien domain: 7189103
Comment: SWI is a typical eukaryotic domain not found in prokaryotes other than chlamydia (the ortholog in Chlamydia trachomatis has the same domain architecture)

   Deinococcus radiodurans

     

IKF gene: 6459294_DR1533; Bacterial-eukaryotic
Best 'native' hit: 7248325_Sco; 0.001; 171-265; McrA family endonuclease
Best 'alien' hit: 6754878_Mm; 9e-28; 4-148; G9a domain (DNA-binding?)
Protein function: DNase
Stand-alone paralog of the alien domain: None
Comment: The G9a domain is not detectable in other prokaryotes. In eukaryotes, this domain so far has been found only as part of multidomain nuclear proteins, including transcription factors

   Escherichia coli

     

IKF gene: 1787179_b0947; Bacterial-eukaryotic
Best 'native' hit: 94933_Ppu; 3e-10; 287-367; ferredoxin
Best 'alien' hit: 3747107_Rn; 3e-32; 4-261; uncharacterized domain (thiol oxidoreductase?)
Protein function: Oxidoreductase
Stand-alone paralog of the alien domain: None
Comment: The eukaryotic domain is present (as a partial sequence) also in the beta-proteobacterium Vogesella. This domain contains a conserved pair of cysteines, which, together with the ferredoxin fusion, may suggest a thiol oxidoreductase activity. Most of the eukaryotic proteins containing this domain appear to be mitochondrial, suggesting the possibility of an alternative evolutionary scenario

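IKF gene: 1787678_b1410; Bacterial-archaeal
Best 'native' hit: 487713_Sli; 3e-05; 408-522; SAM-dependent methyltransferase
Best 'alien' hit: 5459012_Pab; 1e-17; 33-274; lysophospholipase
Protein function: Methyltransferase/lipase (exact function unclear)
Stand-alone paralog of the alien domain: None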

IKF gene: 1787679_ynbD; Archaeal-eukaryotic
Best 'native' hit: 1591375_Mj; 4e-04; 50-218; membrane-associated acid phosphatase
Best 'alien' hit: 7160233_Sp; 1e-06; 346-415; tyrosine phosphatase
Protein function: Membrane-associated bifunctional phosphatase
Stand-alone paralog of the alien domain: None
Comment: An unusual case of fusion between an apparently archaeal and a typical eukaryotic domain in a bacterium

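IKF gene: 1788589_b2255; Bacterial-eukaryotic
Best 'native' hit: 5763950_Sco; 4e-35; 1-259; methionyl-tRNA formyltransferase
Best 'alien' hit: 3860247_At; 1e-55; 318-652; dTDP-glucose 4-6-dehydratase
Protein function: Bifunctional enzyme; exact function unclear
Stand-alone paralog of the alien domain: None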

IKF gene: 1788938_yfiQ; Bacterial-archaeal/eukaryotic
Best 'native' hit: 929735_Nsp; 8e-32; 637-874; acetyltransferase
Best 'alien' hit: 2649370_Af; 4e-85; 6-689; acetyl-CoA synthetase
Protein function: Acetyl-CoA synthetase/acetyltransferase; exact function unclear
Stand-alone paralog of the alien domain: None

   Mycobacterium tuberculosis

     

IKF gene: 2909507_Rv2488c, 2791528_Rv1358, 1419061_Rv1358; Bacterial-eukaryotic
Best 'native' hits: (1) 6469244_Sco; 5e-64; 19-603; (2) 4726088_Rer; 2e-12; 818-1073
Best 'alien' hit: 4151109_Tbr; 6e-04; 6-167; adenylate cyclase
Protein function: Adenylate cyclase/ATPase; probable transcription regulator
Stand-alone paralog of the alien domain: 7476546, 7476738
Comment: M. tuberculosis encodes three paralogous proteins that consist of three domains, the eukaryotic-type adenylate cyclase, AP (apoptotic) ATPase and DNA-binding response regulator, and two stand-alone versions of adenylate cyclase, which show the closest similarity to the cyclase domain of the multidomain proteins

IKF gene: 1314025_Rv0886; Bacterial-eukaryotic
Best 'native' hit: 120037_Tt; 1e-11; 2-79; ferredoxin
Best 'alien' hit: 178213_Hs; 4e-65; 93-543; ferredoxin reductase
Protein function: Ferredoxin/ferredoxin reductase
Stand-alone paralog of the alien domain: 2076681
Comment: D. radiodurans also encodes the eukaryotic-type ferredoxin reductase, but the ferredoxin fusion is unique to mycobacteria

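IKF gene: 3261732_Rv0998; Bacterial-eukaryotic
Best 'native' hit: 2661695_Sco; 3e-13; 148-328; acetyltransferase
Best 'alien' hit: 279520_Dd; 7e-07; 30-105; cAMP-binding domain
Protein function: cAMP-dependent acetyltransferase(?)
Stand-alone paralog of the alien domain: 4455714 (M. leprae)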

IKF gene: 2326726_Rv1683; Bacterial-eukaryotic
Best 'native' hit: 421331_Cvi; 1e-24; 23-359; poly(3-hydroxybutyrate) synthase
Best 'alien' hit: 2645721_Mm; 6e-26; 456-972; very-long-chain acyl-CoA synthetase
Protein function: Bifunctional enzyme of poly(3-hydroxybutyrate) synthesis
Stand-alone paralog of the alien domain: 1929080

IKF gene: 1403447_Rv2006; Bacterial-eukaryotic
Best 'native' hits: (1) 6752338_Sco; 2e-27; 23-240; phosphatase; (2) 6448751_Sco; 0.0; 534-1320; trehalose hydrolase
Best 'alien' hit: 3892714_At; 8e-27; 264-521; trehalose-6-phosphate phosphatase
Protein function: Polyfunctional enzyme of trehalose metabolism
Stand-alone paralog of the alien domain: 2661651
Comment: In this protein, the domain of apparent eukaryotic origin is flanked by bacterial domains on both sides

IKF gene: 2896788_Rv2051c; Bacterial-eukaryotic
Best 'native' hit: 117648_Ec; 1e-16; 94-514; apolipoprotein N-acyltransferase
Best 'alien' hit: 3073773_Mm; 4e-31; 588-829; dolichol-phosphate-mannose synthase
Protein function: Polyfunctional enzyme of lipid metabolism
Stand-alone paralog of the alien domain: 2337823 (M. leprae); 6468712 (Streptomyces coelicolor)
Comment: The presence of the stand-alone version of the eukaryotic domain in Streptomyces suggests an ancient horizontal transfer

IKF gene: 2791523_Rv2483c; Bacterial-eukaryotic
Best 'native' hit: 6225563_Scy; 7e-16; 36-253; phosphoserine phosphatase
Best 'alien' hit: 1098605_Cnu; 5e-22; 289-492; 1-acyl-sn-glycerol-3-phosphate acyltransferase
Protein function: Multifunctional enzyme of phospholipid metabolism
Stand-alone paralog of the alien domain: None

IKF gene: 2894233_Rv3323c; Bacterial-eukaryotic
Best 'native' hit: 2633801_Bs; 3e-19; 89-208; molybdopterin synthase large subunit (MoaE)
Best 'alien' hit: 4538974_At; 7e-06; 2-82; molybdopterin synthase small subunit (MoaD)
Protein function: Molybdopterin synthase
Stand-alone paralog of the alien domain: 2076687
Comment: The same domain organization is seen in D. radiodurans, but in this case, both components appear to be of bacterial origin

IKF gene: 2960152_Rv3728, 7477551_Rv3239c; Bacterial-eukaryotic
Best 'native' hit: 4753872_Sco; 1e-35; 56-428; transmembrane efflux protein
Best 'alien' hit: 466119_Ce; 7e-20; 549-964; cAMP-binding domain-phosphoesterase
Protein function: cAMP-regulated efflux pump(?)
Stand-alone paralog of the alien domain: 2501688
Comment: M. tuberculosis encodes two strongly similar paralogs with the same domain architecture

IKF gene: 2960153_Rv3729; Bacterial-archaeal
Best 'native' hit: 4731342_Sl; 3e-14; 510-776; C5-O-methyltransferase (mitomycin biosynthesis)
Best 'alien' hit: 1591330_Mj; 3e-58; molybdenum cofactor biosynthesis protein MoaA (Fe-S oxidoreductase)
Protein function: Bifunctional enzyme of molybdenum cofactor biosynthesis
Stand-alone paralog of the alien domain: 1806159
Comment: The amino-terminal domain stand-alone paralog is more similar to archaeal homologs than to the stand-alone paralog, but nevertheless, the latter appears to be of archaeal origin

IKF gene: 3261806_Rv3811; Bacterial-eukaryotic
Best 'native' hit: 40487_Cg; 3e-12; 404-494; major secreted protein
Best 'alien' hit: 7304009_Dm; 2e-12; 198-384; peptidoglycan recognition protein
Protein function: Secreted protein
Stand-alone paralog of the alien domain: 7649504 (S. coelicolor)
Comment: The stand-alone version of the eukaryotic domain is present only in Streptomyces

   Treponema pallidum

     

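IKF gene: 3322964_TP0667; Bacterial-eukaryotic
Best 'native' hit: 7225946_Nm; 9e-04; 10-154; threonyl-tRNA synthetase (TGS and H3H domains)
Best 'alien' hit: 320868_Sc; 2e-13; 290-488; uridine kinase
Protein function: Uridine kinase
Stand-alone paralog of the alien domain: None
Comment: A co-linear ortholog is present in Thermotoga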

   Thermotoga maritima

     

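IKF gene: 4981276_TM0751; Bacterial-eukaryotic
Best 'native' hit: 68516_Bs; 3e-07; 11-200; threonyl-tRNA synthetase (TGS and H3H domains)
Best 'alien' hit: 3218401_Sp; 2e-11; 288-475; uridine kinase
Protein function: Uridine kinase
Stand-alone paralog of the alien domain: None
Comment: A co-linear ortholog is present in Treponema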

Eukaryotes

     

   Saccharomyces cerevisiae

     

IKF gene: 536367_Ybr094w; Eukaryotic/bacterial-archaeal
Best 'native' hit: 586134_Bt; 9e-10; tubulin-tyrosine ligase
Best 'alien' hit: 7450047_Aa; 8e-09; acid phosphatase (SurE)
Protein function: Bifunctional signal-transduction protein
Stand-alone paralog of the alien domain: 5249 (Yarrowia lipolytica)
Comment: SurE homologs are not detectable in eukaryotes other than yeasts

IKF gene: 1431219_YDL141w; Eukaryotic-bacterial
Best 'native' hit: 577625_Hs; 1e-39; biotin-[propionyl-CoA-carboxylase (ATP-hydrolysing)] ligase
Best 'alien' hit: 3328426_Ct; 5e-27; biotin protein ligase
Protein function: Bifunctional biotin-protein ligase
Stand-alone paralog of the alien domain: None
Comment: An ortholog with an identical domain architecture is present in S. pombe

IKF gene: 458922_YHR206W; Eukaryotic-bacterial
Best 'native' hit: 477096_Gg; 8e-18; 78-216; heat shock transcription factor domain
Best 'alien' hit: 1653075_Ssp; 7e-17; 375-503; CheY domain
Protein function: Heat shock transcription factor
Stand-alone paralog of the alien domain: None
Comment: An ortholog with an identical domain architecture is present in S. pombe (3327019)

IKF gene: 486539_YKR069w; Eukaryotic-bacterial
Best 'native' hit: 1146165_At; 3e-34; 249-556; uroporphyrin III methylase
Best 'alien' hit: 2983676_Aa; 1e-04; 22-188; precorrin-2 oxidase
Protein function: Siroheme synthase
Stand-alone paralog of the alien domain: 2330809 (S. pombe)
Comment: S. pombe also encodes a co-linear ortholog (3581882); apparent displacement of the bacterial precorrin-2 oxidase by a distinct Rossmann fold domain

IKF gene: 1302305_YNL256w; Eukaryotic-bacterial
Best 'native' hit: 4938476_At; 5e-65; 324-861; 7,8-dihydro-6-hydroxymethylpterin-pyrophosphokinase + dihydropteroate synthase
Best 'alien' hit: 3212189_Hi; 5e-05; 62-148; 187-297; dihydroneopterin aldolase
Protein function: Multifunctional enzyme of folate biosynthesis
Stand-alone paralog of the alien domain: None
Comment: Co-linear orthologs in S. pombe (7490442) and Pneumocystis carinii (283062)

IKF gene: 1419887_YOL066c; Eukaryotic-bacterial
Best 'native' hit: 7297709_Dm; 2e-72; 42-408; large ribosomal subunit pseudoU synthase
Best 'alien' hit: 5918510_Sco; 2e-10; 436-574; pyrimidine deaminase
Protein function: Bifunctional RNA modification enzyme
Stand-alone paralog of the alien domain: 2213559 (S. pombe)
Comment: The known bacterial homologs have a two-domain organization; the evolutionary scenario could have included domain rearrangements

IKF gene: 1419865_YOL055c, 2132251_YPL258c, 2132289_YPR121w; Eukaryotic-bacterial
Best 'native' hit: 2462827_At; 1e-39; 22-390; phosphomethylpyrimidine kinase (thiamine biosynthesis)
Best 'alien' hit: 1075360_Hi; 6e-24; 342-549; transcriptional activator
Protein function: Transcriptional regulator of thiamine biosynthesis genes(?)
Stand-alone paralog of the alien domain: None
Comment: Yeast encodes three strongly similar paralogs with identical domain organization; co-linear orthologs are present in other ascomycetes

IKF gene: 1370444_YPL214c; Eukaryotic-archaeal/bacterial
Best 'native' hit: 2746079_Bn; 1e-27; 9-233; thiamin-phosphate pyrophosphorylase
Best 'alien' hit: 2648451_Af; 9e-27; 251-531; hydroxyethylthiazole kinase
Protein function: Bifunctional thiamine biosynthesis enzyme
Stand-alone paralog of the alien domain: None
Comment: Except for the one from A. fulgidus, all highly conserved homologs of the kinase domain of this protein are bacterial; it appears likely that the A. fulgidus gene is the result of horizontal transfer

  1. The following complete genomes were analyzed. Archaea: Aeropyrum pernix (Ap); Archaeoglobus fulgidus (Af); Methanococcus jannaschii (Mj); Methanobacterium thermoautotrophicum (Mth); Pyrococcus horikoshii (Ph). Bacteria: Aquifex aeolicus (Aa); Borrelia burgdorferi (Bb); Bacillus subtilis (Bs); Chlamydophila pneumoniae (Cp); Deinococcus radiodurans (Dr); Escherichia coli (Ec); Haemophilus influenzae (Hi); Helicobacter pylori (Hp); Mycobacterium tuberculosis (Mt); Mycoplasma pneumoniae (Mp); Rickettsia prowazekii (Rp); Synechocystis sp. (Ssp); Thermotoga maritima (Tm); Treponema pallidum (Tp). No IKFs were detected in the genomes that are not shown in the table. Additional species name abbreviations: At, Arabidopsis thaliana; Axy, Acetobacter xylinus; Bn, Brassica napus; Ce, Caenorhabditis elegans; Cvi, Chromatium vinosum; Gg, Gallus gallus; Hs, Homo sapiens; Mm, Mus musculus; Rn, Rattus norvegicus; Sco, Streptomyces coelicolor; Sl, Streptomyces lavendulae.
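
The hit identifiers in the table combine a GI number with a species abbreviation (for example, 2621953_Mth), and the E-values are written in a compact form (for example, 5e-27; or e-148;). The following is a minimal Python sketch of how such cells could be read programmatically. The names `SPECIES`, `Hit`, `parse_hit` and `parse_evalue` are hypothetical helpers introduced only for illustration, the abbreviation map is a subset copied from the footnote, and reading a bare "e-148" as 1e-148 is an assumption rather than something the table states.

```python
import re
from typing import NamedTuple, Optional

# Subset of the species abbreviations listed in the table footnote.
SPECIES = {
    "Ap": "Aeropyrum pernix",
    "Af": "Archaeoglobus fulgidus",
    "Mj": "Methanococcus jannaschii",
    "Mth": "Methanobacterium thermoautotrophicum",
    "Ph": "Pyrococcus horikoshii",
    "Aa": "Aquifex aeolicus",
    "Bs": "Bacillus subtilis",
    "Ec": "Escherichia coli",
    "Hi": "Haemophilus influenzae",
    "Tm": "Thermotoga maritima",
    "At": "Arabidopsis thaliana",
    "Sco": "Streptomyces coelicolor",
}


class Hit(NamedTuple):
    gi: int                  # GI number of the database hit
    abbrev: str              # species abbreviation as printed in the table
    species: Optional[str]   # full name, or None if not in SPECIES


_HIT_RE = re.compile(r"(\d+)_([A-Za-z]+)")


def parse_hit(cell: str) -> Hit:
    """Split a 'GI_Species' cell such as '2621953_Mth;' into its parts."""
    match = _HIT_RE.fullmatch(cell.strip().rstrip(";"))
    if match is None:
        raise ValueError(f"not a GI_species identifier: {cell!r}")
    gi, abbrev = match.groups()
    return Hit(int(gi), abbrev, SPECIES.get(abbrev))


def parse_evalue(cell: str) -> float:
    """Convert an E-value cell such as '5e-27;' or 'e-148;' to a float."""
    text = cell.strip().rstrip(";")
    if text.startswith("e"):
        # Assumption: a bare exponent stands for a mantissa of 1.
        text = "1" + text
    return float(text)


if __name__ == "__main__":
    print(parse_hit("2621953_Mth"))
    print(parse_evalue("e-148;"))
```

Abbreviations that the footnote does not define (for example, Sp) are returned with `species` set to `None` rather than guessed.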