Table 1 KOGs and TWOGs with unexpected phyletic patterns (examples)

From: A comprehensive evolutionary classification of proteins encoded in complete eukaryotic genomes

| KOG/TWOG number | Phyletic pattern* | (Predicted) structure and function | Prokaryotic homologs | Comments |
| --- | --- | --- | --- | --- |
| TWOG0892 | ---H--E | Discoidin domain protein, potential regulator of proteasome activity | Detected in a few phylogenetically scattered bacteria, no COG so far | [69] |
| TWOG0263 | A-----E | ATP/ADP translocase | ATP/ADP translocases of chlamydia, rickettsia, Xylella fastidiosa | ATP/ADP translocase is a hallmark of intracellular parasites and symbionts, which allows them to scavenge ATP from the host cell; chloroplast protein in plants. Could have been acquired by plants and microsporidia via independent HGT from bacteria [58]. |
| TWOG0689 | ---HY-- | Uncharacterized protein essential for propionate metabolism | PrpD protein of several bacteria and archaea (COG2079) | The yeast and human proteins (and the orthologs from other vertebrates) show the greatest similarity to different subsets of bacterial orthologs, which might suggest independent HGT events. |
| TWOG0871 | ---H-P- | Uncharacterized conserved protein, probably an enzyme | COG4336, sporadic representation in several bacterial lineages | The human (and mouse) protein has an additional domain conserved in the archaeon Pyrococcus. Human and S. pombe proteins are most similar to different subsets of bacterial homologs, which suggests the possibility of independent HGT events. |
| TWOG0788 | A----P- | Urease | Ureases of many bacterial species | Highly conserved enzyme present in plants and many fungi but not S. cerevisiae. Plant and fungal ureases have a common domain architecture distinct from that of bacterial orthologs, which suggests monophyletic origin. Might have evolved via early HGT from bacteria (proto-mitochondria?) with subsequent loss in animals and some fungi. |
| 4751 | A--H--E | Recombination repair protein BRCA2, contains varying number of BRCA2 repeats | None | Although sequence conservation is limited to the BRC repeats [101], the number of which varies substantially, the statistical significance of the observed sequence similarity and the absence of other homologs suggest that the proteins in this KOG are true orthologs. Apparent orthologs of BRCA2 are also detectable in other species from the taxa represented in the KOGs (mosquito Anopheles gambiae, fungus Ustilago maydis) [102] and in early-branching eukaryotes (Leishmania, Trypanosoma; E.V.K., unpublished work), suggesting that evolution of BRCA2 involved multiple gene losses. |
| 4597 | A--H--E | TATA-binding protein 1-interacting protein | None | Probable multiple gene losses |
| 4486 | A--H--E | 3-methyl-adenine DNA glycosylase | Orthologs in many bacteria (COG2094) | The plant protein and those from mammals and microsporidia show the greatest similarity to different subsets of bacterial orthologs. Evolution might have included a combination of gene loss and independent HGT events. |
| 1594 | A-D-Y-- | Predicted epimerase related to aldose 1-epimerase | Bacterial orthologs, primarily proteobacteria (COG0676) | Eukaryotic proteins are more closely related to each other than to bacterial orthologs, indicating monophyletic origin. Function remains unknown; might be involved in a distinct and still uncharacterized pathway of polysaccharide biosynthesis. LSE in Arabidopsis (seven paralogs). |
| 4141 | ---HYPE | Rad52/22, protein involved in double-strand break repair | None | Probable gene loss in plants, insects and nematodes |
| 4528 | -CDH--E | Uncharacterized predicted enzyme, possibly a polynucleotide kinase (structure of the ortholog from the bacterium Thermotoga maritima has been determined; PDB code 1j5u) | Conserved in all archaea and several bacteria (COG1371) | Context analysis of archaeal and bacterial genomes suggests functional interaction between proteins of KOG5324 and KOG4246, RNA 3'-terminal phosphate cyclase (KOG4398, COG0430), and tRNA/rRNA cytosine C5-methylase (KOG1299/COG0144) ([103] and E.V.K., unpublished observations). Taken together, the observations appear to implicate KOG5324 and KOG4246 in a still uncharacterized pathway of rRNA and/or tRNA processing and modification. Conservation of these proteins in archaea and early-branching eukaryotes suggests lineage-specific gene loss in plants and fungi. |
| 3833 | -CDH--E | Uncharacterized predicted enzyme, possibly a polynucleotide phosphatase | Conserved in all archaea and several bacteria (COG1690) | See comment for KOG5324 |
*Abbreviations: A, thale cress A. thaliana; C, nematode C. elegans; D, fruit fly D. melanogaster; E, microsporidian Encephalitozoon cuniculi; H, Homo sapiens; Y, budding yeast S. cerevisiae; P, fission yeast S. pombe. A letter indicates the presence of the respective species in the given KOG and a dash indicates its absence.
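
The phyletic-pattern notation can be read mechanically. The following is a minimal Python sketch, not part of the original table: the function name `decode_pattern` and the constant `SPECIES_ORDER` are hypothetical, and the position order (A, C, D, H, Y, P, E, with Y standing for budding yeast as in the table's patterns) is an assumption inferred from the patterns listed in Table 1 rather than stated in the source.

```python
# Assumed position order, inferred from the patterns in Table 1
# (e.g. "-CDH--E", "---HYPE"); the source does not state it explicitly.
SPECIES_ORDER = [
    ("A", "Arabidopsis thaliana"),
    ("C", "Caenorhabditis elegans"),
    ("D", "Drosophila melanogaster"),
    ("H", "Homo sapiens"),
    ("Y", "Saccharomyces cerevisiae"),
    ("P", "Schizosaccharomyces pombe"),
    ("E", "Encephalitozoon cuniculi"),
]


def decode_pattern(pattern):
    """Split a phyletic pattern into (present, absent) lists of species names."""
    if len(pattern) != len(SPECIES_ORDER):
        raise ValueError(f"expected {len(SPECIES_ORDER)} characters, got {len(pattern)}")
    present, absent = [], []
    for symbol, (letter, species) in zip(pattern, SPECIES_ORDER):
        if symbol == letter:
            # A letter marks the species as represented in the KOG.
            present.append(species)
        elif symbol == "-":
            # A dash marks the species as missing from the KOG.
            absent.append(species)
        else:
            raise ValueError(f"unexpected symbol {symbol!r} where {letter!r} or '-' was expected")
    return present, absent


if __name__ == "__main__":
    present, absent = decode_pattern("---HYPE")
    print("present:", ", ".join(present))
    print("absent:", ", ".join(absent))
```

Under these assumptions, decoding `---HYPE` (the Rad52/22 row) reports the plant, nematode and fruit fly as absent, which agrees with that row's comment about probable gene loss in plants, insects and nematodes.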