Fig. 4From: Terminating contamination: large-scale search identifies more than 2,000,000 contaminated entries in GenBankMultiple sequence alignment of 31 spurious bacterial proteins encoded on short contaminated contigs. Shown here are 31 out of 185 spurious proteins from bacterial genomes. A majority of the sequences are 100 % identical. The only differing residues are highlighted in white. This highly conserved “protein” is conserved on across different bacterial phyla, suggesting it is likely a contaminant that has been erroneously translated as part of automated annotation procedures. The respective short contigs (< 1 kb) encoding these spurious proteins align with high sequence identity and coverage to the Ovis aries genomeBack to article page