
Table 4 Mean percentage amino acid identity of OMZ 50-m reads with top matches to distinct reference databases (GOS, KEGG, NCBI-nr) and with ribosomal proteins removed

From: Community transcriptomics reveals universal patterns of protein sequence conservation in natural microbial communities

   

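Columns 3 to 6 give the percentage identity to reference genes present in (b) each gene set.

| Database (a) | Data | DNA+RNA (c) | DNA only (d) | RNA only | All (e) |
|---|---|---|---|---|---|
| All data | | | | | |
| GOS | DNA | 89.3 | 82.1 | NA | 82.8 |
| GOS | RNA | 90.8 | NA | 87.5 | 89.3 |
| KEGG | DNA | 67.8 | 58.3 | NA | 59.7 |
| KEGG | RNA | 71.0 | NA | 69.4 | 69.6 |
| NR | DNA | 71.0 | 59.8 | NA | 60.8 |
| NR | RNA | 73.8 | NA | 72.2 | 72.7 |
| Without ribosomal proteins (f) | | | | | |
| NR | DNA | 70.7 | 59.6 | NA | 60.6 |
| NR | RNA | 73.6 | NA | 71.9 | 72.5 |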

(a) BLAST database against which reads were compared.
(b) Mean percentage identity across all genes identified via BLASTX against NCBI-nr (HSP alignment regions only; bit score cutoff = 50).
(c) Genes present in both DNA and RNA datasets, that is, 'expressed' genes.
(d) Genes present only in the DNA dataset, that is, 'non-expressed' genes.
(e) Genes shared between datasets (in DNA + RNA) plus genes unique to a dataset.
(f) Ribosome-associated proteins removed manually from datasets.

Abbreviations: BATS, Bermuda Atlantic Time Series; BLAST, Basic Local Alignment Search Tool; GOS, Global Ocean Sampling; HOT, Hawaii Ocean Time Series; HSP, high-scoring segment pair; KEGG, Kyoto Encyclopedia of Genes and Genomes; NA, not applicable; NR, National Center for Biotechnology Information non-redundant protein database (NCBI-nr); OMZ, oxygen minimum zone.
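
The footnotes describe how each cell is derived: keep each read's top match above the bit score cutoff, group the matched reference genes by whether they occur in the DNA dataset, the RNA dataset, or both, and average the percentage identity within each group. The Python sketch below illustrates that bookkeeping under stated assumptions; it is not the authors' pipeline. The input layout (tuples of read ID, gene ID, percentage identity, bit score), the function names, and the choice to average per gene before averaging across genes are all hypothetical. Only the cutoff of 50 and the gene-set definitions come from the table notes.

```python
from collections import defaultdict

BIT_SCORE_CUTOFF = 50.0  # cutoff stated in the table footnote


def top_hits(rows, cutoff=BIT_SCORE_CUTOFF):
    """Keep each read's best-scoring hit at or above the cutoff.

    `rows` is an iterable of (read_id, gene_id, percent_identity, bit_score)
    tuples; this layout is an assumption made for illustration.
    Returns {read_id: (gene_id, percent_identity)}.
    """
    best = {}
    for read_id, gene_id, pident, bits in rows:
        if bits < cutoff:
            continue  # hit too weak to count
        if read_id not in best or bits > best[read_id][2]:
            best[read_id] = (gene_id, pident, bits)
    return {read: (gene, pident) for read, (gene, pident, _) in best.items()}


def mean_identity_per_gene(hits):
    """Average percentage identity over the reads assigned to each gene."""
    by_gene = defaultdict(list)
    for gene_id, pident in hits.values():
        by_gene[gene_id].append(pident)
    return {gene: sum(vals) / len(vals) for gene, vals in by_gene.items()}


def mean_over(per_gene, genes):
    """Mean of per-gene identities over a gene set; None if the set is empty."""
    vals = [per_gene[g] for g in genes]
    return sum(vals) / len(vals) if vals else None


def summarize(dna_rows, rna_rows):
    """Build the DNA and RNA rows of the table for one reference database."""
    dna = mean_identity_per_gene(top_hits(dna_rows))
    rna = mean_identity_per_gene(top_hits(rna_rows))
    shared = dna.keys() & rna.keys()  # genes in both datasets ('expressed')
    return {
        "DNA": {
            "DNA+RNA": mean_over(dna, shared),
            "DNA only": mean_over(dna, dna.keys() - shared),
            "All": mean_over(dna, dna.keys()),
        },
        "RNA": {
            "DNA+RNA": mean_over(rna, shared),
            "RNA only": mean_over(rna, rna.keys() - shared),
            "All": mean_over(rna, rna.keys()),
        },
    }
```

In this sketch the "NA" cells of the table correspond to combinations that are simply not computed (a DNA row has no "RNA only" gene set, and vice versa). Averaging per read rather than per gene would weight abundant genes more heavily; the footnote does not say which was used, so the per-gene choice here is a guess.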