Protein set | 100% PID | 99% PID | 95% PID | 90% PID | 85% PID | 80% PID |
---|---|---|---|---|---|---|
nr | 9 | 93 | 402 | 960 | 2020 | 3653 |
env_nr | 1 | 1 | 11 | 33 | 210 | 661 |
M5nr | 1 | 2 | 96 | 532 | 1764 | 4073 |
UniProt/TrEMBL | 2 | 8 | 83 | 415 | 1169 | 2468 |
Hess | 839 | 1553 | 5404 | 10,947 | 17,597 | 24,151 |
RUG1 | 116 | 665 | 3058 | 7173 | 12,489 | 18,590 |
RUG2 | 338 | 2028 | 8243 | 17,342 | 26,825 | 35,159 |
Total unique | 1286 | 4172 | 15,012 | 28,561 | 40,458 | 49,323 |
- Counts of hits at minimum protein identity (PID) thresholds (100%, 99%, 95%, 90%, 85% and 80%) for all 68,850 carbohydrate-active proteins from the African MAGs, searched against the nr, env_nr, M5nr and UniProt/TrEMBL databases and against the predicted proteins of Hess et al. (Hess), the original RUG (RUG1) and RUG 2.0 (RUG2). "Total unique" is the number of proteins from the African MAGs that have a hit in at least one database at the given threshold.
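
The counting described in the caption can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function name `count_hits`, the input layout (best PID per query protein per database) and the toy values are all assumptions made for the example. It treats each threshold as a minimum, so a hit at 96% PID is counted at 95%, 90%, 85% and 80%, and it computes "Total unique" as the union of query proteins across databases rather than the sum of the per-database counts.

```python
# Minimum PID thresholds taken from the table header.
THRESHOLDS = [100, 99, 95, 90, 85, 80]


def count_hits(best_pid, thresholds=THRESHOLDS):
    """Count query proteins with a hit at or above each PID threshold.

    best_pid: {database: {protein_id: best PID of that protein in the database}}
    (hypothetical input layout, assumed for this sketch).
    Returns (per_db, total_unique), where per_db[db][t] is the count for one
    database and total_unique[t] counts proteins hit in at least one database.
    """
    per_db = {}
    unique = {t: set() for t in thresholds}
    for db, hits in best_pid.items():
        per_db[db] = {}
        for t in thresholds:
            # Proteins whose best hit in this database meets the minimum PID.
            passing = {p for p, pid in hits.items() if pid >= t}
            per_db[db][t] = len(passing)
            # Union across databases: each protein is counted once.
            unique[t] |= passing
    total_unique = {t: len(s) for t, s in unique.items()}
    return per_db, total_unique


# Toy data, invented for illustration only (not values from the study).
toy = {
    "nr": {"p1": 100.0, "p2": 91.2, "p3": 79.0},
    "RUG2": {"p1": 99.5, "p2": 96.0, "p4": 85.0},
}
per_db, total_unique = count_hits(toy)
```

Because a protein can have hits in several databases, the "Total unique" value at a threshold is at most the sum of the per-database counts in that column, which is also the pattern seen in the table.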