From: Fueling ab initio folding with marine metagenomics enables structure and function predictions of new protein families

Species distribution and structure-based function annotations for the 27 unknown Pfam families. a Taxonomic distribution of all 235 genera. Different colors represent different classifications, and the bars in the outer circle indicate the frequency of the corresponding genera in more than 300 families. b Frequency distribution of all 235 genera in the 27 Pfam families. The vertical axis represents the percentage of species with a specific frequency in the 27 Pfam families. c A phylogenetic tree of the 27 detected genera that occurred in more than 6 families. The circle size is proportional to the frequency of the species observed in these samples. d Function distribution for the 27 genera. The GO functions in Biological Process, Cellular Component, and Molecular Function were classified into 6, 7, and 5 sub-categories, respectively; a sub-category is marked in red if the function was detected in the corresponding genus. e Taxonomic distribution of the 27 genera. Of the 27 genera, 3 belong to Eukarya, 3 to viruses, and 21 to Bacteria. f Relative abundance distribution of the 21 bacterial genera in the Tara Oceans dataset. The relative abundance of the 27 genera is calculated and horizontally aligned to the corresponding genera.

