Table 1 Summary of data types

From: Consistent probabilistic outputs for protein function prediction

Data type             Description                                         BP     CC     MF
Phenotype
   MGI                Mammalian phenotype ontology terms (33)          1,994  2,157  1,898
   OMIM               Diseases (2,488) associated with human homologs    998  1,166    978
Phylogenetic profile
   Inparanoid         Orthologs across 21 species                      6,131  7,092  6,556
   Biomart            Orthologs across 18 species                      6,269  7,242  6,695
Protein domain
   Interpro           Functional sites and domains                     7,131  8,027  7,603
   PfamA              Protein domains                                  6,790  7,648  7,239
Protein-protein interaction
   PPI                Transferred via orthology from human (OPHID)     3,273  3,690  3,509
Gene expression data
   Su et al. [9]      Oligonucleotide arrays (55 tissues)              6,555  7,587  7,029
   Zhang et al. [7]   Affymetrix arrays (61 tissues)                   5,097  5,716  5,447
   SAGE               Tag counts from SAGE library (99% cutoff)        6,323  7,231  6,753
Total                                                                  7,968  9,005  8,427
The table lists the ten data types from [1], along with the number of proteins that are annotated with at least one term of each ontology and for which that data type is available. BP, biological process; CC, cellular component; MF, molecular function.