The general mechanism of inference for each of the four methods used by the Proteome Navigator. (a) The gene neighbor (GN) method identifies protein pairs whose genes are encoded in close proximity across multiple genomes. In this example, genes A and B are gene neighbors, while A and C are not. (b) The Rosetta Stone (RS) method searches for gene fusion events. Here, A and B are expressed as separate proteins in one organism, whereas in a second organism a single sequence represents the fusion of the two. The fusion protein is termed the Rosetta Stone protein because it allows us to infer that A and B are functionally linked. (c) The construction of phylogenetic profiles (PP) begins with four sequenced genomes, from which the protein sequences have been predicted. Protein A from E. coli is compared against the proteins encoded by the other genomes, and homologs are identified. For each genome, a 1 is placed in the corresponding position of the phylogenetic profile if that genome contains a homolog of A, and a 0 otherwise. Genes with similar phylogenetic profiles are likely to participate in the same pathway. (d) The gene cluster (GC), or operon, method identifies closely spaced genes and assigns a probability P of observing a given gap distance (or smaller), as judged by the collective set of inter-gene distances.
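As an illustration of panels (c) and (d), the following is a minimal Python sketch, not the Proteome Navigator implementation. It assumes homolog detection has already been done and is available as a lookup table. The genome names other than E. coli, the `presence` table, and the gap values are hypothetical. Two further choices are assumptions: the Hamming distance stands in for a profile-similarity measure, which the caption does not specify, and P is read as the empirical fraction of observed inter-gene distances that are less than or equal to the given gap.

```python
from bisect import bisect_right

# Hypothetical genomes; only "E. coli" is named in the caption.
GENOMES = ["E. coli", "genome_2", "genome_3", "genome_4"]

# Hypothetical homolog table: protein -> genomes that contain a homolog.
presence = {
    "A": {"E. coli", "genome_2", "genome_4"},
    "B": {"E. coli", "genome_2", "genome_4"},
    "C": {"E. coli", "genome_3"},
}


def phylogenetic_profile(protein, genomes=GENOMES, table=presence):
    """Return one 1/0 flag per genome: 1 if a homolog is present, else 0."""
    return tuple(1 if g in table[protein] else 0 for g in genomes)


def profile_distance(p, q):
    """Hamming distance: number of genomes at which two profiles differ.

    Assumed similarity measure; 0 means the profiles are identical.
    """
    return sum(a != b for a, b in zip(p, q))


def gap_probability(gap, observed_gaps):
    """Empirical P(distance <= gap) over the observed inter-gene distances."""
    ordered = sorted(observed_gaps)
    return bisect_right(ordered, gap) / len(ordered)


profile_a = phylogenetic_profile("A")
profile_b = phylogenetic_profile("B")
profile_c = phylogenetic_profile("C")

# Hypothetical inter-gene distances (in base pairs) pooled across a genome.
gaps = [10, 50, 200, 1000]
p_close = gap_probability(50, gaps)
```

In this toy data, A and B have identical profiles and would be candidates for a functional link, while C differs from A in three of the four genomes. A small value of P for a pair of adjacent genes indicates an unusually short gap relative to the pooled distances, which is the signal the GC method uses.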