A proteome-wide protein interaction map for Campylobacter jejuni
© Parrish et al.; licensee BioMed Central Ltd. 2007
Received: 2 January 2007
Accepted: 5 July 2007
Published: 05 July 2007
Data from large-scale protein interaction screens for humans and model eukaryotes have been invaluable for developing systems-level models of biological processes. Despite this value, only a limited amount of interaction data is available for prokaryotes. Here we report the systematic identification of protein interactions for the bacterium Campylobacter jejuni, a food-borne pathogen and a major cause of gastroenteritis worldwide.
Using high-throughput yeast two-hybrid screens we detected and reproduced 11,687 interactions. The resulting interaction map includes 80% of the predicted C. jejuni NCTC11168 proteins and places a large number of poorly characterized proteins into networks that provide initial clues about their functions. We used the map to identify a number of conserved subnetworks by comparison to protein networks from Escherichia coli and Saccharomyces cerevisiae. We also demonstrate the value of the interactome data for mapping biological pathways by identifying the C. jejuni chemotaxis pathway. Finally, the interaction map also includes a large subnetwork of putative essential genes that may be used to identify potential new antimicrobial drug targets for C. jejuni and related organisms.
The C. jejuni protein interaction map is one of the most comprehensive yet determined for a free-living organism and nearly doubles the binary interactions available for the prokaryotic kingdom. This high level of coverage facilitates pathway mapping and function prediction for a large number of C. jejuni proteins as well as orthologous proteins from other organisms. The broad coverage also facilitates cross-species comparisons for the identification of evolutionarily conserved subnetworks of protein interactions.
A catalog of all the protein interactions that occur in an organism could provide a useful starting point for understanding the functions of proteins and entire biological systems. Several research groups have performed large-scale screens with the goal of identifying all of the protein interactions, or the interactome, for a given organism. One productive approach has been to co-affinity purify (co-AP) members of protein complexes using affinity-tagged bait proteins and then to identify the complex members using mass spectrometry (MS). This approach has been particularly useful for single-cell model organisms like Escherichia coli and Saccharomyces cerevisiae, in which large sets of affinity-tagged proteins can be expressed readily and co-AP/MS can be performed on large quantities of cells [1–6]. A complementary approach that detects binary protein interactions rather than protein complexes is the yeast two-hybrid system. In contrast to the co-AP/MS studies, large-scale yeast two-hybrid screens measure interactions in an artificial setting, the yeast nucleus, with the goal of mapping all of the possible specific binary interactions that may occur in vivo. Large-scale yeast two-hybrid screens have been used to probe the interactomes of a wide range of organisms from viruses to humans (see [8–10] for reviews). The yeast two-hybrid screens and the co-AP/MS studies provide at least a static picture of protein interactions that may occur under one or a defined set of in vivo conditions. The resulting interaction maps can provide a framework for understanding pathways and molecular machines, particularly when combined with other types of functional genomics data, including gene phenotypes and dynamic information such as gene expression, protein expression, and protein localization data.
Very few bacterial species have been analyzed at the proteome level for protein interactions. For example, large-scale systematic determination of binary protein interactions has been described for only one bacterium to date, Helicobacter pylori. That study resulted in interactions covering 46% of the H. pylori proteome (Additional data file 1). Meanwhile, E. coli is the only bacterium for which protein complex purifications have been applied at the proteome scale [1, 6]. Binary protein interactions predicted from these studies include 80% of the E. coli proteome. With the immense number and diversity of bacterial species that exist, a huge reservoir of prokaryotic protein interactions has yet to be sampled.
Campylobacter jejuni is a Gram-negative food-borne pathogen that is a major cause of gastroenteritis in humans. Infection with C. jejuni has also been associated with the autoimmune peripheral neuropathy known as Guillain-Barré syndrome and immunoproliferative small intestinal disease [13–15]. Despite the importance of C. jejuni as a pathogen, much remains to be learned about its biology and mechanisms for causing disease. The functions of over 50% of the 1,654 proteins predicted to be encoded by the C. jejuni NCTC11168 genome are either unknown or poorly characterized, as implied by their unnamed gene status. Clues about the functions of these proteins could come from protein interaction data. Most of the protein interaction data for C. jejuni come from small-scale experiments with individual proteins or from the somewhat less reliable method of predicting interactions based on measurements with orthologous proteins in other organisms. Despite the proven utility of protein interaction data, most of the C. jejuni proteins are not yet known or predicted to be involved in an interaction. Thus, interactome data could significantly aid C. jejuni research. Because co-AP/MS studies would be difficult for this organism, we set out to map interactions using the two-hybrid system.
Here we report the results of a proteome-scale systematic screen of C. jejuni protein interactions. Using a comprehensive yeast two-hybrid approach we tested over 89% of the predicted C. jejuni NCTC11168 proteins for interactions and identified thousands of novel protein interactions covering 80% of the proteome. For each interaction we generated a confidence score that reflects its probability of being biologically relevant, resulting in 2,884 interactions with high confidence scores. We demonstrate how these data can be used to map pathways, generate hypotheses about protein function and network evolution, and to identify potential new drug targets. We have assembled all of the interactions from this study into a single comprehensive C. jejuni protein interaction database that also contains computational predictions and interolog predictions based on E. coli and H. pylori protein interactions. The interaction data can be readily accessed and downloaded using the web-based application tool called IM Browser.
Systematic identification of protein interactions for C. jejuni NCTC11168
Table: Summary of array generation and interaction testing
Column label: Higher confidence interactions*†
The interaction map includes all of the major protein types and is not significantly enriched for any particular gene classification (Additional data file 2). As expected, however, integral membrane proteins are slightly depleted (Additional data file 3), which is likely due to failure to reach the nucleus or improper folding in the nuclear environment. The high coverage (80% of the predicted proteome) can be attributed in part to the number of proteins tested, to the systematic pooled matrix approach, and to the use of regulated promoters to detect interactions with toxic proteins or proteins that activated the reporters on their own. For example, proteins toxic or inhibitory to yeast were successfully assayed by expressing the fusion proteins with an inducible rather than constitutive promoter. Constitutive expression of inhibitory proteins can result in down-regulation of the fusion proteins and loss of the ability to detect interactions. In this study we found that 114 (7%) of the proteins in our array were either toxic or inhibitory to yeast (Additional data file 4). Nevertheless, we were able to detect over 700 interactions that involved these proteins, including the well-known GroES-GroEL interaction.
Data quality and confidence scores
Table: Comparison of C. jejuni, H. pylori, and E. coli protein interaction sets to an E. coli reference set containing 599 low-throughput literature-cited interactions*
Columns: Reference set interologs†; Overlap with reference set interologs; Overlap (%) with total interactions detected for each study; Fraction of proteins in each study with orthologs in the reference set
Rows: C. jejuni (HC); H. pylori‡; E. coli§; E. coli¶
The C. jejuni protein interaction network
Several studies have shown that highly interconnected regions of experimentally derived protein interaction maps correspond to biologically relevant protein modules, such as complexes or pathways. Proteins with related functions, for example, tend to be clustered into highly interconnected subnetworks [25, 30, 31]. Moreover, interactions within more highly interconnected regions of protein networks tend to be enriched for true positives [32, 33]. This suggests that clustering is a biological feature of a protein interaction map. The C. jejuni protein network has many groups of highly interconnected proteins, as indicated by its average clustering coefficient (0.10), which is high compared to other large-scale interaction maps (Additional data file 5). The C. jejuni higher confidence set, for example, is more highly clustered than the Drosophila interaction map (average clustering coefficient of 0.05 versus 0.02, respectively) even though the average number of interactions per protein in the two maps is similar. This could be explained by the fact that the C. jejuni map covers much more of the proteome than the Drosophila map. Indeed, among all the maps there is a general trend of increased clustering as the coverage increases (Additional data file 5).
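The average clustering coefficient referred to above can be illustrated with a minimal sketch. It assumes the common definition (for each protein, the fraction of pairs of its interaction partners that also interact with each other, averaged over all proteins, with proteins of degree below 2 counted as 0); the exact definition used for the values in Additional data file 5 is not stated here, and the function name and toy network are hypothetical.

```python
from itertools import combinations


def average_clustering(edges):
    """Average local clustering coefficient of an undirected network.

    Assumption: proteins with fewer than two partners contribute 0.
    """
    neighbors = {}
    for a, b in edges:
        if a == b:
            continue  # ignore self-interactions
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    total = 0.0
    for node, nbrs in neighbors.items():
        k = len(nbrs)
        if k < 2:
            continue
        # count interactions among the partners of this protein
        links = sum(1 for u, v in combinations(nbrs, 2) if v in neighbors[u])
        total += 2.0 * links / (k * (k - 1))
    return total / len(neighbors) if neighbors else 0.0


# Hypothetical toy network: a triangle A-B-C with one extra partner D on C.
toy_edges = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")]
print(average_clustering(toy_edges))
```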
Cross-species protein interaction network conservation
To explore the potential relationships among conserved subnetworks, we used hierarchical clustering to group proteins by their subnetwork memberships (Figure 5b). These clusters support the idea, previously argued by Gavin et al., that the network is composed of a set of functional 'cores' that interact with interchangeable 'modules' to constitute distinct cellular functions. Both cores and modules appear as groups of proteins with similar profiles of subnetwork membership; however, while core proteins appear in many subnetworks, modules appear in relatively few. Moreover, cores may appear in the presence or absence of multiple modules, whereas modules are generally found only in the presence of a particular core. These data suggest a higher level of organization amongst protein interactions within organism-wide interaction networks. Hierarchical clustering also reveals that the conserved portion of the C. jejuni protein-protein interaction network generated from the comparison of C. jejuni and E. coli is distinct from that generated by the comparison of C. jejuni and S. cerevisiae. This may reflect key differences in divergence between the prokaryotes C. jejuni and E. coli versus the eukaryote S. cerevisiae.
A framework for protein function predictions and pathway mapping
Examination of proteins in the C. jejuni map that have been assigned a function (for example, based on sequence similarity to characterized proteins) reveals that proteins involved in the same process tend to interact with each other more frequently than expected by chance (Additional data file 10). This is consistent with the idea that interacting proteins in the map often function in the same pathway or protein complex. The C. jejuni interaction map, therefore, can be used to predict the biological role of uncharacterized proteins based on the functions of interacting proteins, as demonstrated for eukaryotic protein networks. An analysis of proteins involved in flagellum biosynthesis provides a useful example. The C. jejuni interaction map includes an interaction between FliS, a putative flagellum assembly export chaperone, and FlaA and FlaB, the flagellin subunits comprising the flagellum. This is consistent with orthologous protein interactions detected in Salmonella typhimurium, and in the solved Aquifex aeolicus co-crystal structure of FliS in complex with a FliC (flagellin) fragment. Unique to our C. jejuni dataset, however, is the additional interaction detected between FliS and the secreted protein FlaC. Despite homology to FlaA and FlaB at the amino and carboxyl termini, FlaC is not a component of the flagellum, but rather may have a role in cell invasion. Experimental data indicate that the flagellar apparatus is required for secretion of FlaC. Our interaction data suggest that FliS may help mediate FlaC export. The map likewise connects 663 other poorly characterized proteins into networks that provide initial clues about their functions (Figure 1).
A network of putative essential genes
The many uncharacterized proteins in the essential protein network are potentially biologically important and may include novel drug targets. For example, Box B in Figure 8 highlights a protein of unknown function, Cj0189c, which interacts with five ribosomal proteins. Based on this and the fact that proteins with related functions tend to interact, it is reasonable to hypothesize that Cj0189c may also be involved in ribosome assembly or function. This is potentially significant given that the ribosome and protein synthesis are frequent targets of antibiotics. Box C in Figure 8 highlights the uncharacterized protein Cj0980, which is homologous to the dipeptidase, peptidase D. In E. coli, peptidase D is one of the enzymes that generates cysteine by cleaving cysteinylglycine. In our map, Cj0980 interacts with nine proteins predicted to be essential. One of these proteins, Cj0240c, is a homolog of IscS, a cysteine desulfurase required for the synthesis of all tRNA thiolated nucleosides in E. coli. Interestingly, four additional interactors of Cj0980 are tRNA synthetases. Whether or not their product tRNAs are modified in C. jejuni has not been determined, but this series of interactions suggests a possible pathway or protein complex that mediates the transfer of a thiol group originating from cysteinylglycine to specific tRNAs.
The large-scale interaction studies performed to date have fallen short of complete interactome coverage. The most complete large-scale yeast two-hybrid screens have covered only around 54% of the proteome in Drosophila [22, 25, 56], 46% in H. pylori and 55% in yeast [57–59], while co-AP/MS studies have reached 80% and 67% of the E. coli and yeast proteomes, respectively [1–6] (Additional data file 1). Complete interactome coverage should include most of the proteome, since most proteins are believed to function at least in part through interactions with other proteins. A major factor contributing to incomplete coverage is the incomplete nature of the high-throughput screens, as indicated by the minimal rate of overlap observed between independent large-scale screens (Additional data file 1) [22, 59]. Thus, despite the usefulness of the data from various interaction mapping efforts, the low interactome coverage is likely to limit efforts to predict protein functions, map pathways, and characterize protein networks. Low coverage also limits the opportunity for cross-validation, which is particularly important for high-throughput datasets because they tend to have high rates of false positives [24, 60].
We have made substantial progress towards defining the C. jejuni interactome. Based on the number of ORFs included in the interaction dataset, we have covered 80% of the proteome, and our higher confidence dataset covers 67%. An expected consequence of performing high-throughput screens, which tend to be subsaturating, is that some interactions that are detectable by two-hybrid assays are missed . We set out to minimize these false negatives by using a highly sensitive two-hybrid system, inducible promoters to detect interactions with toxic proteins and transcriptional activators, and a pooled-matrix mating scheme to maximize the number of interactions sampled. Despite these efforts, some interactions will be missed, especially those that are refractory to standard two-hybrid assays. Detection of these will require other technologies, such as isolation and identification of protein complexes, and assays that target specific classes of proteins, such as membrane proteins [61, 62]. Interaction networks may also be made more complete by using computational approaches to predict missed interactions [34, 35]. In this study we applied a comparative algorithm to align protein networks from C. jejuni to the interactomes of other species to generate further predictions of protein interactions. Like the high throughput experimental data, these predictions provide a guide for directed validation studies.
An unfortunate side effect of large-scale protein interaction datasets is the presence of significant numbers of false positive interactions. We addressed this problem in two ways. First, we retested every interaction in a second independent two-hybrid assay. Second, we calculated probability scores that correlate with the likelihood that an interaction is biologically relevant. One advantage to this confidence scoring system is that it scores interactions rather than proteins and, therefore, does not specifically delete any proteins. Several studies, including ours, have found an inverse correlation between the biological significance of an interaction and the total number of interactions for the two proteins involved; the more interactions that a protein has, the less likely they are to be biological true positives. One approach to increasing the overall confidence of a dataset, therefore, is to delete these 'sticky' proteins. In contrast, it is possible to identify biologically relevant interactions involving these proteins by using a statistical scoring system that weighs multiple attributes according to their correlation with biological significance. With such a scoring system an interaction may be penalized because it involves a sticky protein, but redeemed due to some other attribute. This is the case, for example, in our data with the interactions FliS-FlaC, GroEL-GroES, IlvI-IlvH, PyrB-PyrC2, and TrxA-TrxB, all of which involve proteins with more than 60 interactions, yet have confidence scores above 0.8, and are likely to be biologically significant.
Another advantage to this scoring system is that it allows user-defined confidence intervals to be chosen based on particular analysis needs. Global analyses, for example, may benefit from using the highest confidence dataset. More focused analyses involving one or few proteins, on the other hand, may tolerate lower confidence interactions because validation experiments can be performed. This reduces the chances of missed interactions. Importantly, some low confidence interactions may be found to be biologically significant by experimental validation or by considering additional information not used in the scoring system. For example, by considering pairs of proteins with known functions, one can find a number of likely true positives with confidence scores below 0.2, including DnaX-DnaN, ExbD1-ExbD3, and FabF-FabG.
Finally, the confidence that we have in any particular interaction can change as new data become available about the two proteins or about the interaction itself. We have shown that the scores we assigned to the C. jejuni two-hybrid data correlate with biological significance such that more of the interactions with higher scores will be biologically significant than those with lower scores, and vice versa. Nevertheless, a fraction of the low confidence interactions are true positives and some of the high confidence interactions are false positives. It is expected that these will be sorted out using new, increasingly accurate confidence scoring systems that are based, for example, on new information as it becomes available. Thus, we have defined the scoring of the C. jejuni two-hybrid data presented here as version 1.0.
Interactome maps such as the one generated in our study begin to provide a tally of the binary protein interactions that can occur within an organism. Although incomplete, the data can provide a framework for understanding dynamic biological processes, such as the C. jejuni chemotaxis response. The map also can be mined for subnetworks of biological interest, such as essential gene networks that suggest candidate drug targets. Comparative analyses of protein interaction maps generated for humans and model eukaryotes have provided insights into the function and evolution of proteins and their regulatory networks. The protein interactions detected for each species also have enabled the prediction of interactions in other species, which is particularly important given the difficulty of obtaining complete coverage in high throughput screens, and the lack of suitable screening systems for many species. The C. jejuni interaction map generated here substantially increases the protein interactions detected thus far for the prokaryotic domain of life. The map should provide a useful starting point for predicting the functions of uncharacterized proteins and for mapping functional pathways in C. jejuni and other prokaryotes.
Materials and methods
Strains and plasmids
The two-hybrid system used here is based on the version originally described by Brent and colleagues. C. jejuni ORFs were cloned into the yeast two-hybrid vector pJZ4-NRT for expression of AD fusions driven by the yeast GAL1 promoter, and pHZ5-NRT for expression of LexA DNA BD fusions driven by the yeast MAL62 promoter. Both vectors contain recombination tags for direct cloning of tagged inserts (see below). Yeast strain RFY231 (MATα trp1Δ::hisG his3 ura3-1 leu2::3LexAop-LEU2) contained the AD plasmids, while Y309 (MATa trp1Δ::hisG his3Δ200 leu2-3 lys2Δ201 ura3-52 mal- pSH18-34(URA3, lacZ)) contained the BD plasmids. The reporter genes include LEU2, facilitating growth on medium lacking leucine, and lacZ, expression of which turns yeast colonies blue when the substrate X-Gal is present.
Generation of yeast two-hybrid arrays for C. jejuni
PCR amplification of over 87% of the predicted ORFs from C. jejuni NCTC11168 genomic DNA was previously described. The amplification products included the 21 bp recombination tags 5RT1 and 3RT1 at their 5' and 3' ends, respectively, which match identical sites flanking the insertion site in the yeast two-hybrid vectors. PCR products were cloned into the vectors via homologous recombination in yeast as described previously. To validate the identity of the insert in each vector, the 5' ends of the inserted PCR products were sequenced. We generated 1,398 BD strains and 1,442 AD strains containing the two-hybrid vectors with inserts, of which 90% have been sequence verified. Most of the ORFs missing from the arrays failed PCR amplification prior to cloning.
High-throughput yeast two-hybrid analysis
We mated BD and AD strains using a two-phase pooling (pooled matrix) strategy as described previously [21, 22]. Briefly, 15 pools of approximately 96 AD strains each were generated, along with one additional pool of 32 strains. Each pool was mated with individual BD strains arrayed on 96-well plates, and the resulting diploids were assayed for reporter activities. Positive BD strains were then mated with each member of the positive AD pool arrayed on 96-well plates to identify the interacting pairs. Reporter activities were scored using a custom program for image analysis and at least one round of manual scoring. LacZ scores ranged from 0 (white) to 5 (dark blue) and Leu scores ranged from 0 (no growth) to 3 (heavy growth); combined scores ranged from 0 to 8. Many BDs have some level of background activity due to activation independent of the AD fusion or non-specific interactions. To correct for these we calculated the average interaction score for each BD based on at least 96 interaction assays and subtracted this background from the reporter scores for each of its interactions. Of these corrected scores, only those ≥ 1 were considered initial positives and were retested (see below). A small subset of BD strains (94 total) was also assayed using a library approach as described [21, 22]. Briefly, BD strains were individually mated with a single pool containing almost all of the AD strains (except Cj1718c (leuB) and Cj1546, which activate reporters without a BD). Up to 30 diploids with reporter activity were picked for each BD. Their AD inserts were PCR amplified and restriction digested to identify strains carrying the same clones. Single representatives from each restriction fragment class (RFC) were then sequenced to identify the inserts. Of the 134 interactions detected, 52 (39%) were also identified in the two-phase matrix screen.
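The background correction described above can be sketched as follows. This is a minimal illustration, not the scoring program used in the study: the function name, the input layout, and the toy scores are hypothetical, and the toy example uses four assays per BD rather than the at least 96 assays used for the real averages.

```python
from collections import defaultdict


def background_corrected(assays, threshold=1.0):
    """Return initial positives after per-BD background subtraction.

    assays: iterable of (bd, ad, lacz_score, leu_score) with lacZ in 0..5
    and Leu in 0..3, so the combined score lies in 0..8.
    """
    combined = defaultdict(dict)
    for bd, ad, lacz, leu in assays:
        combined[bd][ad] = lacz + leu

    positives = {}
    for bd, by_ad in combined.items():
        # background = mean combined score over all assays for this BD
        background = sum(by_ad.values()) / len(by_ad)
        for ad, score in by_ad.items():
            corrected = score - background
            if corrected >= threshold:
                positives[(bd, ad)] = corrected
    return positives


# Hypothetical toy data: BD1 has one strong interaction; BD2 is a uniformly
# "noisy" bait whose scores do not rise above its own background.
toy_assays = [
    ("BD1", "a", 5, 3), ("BD1", "b", 0, 0), ("BD1", "c", 1, 0), ("BD1", "d", 0, 1),
    ("BD2", "a", 1, 1), ("BD2", "b", 1, 1), ("BD2", "c", 1, 1), ("BD2", "d", 2, 1),
]
print(background_corrected(toy_assays))
```

Subtracting a per-BD mean penalizes baits that activate the reporters broadly, so a moderate raw score from a noisy bait does not count as an initial positive.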
Combined, 16,104 unique interactions were retested in one-on-one binary mating assays between individual AD and BD strains on 96-well plates. A total of 11,687 interactions proved repeatable (background-corrected combined activity score ≥ 1), including 73% of those from the two-phase matrix screen, 75% of those from the library screen, and 100% of those detected in both screens. The majority of interactions that failed to repeat had been low-scoring (less than 2) in the initial screen. The 11,687 interactions that repeated were combined with 325 non-repeated interactions that had high confidence scores (see below) to create a dataset containing 12,012 interactions, which we named CampyYTH v3.1. This version of the dataset was subsequently used for bioinformatics analysis as indicated. The interaction data can be visualized and downloaded at . The CampyYTH v3.1 data are also listed in Additional data file 13.
Assignment of confidence scores
Confidence scores were determined for each interaction based on methods described by Bader and colleagues [24, 25]. We fit a generalized linear model using experimental and topological attributes of yeast two-hybrid interactions, including the number of interactions for each protein in a pair and the Leu and lacZ reporter activities. Fitting the model required both positive and negative training sets. Because a reference set of known interactions is not available for C. jejuni, we derived a set of positive training data (85 interactions total) by assuming that the conserved interactions (reciprocal best match interologs) in common with either the E. coli low-throughput interaction set, the H. pylori yeast two-hybrid set, or the E. coli protein complex set are likely to be true positives. We derived a set of likely true negatives (111 total) for the negative training data by considering interactions between proteins whose orthologs in E. coli or H. pylori were separated in the respective interaction maps by greater than the average distance of all pairs (≥ 4). Positive and negative training cases were weighted inversely to the number of interactions in each set. When training sets are weighted this way, a confidence score greater than 0.5 means that available data and features support that a specific interaction has a better than random chance to be a true interaction; this allows 0.5 to be used as the threshold between high and low confidence interactions. Validation using protein features not used in the scoring system supports the choice of 0.5 as a threshold for higher confidence interactions (discussed further in Additional data file 14; see also Figure 2c). Of the attributes tested, the numbers of interactions per protein were found to be negative predictors of biologically relevant interactions, while reporter activities were positive predictors. To evaluate the scoring model, we performed a stratified five-fold cross-validation.
Cross-validation reported a precision of 91.4% and a recall of 78.9%, which gave us confidence that it is a reasonably well-fitted model. We then used the full sets of positives and negatives in training and obtained our final logistic model. The final model was used to compute confidence scores for 16,104 initial positive interactions prior to retesting. Of these, 3,209 scored higher than 0.5, which we define as the high confidence set. Of the interactions with high confidence scores (> 0.5), 90% corresponded to interactions that repeated when retested, while only 68% of the low confidence interactions repeated. Further discussion and details of the confidence scoring system are available in Additional data file 14.
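The class-weighted logistic model can be illustrated with a minimal sketch. It assumes plain gradient descent on a weighted log-loss and two toy features (a degree-based attribute and a combined reporter score); the actual attributes, fitting procedure, and software are described in Additional data file 14 and are not reproduced here. All names and data below are hypothetical, and cross-validation is omitted.

```python
import math


def predict(beta, x):
    """Logistic confidence score for a feature vector x."""
    z = beta[0] + sum(b * v for b, v in zip(beta[1:], x))
    return 1.0 / (1.0 + math.exp(-z))


def fit_weighted_logistic(X, y, epochs=5000, lr=0.05):
    """Fit a logistic model with cases weighted inversely to class size.

    Each class receives a total weight of 0.5, so a score of 0.5 acts as
    the neutral point between the positive and negative training sets.
    """
    n_pos = sum(y)
    n_neg = len(y) - n_pos
    weights = [0.5 / n_pos if label else 0.5 / n_neg for label in y]
    beta = [0.0] * (len(X[0]) + 1)  # intercept followed by coefficients
    for _ in range(epochs):
        grad = [0.0] * len(beta)
        for xi, yi, wi in zip(X, y, weights):
            err = wi * (predict(beta, xi) - yi)
            grad[0] += err
            for j, v in enumerate(xi):
                grad[j + 1] += err * v
        beta = [b - lr * g for b, g in zip(beta, grad)]
    return beta


# Hypothetical training cases: (degree-based attribute, combined reporter score).
X = [(2.0, 7), (3.0, 6), (2.5, 8), (4.0, 7),            # assumed true positives
     (6.0, 2), (7.0, 3), (6.5, 1), (5.5, 2), (7.5, 2)]  # assumed true negatives
y = [1, 1, 1, 1, 0, 0, 0, 0, 0]

beta = fit_weighted_logistic(X, y)
print([round(predict(beta, x), 3) for x in X])
```

In this toy fit the degree-based coefficient comes out negative and the reporter coefficient positive, which mirrors the direction of the predictors reported in the text but says nothing about their actual magnitudes.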
Evaluating the confidence score model
Main role annotations 'mainrole' were downloaded from . Excluding self-interactions, out of the 3,209 high confidence interactions, 2,599 have 'mainrole' annotations, and 454 share at least one 'mainrole' annotation. We generated 5,000 groups of 2,599 randomly selected interactions that have 'mainrole' annotations and have a confidence score lower than 0.5. The number of pairs in each set that share 'mainrole' annotations was counted. The distribution was plotted in a histogram and compared with the high confidence set (Figure 2b). To examine whether high confidence interactions tend to share more detailed GO annotations, we grouped interactions into confidence bins so that each bin contains only interactions with scores falling into a specific range. For each interaction, we determined the deepest level of GO biological process annotations shared by the pair of genes, and calculated the average depth of shared biological process for each group. Since GO for C. jejuni NCTC11168 was not available, we used annotations for best match orthologs of C. jejuni RM1221 genes. Figure 2c shows that there is a general pattern of increased depth of shared GO terms for interactions with confidence score higher than 0.5. This pattern also suggests that our choice of 0.5 as a high confidence threshold is meaningful.
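The random-group comparison for shared 'mainrole' annotations can be sketched as below. The sketch draws groups of low confidence interactions and counts pairs that share an annotation; the empirical p value is an added convenience (the study compared the observed count with a histogram), and all function names and toy data are hypothetical.

```python
import random


def shares_annotation(pair, roles):
    """True if the two proteins of an interaction share at least one role."""
    a, b = pair
    return bool(roles.get(a, set()) & roles.get(b, set()))


def sampling_distribution(low_conf_pairs, roles, sample_size, n_samples=5000, seed=0):
    """Counts of role-sharing pairs in random groups of low confidence pairs."""
    rng = random.Random(seed)
    counts = []
    for _ in range(n_samples):
        sample = rng.sample(low_conf_pairs, sample_size)  # without replacement
        counts.append(sum(shares_annotation(p, roles) for p in sample))
    return counts


def empirical_p(observed, counts):
    """Fraction of random groups with at least as many sharing pairs."""
    return sum(c >= observed for c in counts) / len(counts)


# Hypothetical toy annotations and low confidence interactions.
toy_roles = {
    "p1": {"Transport"}, "p2": {"Transport"},
    "p3": {"Energy metabolism"}, "p4": {"Energy metabolism", "Transport"},
    "p5": {"Cell envelope"}, "p6": {"Protein synthesis"},
}
toy_pairs = [("p1", "p2"), ("p3", "p4"), ("p1", "p3"),
             ("p2", "p5"), ("p5", "p6"), ("p3", "p6")]

counts = sampling_distribution(toy_pairs, toy_roles, sample_size=3, n_samples=1000)
print(empirical_p(2, counts))
```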
Assessment of functional enrichments
The frequency of each GO description from the iProClass database among all of the proteins in the proteome was determined and compared to its frequency within the CampyYTH v3.1 dataset or the high confidence subset (Additional data file 3). A similar analysis was performed using the functional classifications assigned by the Sanger Institute (Additional data file 2). We also looked for pairs of GO annotations that were enriched in the interaction data (Additional data file 10). To do this we counted the number of interactions having a specific pair of GO terms. We mapped the annotations to level 5; that is, for a protein with GO annotation A that is at a deeper level than 5, we mapped A to level 5 using 'parent' and 'part of' relationships in the ontologies, and we discarded A if it was above level 5. We did the same for all GO terms annotated to a protein. Self-interactions were excluded from the analysis. To compute the significance of finding specific GO pairs, we generated 2,000 random networks by randomly switching pairs of links while maintaining the degree distribution of the original map, and counted the number of times we found each GO pair in each randomized network. For each GO pair, a p value was computed based on the distribution of the 2,000 counts (assuming normal distribution) and the count in the original yeast two-hybrid map. The p value represents the probability of seeing such a pair in a random network. We listed only pairs with a p value less than 0.05.
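The link-switching randomization and the normal-approximation p value can be sketched as follows. This is an illustrative sketch under stated assumptions: the number of switches per random network, the rejection of swaps that would create self-interactions or duplicate links, and the one-sided upper-tail p value are choices made here, not details given in the text. Function names and the toy network are hypothetical.

```python
import math
import random


def degrees(edges):
    """Degree of every node in an undirected edge list."""
    deg = {}
    for a, b in edges:
        deg[a] = deg.get(a, 0) + 1
        deg[b] = deg.get(b, 0) + 1
    return deg


def switch_randomize(edges, n_switches, seed=0):
    """Randomize a network by swapping link endpoints, preserving all degrees."""
    rng = random.Random(seed)
    edge_list = [tuple(e) for e in edges]
    present = {frozenset(e) for e in edge_list}
    done = attempts = 0
    while done < n_switches and attempts < 100 * n_switches:
        attempts += 1
        i, j = rng.sample(range(len(edge_list)), 2)
        (a, b), (c, d) = edge_list[i], edge_list[j]
        new1, new2 = frozenset((a, d)), frozenset((c, b))
        # reject swaps that create self-interactions or duplicate links
        if len(new1) < 2 or len(new2) < 2 or new1 == new2:
            continue
        if new1 in present or new2 in present:
            continue
        present -= {frozenset((a, b)), frozenset((c, d))}
        present |= {new1, new2}
        edge_list[i], edge_list[j] = (a, d), (c, b)
        done += 1
    return edge_list


def count_term_pair(edges, terms, pair):
    """Number of interactions whose two proteins carry the given term pair."""
    t1, t2 = pair
    n = 0
    for a, b in edges:
        ta, tb = terms.get(a, set()), terms.get(b, set())
        if (t1 in ta and t2 in tb) or (t2 in ta and t1 in tb):
            n += 1
    return n


def normal_upper_p(observed, counts):
    """Upper-tail p value of observed under a normal fit to the random counts."""
    mean = sum(counts) / len(counts)
    var = sum((c - mean) ** 2 for c in counts) / (len(counts) - 1)
    if var == 0:
        return 0.0 if observed > mean else 1.0
    z = (observed - mean) / math.sqrt(var)
    return 0.5 * math.erfc(z / math.sqrt(2))


# Hypothetical toy network and level-5 term assignments.
toy_edges = [("A", "B"), ("C", "D"), ("E", "F"), ("A", "C"), ("B", "E"), ("D", "F")]
toy_terms = {"A": {"x"}, "B": {"y"}, "C": {"y"}, "D": {"x"}, "E": {"x"}, "F": {"y"}}

observed = count_term_pair(toy_edges, toy_terms, ("x", "y"))
random_counts = [
    count_term_pair(switch_randomize(toy_edges, 20, seed=s), toy_terms, ("x", "y"))
    for s in range(200)
]
print(observed, normal_upper_p(observed, random_counts))
```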
Comparative network analysis
Additional details are in Additional data file 14. Protein-protein interactions from C. jejuni were compared with those from E. coli, H. pylori and S. cerevisiae from DIP. Corresponding protein sequences were obtained from the following sources: C. jejuni NCTC11168; E. coli; H. pylori; and S. cerevisiae. We used NetworkBlast to identify significant conserved protein-protein interaction subnetworks. A stand-alone Java version of the program is available at . Briefly, the algorithm takes as input a pair of protein-protein interaction networks, one for each of two species, along with a set of homology relationships between the proteins of the two networks. We constructed the homology relationships from an all-versus-all BLAST of the complete set of protein sequences for each of the two species, taking the top 10 hits with E-value ≤ 10^-10. Next, a network alignment graph was created where each node represents a homologous pair of proteins from species 1 and 2 (for example, a1 and a2) and each edge represents a conserved interaction (a1/a2 connects to b1/b2 if the a-b interaction is found in both species; interactions may be either direct (distance 1) or indirect (distance 2), in which a-b is connected through a common neighbor, that is, a-c-b). A greedy search is initiated from each node to identify conserved protein subnetworks, defined as dense subgraphs within the network alignment graph (of maximum size 15 proteins per species). When multiple subnetworks contain protein homologs that overlap by ≥ 50%, only the complex with the highest density was included in the final result. GO annotations of proteins in each conserved complex were analyzed to identify significant functional enrichments (Additional data file 6). We calculated a hypergeometric p value of enrichment for each GO annotation in the three divisions of the GO hierarchy and constrained the annotations by requiring that at least half of the proteins in a complex carry the enriched annotation.
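The construction of the network alignment graph can be illustrated with a small sketch. It is not the NetworkBlast implementation: it only builds nodes from homologous pairs and connects two nodes when the corresponding proteins are within distance 2 in both species, and it omits edge scoring and the greedy search for dense subgraphs. The requirement that connected nodes involve distinct proteins in both species is a simplifying assumption, and all names and toy data are hypothetical.

```python
from itertools import combinations


def adjacency(edges):
    """Neighbor sets for an undirected interaction list."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def within_two(adj, u, v):
    """True if u and v interact directly or through one common neighbor."""
    nu, nv = adj.get(u, set()), adj.get(v, set())
    return v in nu or bool(nu & nv)


def alignment_graph(edges1, edges2, homologs):
    """Build alignment nodes (homolog pairs) and conserved-interaction edges."""
    adj1, adj2 = adjacency(edges1), adjacency(edges2)
    nodes = sorted(set(homologs))
    aligned_edges = []
    for (a1, a2), (b1, b2) in combinations(nodes, 2):
        if a1 == b1 or a2 == b2:
            continue  # assumption: require distinct proteins in both species
        if within_two(adj1, a1, b1) and within_two(adj2, a2, b2):
            aligned_edges.append(((a1, a2), (b1, b2)))
    return nodes, aligned_edges


# Hypothetical toy networks for species 1 and 2 with one-to-one homologs.
edges_sp1 = [("a1", "b1"), ("b1", "c1"), ("a1", "d1")]
edges_sp2 = [("a2", "b2"), ("b2", "c2")]
toy_homologs = [("a1", "a2"), ("b1", "b2"), ("c1", "c2"), ("d1", "d2")]

nodes, aligned = alignment_graph(edges_sp1, edges_sp2, toy_homologs)
for edge in aligned:
    print(edge)
```

In the toy example the a-c pair is linked in the alignment graph even though it is only indirectly connected (through b) in both species, while d is left unconnected because its homolog has no interactions in species 2.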
The most specific annotations with hypergeometric p value < 0.05 in each of the three divisions were then assigned to each complex. A complete list of conserved complexes between C. jejuni and E. coli or S. cerevisiae is available for download at . The significant conserved subnetworks provided predictions of 379 new C. jejuni protein-protein interactions not found in the two-hybrid screens (Additional data file 7). A protein pair (a, b) was predicted to interact directly if: first, both a and b were present in the same significant conserved complex; second, this pair was observed to interact indirectly in C. jejuni; and third, this pair corresponded to a direct interaction in the comparison species' network.
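The three-part prediction rule can be sketched as follows. This is an illustrative reading of the rule, not the NetworkBlast implementation: the function names are hypothetical, each C. jejuni protein is assumed to map to a single homolog in the comparison species, and "indirect" is taken to mean distance exactly 2 (no direct edge, but a shared neighbor).

```python
from itertools import combinations


def build_adjacency(edges):
    """Undirected adjacency sets from a list of (protein, protein) pairs."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def is_indirect(adj, a, b):
    """True if a and b have no direct edge but share at least one neighbor."""
    na, nb = adj.get(a, set()), adj.get(b, set())
    return b not in na and bool(na & nb)


def predict_interactions(cj_edges, other_edges, homolog, complex_members):
    """Apply the three conditions to every pair within one conserved complex.

    cj_edges:        C. jejuni interactions
    other_edges:     interactions in the comparison species
    homolog:         dict, C. jejuni protein -> homolog (one-to-one is an assumption)
    complex_members: C. jejuni proteins in one significant conserved complex
    """
    cj_adj = build_adjacency(cj_edges)
    other_adj = build_adjacency(other_edges)
    predicted = []
    # Condition 1: both proteins come from the same conserved complex.
    for a, b in combinations(sorted(complex_members), 2):
        ha, hb = homolog.get(a), homolog.get(b)
        if ha is None or hb is None:
            continue
        # Condition 2: indirect in C. jejuni; condition 3: direct in the other species.
        if is_indirect(cj_adj, a, b) and hb in other_adj.get(ha, set()):
            predicted.append((a, b))
    return predicted


# Toy example with made-up protein names.
cj = [("a", "c"), ("c", "b"), ("a", "d")]
other = [("A", "B")]
mapping = {"a": "A", "b": "B", "c": "C", "d": "D"}
result = predict_interactions(cj, other, mapping, {"a", "b", "c"})
```

In the toy example only the pair (a, b) is predicted: the two proteins are linked through c in C. jejuni and their homologs interact directly in the comparison network.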
Clustering of conserved subnetworks
Since proteins can belong to more than one complex, we clustered the significant conserved subnetworks by protein membership, in effect 'superclustering' the interactions (Figure 5b). An n × m matrix was constructed, where n is the number of significant subnetworks and m is the number of unique proteins involved in any of the significant subnetworks. Using the open source tool ClustArray, we clustered the proteins hierarchically using the unweighted pair group method with arithmetic mean (UPGMA) and clustered the subnetworks with a k-means algorithm followed by UPGMA hierarchical clustering. The number of clusters, k = 3, was chosen as the value that approximately minimized within-cluster variability and maximized between-cluster variability (data not shown). The identities of complexes and proteins are shown in the high-resolution image of the hierarchical clustering in Additional data file 8. Lists of the proteins comprising complexes are available for download at .
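To make the matrix construction and the UPGMA step concrete, here is a small sketch. It does not reproduce ClustArray; the binary (0/1) encoding of membership, the Hamming distance, and all function names are assumptions chosen for the example, and the k-means step is omitted.

```python
def membership_matrix(subnetworks):
    """Return (subnetwork names, protein names, n x m matrix of 0/1 membership)."""
    names = sorted(subnetworks)
    proteins = sorted({p for members in subnetworks.values() for p in members})
    matrix = [[1 if p in subnetworks[s] else 0 for p in proteins] for s in names]
    return names, proteins, matrix


def hamming(u, v):
    """Number of positions at which two equal-length rows differ."""
    return sum(a != b for a, b in zip(u, v))


def upgma(rows, dist=hamming):
    """Naive UPGMA: repeatedly merge the two clusters with the smallest
    unweighted mean pairwise distance. Returns the list of merges as
    (cluster_a, cluster_b, distance), with clusters given as row indices."""
    clusters = [[i] for i in range(len(rows))]
    merges = []
    while len(clusters) > 1:
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                total = sum(
                    dist(rows[i], rows[j])
                    for i in clusters[x]
                    for j in clusters[y]
                )
                d = total / (len(clusters[x]) * len(clusters[y]))
                if best is None or d < best[0]:
                    best = (d, x, y)
        d, x, y = best
        merges.append((clusters[x], clusters[y], d))
        merged = clusters[x] + clusters[y]
        clusters = [c for k, c in enumerate(clusters) if k not in (x, y)]
        clusters.append(merged)
    return merges


# Toy input: three made-up subnetworks over three made-up proteins.
toy = {"s1": {"p1", "p2"}, "s2": {"p1", "p2"}, "s3": {"p3"}}
names, proteins, matrix = membership_matrix(toy)
merges = upgma(matrix)
```

Clustering the proteins rather than the subnetworks would use the same routine on the transposed matrix, for example `[list(col) for col in zip(*matrix)]`.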
Essential gene analysis and network assembly
We generated lists of putative C. jejuni NCTC11168 essential proteins by identifying reciprocal best match orthologs of likely essential proteins from B. subtilis and E. coli. We removed genes from our putative essential list if viable null mutants had been reported (Dr. B. Wren, personal communication). To examine the relationship between essentiality and centrality in the interaction map, we grouped proteins by their number of interactions (degree) in the higher confidence dataset (interactions with confidence scores > 0.5) and counted the essential and non-essential proteins in each group. The result is shown in Figure 7, where r values in the graphs represent Pearson correlation coefficients between the fractions and the degrees. Figure 7 shows that the degree of a protein correlates with its likelihood of being essential. A similar result was obtained with the entire dataset CampyYTH v3.1 (not shown). Lastly, we computed the fraction of essential and non-essential neighbors of each essential protein and compared this to the fraction for random groups of proteins (of the same size as the set of essential proteins). The results shown in Additional data file 11 indicate that essential genes tend to have more neighbors that are also essential; p values indicate the probability of seeing the real fraction (the red dot) by chance.
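The degree-versus-essentiality calculation can be sketched as below. This is an illustration only, not the analysis code behind Figure 7: the function names are hypothetical, self-interactions are ignored when counting degree (an assumption), and the toy network is invented.

```python
from collections import defaultdict
from math import sqrt


def degrees(edges):
    """Degree of each protein, counting distinct partners and ignoring
    self-interactions (an assumption made for this sketch)."""
    adj = defaultdict(set)
    for a, b in edges:
        if a != b:
            adj[a].add(b)
            adj[b].add(a)
    return {protein: len(partners) for protein, partners in adj.items()}


def essential_fraction_by_degree(edges, essential):
    """Map each degree k to the fraction of degree-k proteins that are essential."""
    groups = defaultdict(list)
    for protein, k in degrees(edges).items():
        groups[k].append(protein in essential)
    return {k: sum(flags) / len(flags) for k, flags in sorted(groups.items())}


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Toy network: a hub "h" with three partners plus one extra edge.
toy_edges = [("h", "a"), ("h", "b"), ("h", "c"), ("a", "b")]
toy_essential = {"h", "a"}
fractions = essential_fraction_by_degree(toy_edges, toy_essential)
r = pearson_r(list(fractions.keys()), list(fractions.values()))
```

In the toy network the essential fraction rises from 0 at degree 1 to 1 at degree 3, so the correlation is perfect; real data would be applied to the interactions that pass the confidence-score threshold.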
Additional data files
The following additional data are available with the online version of this paper. Additional data file 1 is a table summarizing proteome coverage from large-scale interaction screens. Additional data file 2 is a table listing the representation of functional categories amongst the proteins in the CampyYTH v3.1 dataset. Additional data file 3 is a table listing the GO category representation amongst the proteins in CampyYTH v3.1. Additional data file 4 lists C. jejuni genes that were toxic or inhibitory to yeast growth. Additional data file 5 is a table comparing network features across organisms. Additional data file 6 lists conserved subnetworks between C. jejuni and E. coli or C. jejuni and yeast. Additional data file 7 lists predicted C. jejuni protein interactions. Additional data file 8 is a higher resolution version of Figure 5, showing hierarchical clustering of conserved subnetworks. Additional data file 9 is a table listing enriched functions within the cores and modules of Figure 5. Additional data file 10 is a table showing GO enrichment amongst the C. jejuni protein interactions. Additional data file 11 is a figure showing that essential proteins interact with each other more often than expected by chance. Additional data file 12 is a table of C. jejuni interologs predicted from large-scale protein interaction analyses performed for E. coli or H. pylori. Additional data file 13 is an annotated list of all C. jejuni protein interactions in the CampyYTH v3.1 dataset. Additional data file 14 includes supplementary materials and methods.
We thank Thawornchai Limjindaporn, Dima El-Khechen, Keith Gulyas, Meghan Hurt, and Rohinton Tarapore for technical assistance, and Michigan Proteome Consortium members and Janine Maddock for helpful discussions. This work was supported in part by grant RR18327 from the National Center for Research Resources, a component of the National Institutes of Health, and by grant HG001536 from the National Human Genome Research Institute.
- Butland G, Peregrin-Alvarez JM, Li J, Yang W, Yang X, Canadien V, Starostine A, Richards D, Beattie B, Krogan N, et al: Interaction network containing conserved and essential protein complexes in Escherichia coli. Nature. 2005, 433: 531-537. 10.1038/nature03239.
- Gavin AC, Bosche M, Krause R, Grandi P, Marzioch M, Bauer A, Schultz J, Rick JM, Michon AM, Cruciat CM, et al: Functional organization of the yeast proteome by systematic analysis of protein complexes. Nature. 2002, 415: 141-147. 10.1038/415141a.
- Gavin AC, Aloy P, Grandi P, Krause R, Boesche M, Marzioch M, Rau C, Jensen LJ, Bastuck S, Dumpelfeld B, et al: Proteome survey reveals modularity of the yeast cell machinery. Nature. 2006, 440: 631-636. 10.1038/nature04532.
- Ho Y, Gruhler A, Heilbut A, Bader GD, Moore L, Adams SL, Millar A, Taylor P, Bennett K, Boutilier K, et al: Systematic identification of protein complexes in Saccharomyces cerevisiae by mass spectrometry. Nature. 2002, 415: 180-183. 10.1038/415180a.
- Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, Li J, Pu S, Datta N, Tikuisis AP, et al: Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature. 2006, 440: 637-643. 10.1038/nature04670.
- Arifuzzaman M, Maeda M, Itoh A, Nishikata K, Takita C, Saito R, Ara T, Nakahigashi K, Huang HC, Hirai A, et al: Large-scale identification of protein-protein interaction of Escherichia coli K-12. Genome Res. 2006, 16: 686-691. 10.1101/gr.4527806.
- Fields S, Song O: A novel genetic system to detect protein-protein interactions. Nature. 1989, 340: 245-246. 10.1038/340245a0.
- Fields S: High-throughput two-hybrid analysis. The promise and the peril. Febs J. 2005, 272: 5391-5399. 10.1111/j.1742-4658.2005.04973.x.
- Cusick ME, Klitgord N, Vidal M, Hill DE: Interactome: gateway into systems biology. Hum Mol Genet. 2005, 14 (Spec no 2): R171-181. 10.1093/hmg/ddi335.
- Parrish JR, Gulyas KD, Finley RL: Yeast two-hybrid contributions to interactome mapping. Curr Opin Biotechnol. 2006, 17: 387-393. 10.1016/j.copbio.2006.06.006.
- Rain JC, Selig L, De Reuse H, Battaglia V, Reverdy C, Simon S, Lenzen G, Petel F, Wojcik J, Schachter V, et al: The protein-protein interaction map of Helicobacter pylori. Nature. 2001, 409: 211-215. 10.1038/35051615.
- Blaser MJ: Epidemiologic and clinical features of Campylobacter jejuni infections. J Infect Dis. 1997, 176 (Suppl 2): S103-105.
- Godschalk PC, Heikema AP, Gilbert M, Komagamine T, Ang CW, Glerum J, Brochu D, Li J, Yuki N, Jacobs BC, et al: The crucial role of Campylobacter jejuni genes in anti-ganglioside antibody induction in Guillain-Barre syndrome. J Clin Invest. 2004, 114: 1659-1665. 10.1172/JCI200415707.
- Nachamkin I, Allos BM, Ho TW: Campylobacter jejuni infection and the association with Guillain-Barré Syndrome. Campylobacter. Edited by: Nachamkin I, Blaser MJ. 2000, Washington, DC: ASM Press, 155-175. 2nd edition.
- Lecuit M, Abachin E, Martin A, Poyart C, Pochart P, Suarez F, Bengoufa D, Feuillard J, Lavergne A, Gordon JI, et al: Immunoproliferative small intestinal disease associated with Campylobacter jejuni. N Engl J Med. 2004, 350: 239-248. 10.1056/NEJMoa031887.
- Campylobacter Resource Facility. [http://www.lshtm.ac.uk/pmbu/crf/updated_embl.htm]
- Finley Lab. [http://proteome.wayne.edu]
- Bowers PM, Pellegrini M, Thompson MJ, Fierro J, Yeates TO, Eisenberg D: Prolinks: a database of protein functional linkages derived from coevolution. Genome Biol. 2004, 5: R35. 10.1186/gb-2004-5-5-r35.
- Walhout AJ, Boulton SJ, Vidal M: Yeast two-hybrid systems and protein interaction mapping projects for yeast and worm. Yeast. 2000, 17: 88-94. 10.1002/1097-0061(20000630)17:2<88::AID-YEA20>3.0.CO;2-Y.
- Pacifico S, Liu G, Guest S, Parrish JR, Fotouhi F, Finley RL: A database and tool, IM Browser, for exploring and integrating emerging gene and protein interaction data for Drosophila. BMC Bioinformatics. 2006, 7: 195. 10.1186/1471-2105-7-195.
- Zhong J, Zhang H, Stanyon CA, Tromp G, Finley RL: A strategy for constructing large protein interaction maps using the yeast two-hybrid system: regulated expression arrays and two-phase mating. Genome Res. 2003, 13: 2691-2699. 10.1101/gr.1134603.
- Stanyon CA, Liu G, Mangiola BA, Patel N, Giot L, Kuang B, Zhang H, Zhong J, Finley RL: A Drosophila protein-interaction map centered on cell-cycle regulators. Genome Biol. 2004, 5: R96. 10.1186/gb-2004-5-12-r96.
- Finley RL, Zhang H, Zhong J, Stanyon CA: Regulated expression of proteins in yeast using the MAL61-62 promoter and a mating scheme to increase dynamic range. Gene. 2002, 285: 49-57. 10.1016/S0378-1119(02)00420-1.
- Bader JS, Chaudhuri A, Rothberg JM, Chant J: Gaining confidence in high-throughput protein interaction networks. Nat Biotechnol. 2004, 22: 78-85. 10.1038/nbt924.
- Giot L, Bader JS, Brouwer C, Chaudhuri A, Kuang B, Li Y, Hao YL, Ooi CE, Godwin B, Vitols E, et al: A protein interaction map of Drosophila melanogaster. Science. 2003, 302: 1727-1736. 10.1126/science.1090289.
- Parkhill J, Wren BW, Mungall K, Ketley JM, Churcher C, Basham D, Chillingworth T, Davies RM, Feltwell T, Holroyd S, et al: The genome sequence of the food-borne pathogen Campylobacter jejuni reveals hypervariable sequences. Nature. 2000, 403: 665-668. 10.1038/35001088.
- Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, Davis AP, Dolinski K, Dwight SS, Eppig JT, et al: Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat Genet. 2000, 25: 25-29. 10.1038/75556.
- Xenarios I, Rice DW, Salwinski L, Baron MK, Marcotte EM, Eisenberg D: DIP: the database of interacting proteins. Nucleic Acids Res. 2000, 28: 289-291. 10.1093/nar/28.1.289.
- Tanaka R, Yi TM, Doyle J: Some protein interaction data do not exhibit power law statistics. FEBS Lett. 2005, 579: 5140-5144. 10.1016/j.febslet.2005.08.024.
- Schwikowski B, Uetz P, Fields S: A network of protein-protein interactions in yeast. Nat Biotechnol. 2000, 18: 1257-1261. 10.1038/82360.
- Li S, Armstrong CM, Bertin N, Ge H, Milstein S, Boxem M, Vidalain PO, Han JD, Chesneau A, Hao T, et al: A map of the interactome network of the metazoan C. elegans. Science. 2004, 303: 540-543. 10.1126/science.1091403.
- Saito R, Suzuki H, Hayashizaki Y: Interaction generality, a measurement to assess the reliability of a protein-protein interaction. Nucleic Acids Res. 2002, 30: 1163-1168. 10.1093/nar/30.5.1163.
- Goldberg DS, Roth FP: Assessing experimentally derived interactions in a small world. Proc Natl Acad Sci USA. 2003, 100: 4372-4376. 10.1073/pnas.0735871100.
- Sharan R, Suthram S, Kelley RM, Kuhn T, McCuine S, Uetz P, Sittler T, Karp RM, Ideker T: Conserved patterns of protein interaction in multiple species. Proc Natl Acad Sci USA. 2005, 102: 1974-1979. 10.1073/pnas.0409522102.
- Jansen R, Yu H, Greenbaum D, Kluger Y, Krogan NJ, Chung S, Emili A, Snyder M, Greenblatt JF, Gerstein M: A Bayesian networks approach for predicting protein-protein interactions from genomic data. Science. 2003, 302: 449-453. 10.1126/science.1087361.
- Ozin AJ, Claret L, Auvray F, Hughes C: The FliS chaperone selectively binds the disordered flagellin C-terminal D0 domain central to polymerisation. FEMS Microbiol Lett. 2003, 219: 219-224. 10.1016/S0378-1097(02)01208-9.
- Evdokimov AG, Phan J, Tropea JE, Routzahn KM, Peters HK, Pokross M, Waugh DS: Similar modes of polypeptide recognition by export chaperones in flagellar biosynthesis and type III secretion. Nat Struct Biol. 2003, 10: 789-793. 10.1038/nsb982.
- Song YC, Jin S, Louie H, Ng D, Lau R, Zhang Y, Weerasekera R, Al Rashid S, Ward LA, Der SD, et al: FlaC, a protein of Campylobacter jejuni TGH9011 (ATCC43431) secreted through the flagellar apparatus, binds epithelial cells and influences cell invasion. Mol Microbiol. 2004, 53: 541-553. 10.1111/j.1365-2958.2004.04175.x.
- Marchant J, Wren B, Ketley J: Exploiting genome sequence: predictions for mechanisms of Campylobacter chemotaxis. Trends Microbiol. 2002, 10: 155-159. 10.1016/S0966-842X(02)02323-5.
- Welch M, Oosawa K, Aizawa S, Eisenbach M: Phosphorylation-dependent binding of a signal molecule to the flagellar switch of bacteria. Proc Natl Acad Sci USA. 1993, 90: 8787-8791. 10.1073/pnas.90.19.8787.
- Sanders DA, Gillece-Castro BL, Stock AM, Burlingame AL, Koshland DE: Identification of the site of phosphorylation of the chemotaxis response regulator protein, CheY. J Biol Chem. 1989, 264: 21770-21778.
- Hendrixson DR, Akerley BJ, DiRita VJ: Transposon mutagenesis of Campylobacter jejuni identifies a bipartite energy taxis system required for motility. Mol Microbiol. 2001, 40: 214-224. 10.1046/j.1365-2958.2001.02376.x.
- Wojcik J, Boneca IG, Legrain P: Prediction, assessment and validation of protein interaction maps in bacteria. J Mol Biol. 2002, 323: 763-770. 10.1016/S0022-2836(02)01009-4.
- Camilli A, Bassler BL: Bacterial small-molecule signaling pathways. Science. 2006, 311: 1113-1116. 10.1126/science.1121357.
- Gandhi TK, Zhong J, Mathivanan S, Karthick L, Chandrika KN, Mohan SS, Sharma S, Pinkert S, Nagaraju S, Periaswamy B, et al: Analysis of the human protein interactome and comparison with yeast, worm and fly interaction datasets. Nat Genet. 2006, 38: 285-293. 10.1038/ng1747.
- Reguly T, Breitkreutz A, Boucher L, Breitkreutz BJ, Hon GC, Myers CL, Parsons A, Friesen H, Oughtred R, Tong A, et al: Comprehensive curation and analysis of global interaction networks in Saccharomyces cerevisiae. J Biol. 2006, 5: 11. 10.1186/jbiol36.
- Jeong H, Mason SP, Barabasi AL, Oltvai ZN: Lethality and centrality in protein networks. Nature. 2001, 411: 41-42. 10.1038/35075138.
- Meyer RR, Laine PS: The single-stranded DNA-binding protein of Escherichia coli. Microbiol Rev. 1990, 54: 342-380.
- Lovett ST, Kolodner RD: Identification and purification of a single-stranded-DNA-specific exonuclease encoded by the recJ gene of Escherichia coli. Proc Natl Acad Sci USA. 1989, 86: 2627-2631. 10.1073/pnas.86.8.2627.
- Lovett ST, Clark AJ: Genetic analysis of the recJ gene of Escherichia coli K-12. J Bacteriol. 1984, 157: 190-196.
- Burdett V, Baitinger C, Viswanathan M, Lovett ST, Modrich P: In vivo requirement for RecJ, ExoVII, ExoI, and ExoX in methyl-directed mismatch repair. Proc Natl Acad Sci USA. 2001, 98: 6765-6770. 10.1073/pnas.121183298.
- Han ES, Cooper DL, Persky NS, Sutera VA, Whitaker RD, Montello ML, Lovett ST: RecJ exonuclease: substrates, products and interaction with SSB. Nucleic Acids Res. 2006, 34: 1084-1091. 10.1093/nar/gkj503.
- Poehlsgaard J, Douthwaite S: The bacterial ribosome as a target for antibiotics. Nat Rev Microbiol. 2005, 3: 870-881. 10.1038/nrmicro1265.
- Suzuki H, Kamatani S, Kim ES, Kumagai H: Aminopeptidases A, B, and N and dipeptidase D are the four cysteinylglycinases of Escherichia coli K-12. J Bacteriol. 2001, 183: 1489-1490. 10.1128/JB.183.4.1489-1490.2001.
- Lauhon CT: Requirement for IscS in biosynthesis of all thionucleosides in Escherichia coli. J Bacteriol. 2002, 184: 6820-6829. 10.1128/JB.184.24.6820-6829.2002.
- Formstecher E, Aresta S, Collura V, Hamburger A, Meil A, Trehin A, Reverdy C, Betin V, Maire S, Brun C, et al: Protein interaction mapping: a Drosophila case study. Genome Res. 2005, 15: 376-384. 10.1101/gr.2659105.
- Uetz P, Giot L, Cagney G, Mansfield TA, Judson RS, Knight JR, Lockshon D, Narayan V, Srinivasan M, Pochart P, et al: A comprehensive analysis of protein-protein interactions in Saccharomyces cerevisiae. Nature. 2000, 403: 623-627. 10.1038/35001009.
- Ito T, Tashiro K, Muta S, Ozawa R, Chiba T, Nishizawa M, Yamamoto K, Kuhara S, Sakaki Y: Toward a protein-protein interaction map of the budding yeast: A comprehensive system to examine two-hybrid interactions in all possible combinations between the yeast proteins. Proc Natl Acad Sci USA. 2000, 97: 1143-1147. 10.1073/pnas.97.3.1143.
- Ito T, Chiba T, Ozawa R, Yoshida M, Hattori M, Sakaki Y: A comprehensive two-hybrid analysis to explore the yeast protein interactome. Proc Natl Acad Sci USA. 2001, 98: 4569-4574. 10.1073/pnas.061034498.
- von Mering C, Krause R, Snel B, Cornell M, Oliver SG, Fields S, Bork P: Comparative assessment of large-scale data sets of protein-protein interactions. Nature. 2002, 417: 399-403. 10.1038/nature750.
- Karimova G, Pidoux J, Ullmann A, Ladant D: A bacterial two-hybrid system based on a reconstituted signal transduction pathway. Proc Natl Acad Sci USA. 1998, 95: 5752-5756. 10.1073/pnas.95.10.5752.
- Johnsson N, Varshavsky A: Split ubiquitin as a sensor of protein interactions in vivo. Proc Natl Acad Sci USA. 1994, 91: 10340-10344. 10.1073/pnas.91.22.10340.
- Gyuris J, Golemis E, Chertkov H, Brent R: Cdi1, a human G1 and S phase protein phosphatase that associates with Cdk2. Cell. 1993, 75: 791-803. 10.1016/0092-8674(93)90498-F.
- Parrish JR, Limjindaporn T, Hines JA, Liu J, Liu G, Finley RL: High-throughput cloning of Campylobacter jejuni ORFs by in vivo recombination in Escherichia coli. J Proteome Res. 2004, 3: 582-586. 10.1021/pr0341134.
- Jafari-Khouzani K, Soltanian-Zadeh H, Fotouhi F, Parrish JR, Finley RL: Automated segmentation and classification of high throughput yeast assay spots. Trans Med Imaging. 2007.
- McCullagh P, Nelder JA: Generalized Linear Models. 1998, Boca Raton, FL: Chapman and Hall/CRC, 2nd edition.
- Wellcome Trust Sanger Institute Campylobacter jejuni. [http://www.sanger.ac.uk/Projects/C_jejuni/]
- Fouts DE, Mongodin EF, Mandrell RE, Miller WG, Rasko DA, Ravel J, Brinkac LM, DeBoy RT, Parker CT, Daugherty SC, et al: Major structural differences and novel potential virulence mechanisms from the genomes of multiple campylobacter species. PLoS Biol. 2005, 3: e15. 10.1371/journal.pbio.0030015.
- Wu CH, Huang H, Nikolskaya A, Hu Z, Barker WC: The iProClass integrated database for protein functional analysis. Comput Biol Chem. 2004, 28: 87-96. 10.1016/j.compbiolchem.2003.10.003.
- Blattner FR, Plunkett G, Bloch CA, Perna NT, Burland V, Riley M, Collado-Vides J, Glasner JD, Rode CK, Mayhew GF, et al: The complete genome sequence of Escherichia coli K-12. Science. 1997, 277: 1453-1474. 10.1126/science.277.5331.1453.
- Wu CH, Huang H, Arminski L, Castro-Alvear J, Chen Y, Hu ZZ, Ledley RS, Lewis KC, Mewes HW, Orcutt BC, et al: The Protein Information Resource: an integrated public resource of functional annotation of proteins. Nucleic Acids Res. 2002, 30: 35-37. 10.1093/nar/30.1.35.
- Christie KR, Weng S, Balakrishnan R, Costanzo MC, Dolinski K, Dwight SS, Engel SR, Feierbach B, Fisk DG, Hirschman JE, et al: Saccharomyces Genome Database (SGD) provides tools to identify and analyze sequences from Saccharomyces cerevisiae and related sequences from other organisms. Nucleic Acids Res. 2004, 32 (Database issue): D311-314. 10.1093/nar/gkh033.
- Ideker Lab. [http://chianti.ucsd.edu/]
- Knudsen S, Workman C, Sicheritz-Ponten T, Friis C: GenePublisher: Automated analysis of DNA microarray data. Nucleic Acids Res. 2003, 31: 3471-3476. 10.1093/nar/gkg629.
- Kobayashi K, Ehrlich SD, Albertini A, Amati G, Andersen KK, Arnaud M, Asai K, Ashikaga S, Aymerich S, Bessieres P, et al: Essential Bacillus subtilis genes. Proc Natl Acad Sci USA. 2003, 100: 4678-4683. 10.1073/pnas.0730515100.
- Baba T, Ara T, Hasegawa M, Takai Y, Okumura Y, Baba M, Datsenko KA, Tomita M, Wanner BL, Mori H: Construction of Escherichia coli K-12 in-frame, single-gene knockout mutants: the Keio collection. Mol Syst Biol. 2006, 2: 2006.0008. 10.1038/msb4100050.
This article is published under license to BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.