KEGG spider: interpretation of genomics data in the context of the global gene metabolic network
© Antonov et al.; licensee BioMed Central Ltd. 2008
Received: 7 August 2008
Accepted: 18 December 2008
Published: 18 December 2008
KEGG spider is a web-based tool for interpretation of experimentally derived gene lists in order to gain understanding of metabolism variations at a genomic level. KEGG spider implements a 'pathway-free' framework that overcomes a major bottleneck of enrichment analyses: it provides global models uniting genes from different metabolic pathways. Analyzing a number of experimentally derived gene lists, we demonstrate that KEGG spider provides deeper insights into metabolism variations in comparison to existing methods.
In the post-genomic era the targets of many experimental studies are complex cell disorders [1–6]. A standard experimental strategy is to compare the genetic/proteomics signatures of cells in normal and anomalous states. As a result, a set of genes with differential activity is delivered. In the next step, the interpretation of identified genes in a model context is required. A widely accepted strategy is to infer biological processes that are most relevant to the analyzed gene list. The inference is based on prior knowledge of individual gene properties, such as gene biological functions or interactions. This common approach is usually referred to as enrichment analysis [7–16].
The Kyoto Encyclopedia of Genes and Genomes (KEGG) is a knowledge base for the networks of genes and metabolic compounds. The major component of KEGG is the PATHWAY database, which consists of graphical diagrams of biochemical pathways, including most of the known metabolic pathways. Several publicly available tools, such as GenMAPP/MAPPfinder, PathwayProcessor, and PathwayMiner, use standard enrichment analysis to find overrepresented pathways within a gene list. However, for statistical evaluation these tools use only information about gene pathway membership, while information about pathway topology is largely discarded. Additionally, several tools provide visualizations of pathways reported to be enriched [19–21]. Some tools provide visualizations of a gene list in the context of the global metabolic network [22, 23], but without any quantitative or statistical analysis. Visual inspection of genes drawn on the global metabolic network gives only an intuitive feeling that the genes are related. Given the density of metabolic networks, the value of a statistical treatment must not be underestimated. Even for randomly generated gene lists, it is possible to connect many of the genes into a metabolic subnetwork through one or two intermediate partners. A graphical representation may have low scientific value if it is not accompanied by a quantitative estimate of the model quality.
More complex statistical methods have been proposed to take pathway topology into account by developing specialized scoring functions. For example, in the ScorePAGE method the distance between genes within the metabolic pathway is included in the scoring function. In this case, the impact of a pair of genes is weighted with respect to the distance between the genes within the metabolic pathway. Another recently proposed procedure (impact analysis) exploits the hierarchical structure of signaling pathways and weights the impact of genes with respect to their position in the pathway hierarchy. Genes at the top of the signaling cascade receive higher impact in comparison to downstream genes.
We propose a novel statistical approach for the analysis of gene lists in the context of gene metabolic pathways that uses network topology to make knowledge inference. Our approach does not evaluate each individual KEGG metabolic pathway separately, but uses a global gene metabolic network that integrates all KEGG metabolic pathways together. The input gene list is translated into a network model, that is, edges connect genes that most probably affect the state of each other. We also propose a robust statistical treatment of the inferred network. As output, our procedure provides a graphical model as well as the statistical significance of the inferred network, computed by a Monte Carlo simulation procedure. We show on several real data sets that our approach provides deeper insight into variations of metabolic pathways covered by the given gene list in comparison to currently available methods.
Results and discussion
We assume that the closer two genes are on the global gene metabolic network, the greater the probability that a change in the state of one gene will affect the state of the other. In the illustrative example in Figure 1, ASS1 and ASL are both associated with L-argininosuccinate. Thus, a change in the state of ASS1 (for example, overexpression) most probably affects the amount of L-argininosuccinate in the cell (Figure 1). There are probably many ways the cell can handle extra amounts of L-argininosuccinate. One of them is to increase the efficiency of its utilization through possible metabolic reactions. The cell response can be an increased level of ASL expression. ASL overexpression will speed up the transformation of L-argininosuccinate into fumarate and arginine. Thus, even if two genes are not directly involved in regulatory relationships, but catalyze close reactions on the global network, they can affect the state of each other through auto-regulatory mechanisms triggered by abnormal amounts of common metabolites.
KEGG spider is a freely available web-based tool that implements a global metabolic network framework for the interpretation of gene lists. It has a simple interface: as input it accepts several types of gene or protein identifiers. For example, for the human genome, KEGG spider supports identifiers from 'Entrez Gene', 'UniProt/Swiss-Prot', 'Gene Symbol' [27, 28], 'UniGene', 'Ensembl', 'RefSeq Protein ID', 'RefSeq Transcript ID', and 'Affymetrix probe codes'. As output, the user gets a report on the statistical significance of the inferred network models (D1, D2, ...), as well as a catalog of enriched KEGG pathways and Gene Ontology terms. For each model (D1, D2, ...), a link is provided to obtain a graphical visualization. The visualization is performed by the Medusa package. In addition, the user can highlight genes from the model according to KEGG canonical pathways. The inferred network models can be downloaded as a text file and used with freely available packages for network analysis and visualization [32, 33].
Here, we present several examples of analysis of published experimental data by KEGG spider. To illustrate the advantages experimental researchers would gain by using KEGG spider in comparison to commonly used pathway enrichment analyses, we provide a comparison between KEGG spider and GENECODIS, a tool recently published in Genome Biology that can perform enrichment analysis of KEGG pathways. The choice of GENECODIS was arbitrary, as the results of enrichment analyses of KEGG pathways by other tools would be similar.
We also provide a comparison (Additional data file 1) of KEGG spider to KEGG atlas. KEGG atlas is a web tool that provides visualization of a gene list (converted into KEGG KO identifiers) in the context of the global metabolic network. As discussed above, KEGG atlas provides no quantitative or statistical analyses and, thus, supplies no criteria for evaluating the quality of the graphical output. As demonstrated there, the output of KEGG atlas for a random gene list looks similar to the output for experimentally derived gene lists.
Identification of genes commonly up- or downregulated in diffuse-type gastric cancers
KEGG metabolic pathways enriched in the list of 150 genes (28 genes map to KEGG metabolic pathways) commonly up- or down-regulated in diffuse-type gastric cancers (reported by GENECODIS)
Number of genes
P-value (not corrected for multiple testing)
(KEGG) Metabolism of xenobiotics by cytochrome P450
(KEGG) Bile acid biosynthesis
Therefore, in comparison to available analytical procedures, KEGG spider enhances our understanding of metabolism variation in gastric cancers. First, it demonstrates that deregulated genes do not split into independent groups (pathways), as might be concluded from standard enrichment analyses: 24 of the 28 genes form a non-interrupted network (a maximum of one missing gene is allowed). Second, it not only shows that the 24 genes map close to each other on the global metabolic network but also estimates the confidence of this event: the p-value reflects the probability of obtaining a non-interrupted network that covers at least the same number of genes for a randomly sampled list of 28 genes (only genes mapped to KEGG metabolic pathways are used to generate the random lists).
Proteomic analysis of livers of patients with primary hepatolithiasis
Primary hepatolithiasis, or intrahepatic calculi, is characterized by the formation of gallstones in the intrahepatic bile duct. It is an intractable liver disease and is suspected to be one of the causes of cholangiocellular carcinoma. To gain insight into the disease, a proteomic analysis of liver tissue specimens (affected and unaffected hepatic segments from patients with primary hepatolithiasis) was performed. For the specimens from the unaffected segments, 83 unique proteins were reported. For the specimens from the affected segments, 74 unique proteins were reported. Consequently, 12 up-regulated proteins and 21 down-regulated proteins were identified in affected versus unaffected hepatic segments.
KEGG metabolic pathways enriched in the list of 21 down-regulated proteins (affected versus unaffected hepatic segments) reported by GENECODIS
Number of genes
P-value (not corrected for multiple testing)
(KEGG) Urea cycle and metabolism of amino groups
Large scale benchmark of KEGG spider
To support the practical significance of KEGG spider, we collected dozens of recently published experimental studies that reported lists of genes/proteins in various biological contexts. We reanalyzed them using KEGG spider and found that, in most cases, the models provided by KEGG spider improve our understanding of the genetic basis of metabolism variations. These results can be found at the KEGG spider web site.
Large-scale comparison between KEGG spider and GENECODIS
Proteomic analysis of primary cell lines identifies protein changes present in renal cell carcinoma 
Table 1: proteins found to be differentially expressed between matched normal and RCC primary lines
Proteomic analysis of anaplastic lymphoma cell lines: identification of potential tumour markers 
Table 2: proteins overexpressed in FE-PD cells compared to SU-DHL-1 cells
Differential expression profiling of human pancreatic adenocarcinoma and healthy pancreatic tissue 
Table 3: proteins at higher levels in normal pancreas compared to pancreatic cancer
Proteomic search for potential diagnostic markers and therapeutic targets for ovarian clear cell adenocarcinoma 
Table 1: differentially expressed proteins in human ovarian cancer cells
Quantitative proteomic analysis to discover potential diagnostic markers and therapeutic targets in human renal cell carcinoma 
Table 3: differentially expressed proteins in RCC patients
Protein profile changes in the human breast cancer cell line MCF-7 in response to SEL1L 
Table 4: MCF7-SEL1L differentially expressed genes identified by microarray analysis
Protein dysregulation in mouse hippocampus polytransgenic for chromosome 21 structures in the Down syndrome critical region 
Table 2: list of proteins dysregulated in hippocampus of polytransgenic mice
Differential expression of proteins in response to ceramide-mediated stress signal in colon cancer cells by 2-D gel electrophoresis and MALDI-TOF-MS 
Table 1: list of identified proteins on HCT116 2-DE gels
Subcellular proteome analysis of camptothecin analogue NSC606985-treated acute myeloid leukemic cells 
Table 2: functional classifications of the deregulated proteins in NSC606985-induced apoptotic NB4 cells
Proteome analysis of responses to ascochlorin in a human osteosarcoma cell line by 2-D gel electrophoresis and MALDI-TOF MS 
Table 2: differentially expressed proteins in ascochlorin-treated U2OS cells
Quantitative proteomic and genomic profiling reveals metastasis-related protein expression patterns in gastric cancer cells 
Table 1: summary of differentially expressed proteins and their functional classifications
Proteomic analysis of the resistance to aplidin in human cancer cells 
Table 1: differentially expressed proteins between resistant and wild-type HeLa cells identified in the membrane fraction
Proteomic analysis of the resistance to aplidin in human cancer cells 
Table 2: differentially expressed proteins between resistant and wild-type HeLa cells identified in the cytosolic fraction
Identification of specific protein markers in microdissected hepatocellular carcinoma
Table 2: identified proteins from HCC and nontumorous liver tissue by in-gel digestion and SELDI-MS
Comparison of membrane-associated proteins in human cholangiocarcinoma and hepatocellular carcinoma cell lines 
Table 1: list of proteins from the membrane fraction of HuCCA-1 and HCC-S102 cell lines which show up-regulated expression
Contribution of laser microdissection-based technology to proteomic analysis in hepatocellular carcinoma developing on cirrhosis 
Table 1: proteins differentially expressed in tumorous LM-hepatocytes and total homogenates samples identified PMF
Proteome alterations induced in human white blood cells by consumption of Brussels sprouts: results of a pilot intervention study 
Table 1: protein alterations induced by a controlled dietary intervention with Brussels sprouts in human primary white blood cells
Recent advances in genomics technologies allow for the detection of genes with differential activities between various cell states. Since metabolic processes are at the heart of the cell, they are often subject to variations in disease cell states. A complete understanding of metabolism variations can give clues to possible metabolism-related treatment of the studied cell disorders. As has been demonstrated, KEGG spider provides a comprehensive interpretation of genomics data related to metabolism variations. In addition, the KEGG spider network models incorporate not only genomics information, but also specify small molecules whose metabolism might be affected. This feature provides a link between genomics and rapidly developing high-throughput metabolomics technologies. We expect experimental studies that use both techniques in parallel to become more common in the near future. For such studies, the interpretational models provided by KEGG spider can connect genomics and metabolomics data.
We would like to point out that the idea of inferring a network model from a gene list based on external knowledge is not completely new; for example, there are commercial packages available, such as Ingenuity Pathway Analysis software, which transforms a list of genes into a set of networks according to an internal database of pairwise gene relationships. As we already mentioned, some free online tools exist [18–21] that allow one to visualize together several metabolic pathways that are related to the input gene list. However, visual analysis of graphical representations of genes on metabolic pathways gives only an intuitive feeling that the discovered genes are related. Given the density of the global gene metabolic network, the value of a statistical treatment must not be underestimated. Even for randomly generated gene lists, it is possible to connect many of the genes into a subnetwork through one or two intermediate partners. An attractive figure may have low scientific value without a statistical treatment of the presented network model.
To our knowledge, none of the currently existing tools that infer network models from gene lists provides a robust statistical treatment of the inferred network models. For example, the statistical scores provided by Ingenuity Pathway Analysis do not take into account the topology of the reference network and provide statistically significant scores even for random gene lists. In contrast, KEGG spider implements a robust statistical treatment of the inferred network models, based on the topology of the global metabolic network, and estimates the p-values by a Monte Carlo simulation procedure. The p-values provided by KEGG spider reflect the probability of obtaining a network model of the same or greater size for a random gene list.
Examples of analysis of disease-specific genes by KEGG spider suggest that the separation of metabolic reactions into canonical pathways is, to some degree, artificial. In most cases, metabolism-related genes were from several KEGG canonical pathways. However, the analysis with KEGG spider reveals that, if one considers the topology of the global gene metabolic network, these genes form a non-interrupted (a maximum of one or two genes are missing) disease-specific pathway that runs through several canonical pathways. These results also support the hypothesis that disease-specific metabolism variations are, in most cases, not independent; that is, deregulated genes from different pathways are linked to each other via consecutive one- or two-step metabolic reactions. The examples of analysis of disease-specific genes by KEGG spider presented in Table 3 may serve as support for this hypothesis.
Finally, we would like to summarize the power and limitations of KEGG spider. In comparison to other tools, KEGG spider provides a robust analytical framework for the interpretation of gene lists in the context of a global gene metabolic network. Information on pairwise gene relationships is used extensively (gene A is related to gene B via metabolite C) and the inferred network model is not limited to the size of one metabolic pathway. In its current form, KEGG spider computes the distance between any two genes as the minimal number of steps required to get from one gene to the other. A more realistic way to model the distance between genes would be a weighted approach that considers not only the number of steps but also the impact of each step. This methodological extension is a possible future improvement of KEGG spider. We would also like to point out that the output models are limited by the information on cell metabolism available in the KEGG database.
Materials and methods
A global gene metabolic network
The KEGG REACTION database is a collection of chemical structure transformation patterns for substrate-product pairs (reactant pairs). We can build a global 'reaction network' (reactions are nodes, compounds are edges) by connecting with an edge any two reactions that share a compound. In general, a reaction consists of multiple reactant pairs, and the one that appears on the KEGG metabolic pathway is called the main pair. To build the global reaction network, we used only compounds that belong to main reaction pairs. Otherwise, many reactions would be connected only because they use or produce ubiquitous compounds such as H2O and CO2.
In KEGG, reactions are linked to orthologous groups of enzymes (KEGG ORTHOLOGY database) and orthologous groups are mapped to genes (in most cases each orthologous group corresponds to orthologous genes from different genomes). Thus, reactions can be mapped to genes from a given genome, and the reaction network can be transformed into a global organism-specific gene metabolic network, where genes are nodes and compounds are edges. Some reactions are organism specific or are not annotated with an orthologous group. Such reactions are not present in the corresponding organism-specific gene network. The resulting global gene metabolic network therefore links by an edge any two genes that are associated with reactions sharing a common compound (from the main reaction pair).
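The construction described above can be sketched in a few lines of Python. This is a minimal illustration, not the KEGG spider implementation: the function name `build_gene_network`, the input dictionaries, and the reaction identifiers are all hypothetical, and the toy data only mimic the ASS1/ASL example. The sketch also assumes that only pairs of distinct reactions are linked.

```python
from collections import defaultdict
from itertools import combinations


def build_gene_network(reaction_main_compounds, reaction_genes):
    """Return an adjacency map: gene -> set of neighbouring genes.

    reaction_main_compounds: reaction id -> set of main-pair compound ids
    reaction_genes: reaction id -> set of gene ids for one organism
    (both inputs are hypothetical, pre-parsed mappings)
    """
    # Index reactions by the main-pair compounds they use or produce.
    compound_reactions = defaultdict(set)
    for reaction, compounds in reaction_main_compounds.items():
        # Reactions without a gene in this organism are left out.
        if reaction_genes.get(reaction):
            for compound in compounds:
                compound_reactions[compound].add(reaction)

    # Link genes whose reactions share at least one main-pair compound.
    adjacency = defaultdict(set)
    for reactions in compound_reactions.values():
        for r1, r2 in combinations(sorted(reactions), 2):
            for g1 in reaction_genes[r1]:
                for g2 in reaction_genes[r2]:
                    if g1 != g2:
                        adjacency[g1].add(g2)
                        adjacency[g2].add(g1)
    return dict(adjacency)


# Toy data (illustrative only, not taken from KEGG).
toy_compounds = {
    "rx1": {"citrulline", "argininosuccinate"},
    "rx2": {"argininosuccinate", "arginine"},
    "rx3": {"arginine", "ornithine"},
    "rx4": {"arginine"},  # no gene annotated, so it is ignored
}
toy_genes = {"rx1": {"ASS1"}, "rx2": {"ASL"}, "rx3": {"ARG1"}, "rx4": set()}

network = build_gene_network(toy_compounds, toy_genes)
print(network)
```

In this toy network ASS1 and ASL are neighbours because their reactions share argininosuccinate, while ASS1 and ARG1 are two steps apart.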
Network inference procedure
The distance between two arbitrary genes is computed as the minimum number of consecutive steps required to get from one gene to another by working through existing paths on the global gene metabolic network. Distance 1 means that two genes are directly connected. Distance 2 means that two genes are connected via one intermediate gene, distance 3 means that two genes are connected via two intermediate genes, and so on. Given a gene list, our purpose is to infer the network model that minimizes the distance between each connected gene pair according to pairwise distances between genes.
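Because every step has the same cost, the distance defined above is a shortest-path length in an unweighted graph, which a breadth-first search computes directly. The following sketch assumes the network is stored as a dictionary mapping each gene to the set of its neighbours; the function name `gene_distance` and the toy chain of genes are hypothetical.

```python
from collections import deque


def gene_distance(adjacency, source, target):
    """Minimum number of steps between two genes, or None if unreachable."""
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        gene, dist = queue.popleft()
        for neighbour in adjacency.get(gene, ()):
            if neighbour == target:
                return dist + 1
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    return None


# Toy chain A - B - C - D (hypothetical gene names); E is not connected.
chain = {"A": {"B"}, "B": {"A", "C"}, "C": {"B", "D"}, "D": {"C"}}

print(gene_distance(chain, "A", "B"))  # directly connected
print(gene_distance(chain, "A", "C"))  # one intermediate gene
print(gene_distance(chain, "A", "D"))  # two intermediate genes
```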
Initially, we map genes from the input list onto the global gene metabolic network. At this point all genes from the input list are disconnected. In the first step, we connect by edges gene pairs with distance 1 and look for connected subnetworks. The subnetwork with the maximal number of genes is referred to as the inferred network model D1. We also refer to the number of genes in the maximal subnetwork as the size of the inferred model. In the second step, genes (from the input list) with distance 2 are connected by edges. The subnetwork with the maximal number of genes is inferred and is referred to as network model D2. In a similar way, network models D3, D4, ... are inferred. Models D2, D3, ... incorporate genes that are not from the input list but are added to connect input genes in the network model. We refer to these added genes as intermediate genes.
The null hypothesis is that the input gene list has no bias in relation to the topology of the global gene metabolic network. A quality measure of the inferred network model can be its size, that is, the number of genes from the input list in the model. We have to estimate the probability of inferring models of the same or greater size from randomly generated gene lists of size N, where N is the number of input genes.
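One way to read this procedure is sketched below. It rests on an assumption that the text does not state explicitly: model Dd links every pair of input genes whose distance is at most d, so that each step keeps the edges of the previous ones. The function names and the toy network are hypothetical, and the sketch returns only the input genes of the largest subnetwork, not the intermediate genes.

```python
from collections import deque


def genes_within(adjacency, source, max_dist):
    """Distances to all genes reachable from source in at most max_dist steps."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        gene = queue.popleft()
        if dist[gene] == max_dist:
            continue  # do not expand beyond the allowed distance
        for neighbour in adjacency.get(gene, ()):
            if neighbour not in dist:
                dist[neighbour] = dist[gene] + 1
                queue.append(neighbour)
    return dist


def infer_model(adjacency, input_genes, max_dist):
    """Largest set of input genes linked by pairwise distances <= max_dist."""
    input_genes = set(input_genes)
    links = {}
    for gene in input_genes:
        reachable = genes_within(adjacency, gene, max_dist)
        links[gene] = {g for g in input_genes if g in reachable and g != gene}

    best = set()
    unvisited = set(input_genes)
    while unvisited:
        start = unvisited.pop()
        component, stack = {start}, [start]
        while stack:  # depth-first walk over linked input genes
            for other in links[stack.pop()]:
                if other not in component:
                    component.add(other)
                    stack.append(other)
        unvisited -= component
        if len(component) > len(best):
            best = component
    return best


def undirected(edges):
    """Build a symmetric adjacency map from a list of gene pairs."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency


# Toy network: A-B-x-C-y-z-D, where x, y, z are not in the input list.
toy = undirected([("A", "B"), ("B", "x"), ("x", "C"),
                  ("C", "y"), ("y", "z"), ("z", "D")])
gene_list = ["A", "B", "C", "D", "E"]

for d in (1, 2, 3):
    print(d, sorted(infer_model(toy, gene_list, d)))
```

The size of each model is simply `len(infer_model(...))`; recovering the intermediate genes would require keeping the shortest paths as well.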
Let us assume that we have N genes in the input list. Using the network inference procedure described above, we infer the network models D1, D2, D3. Let us denote by S1, S2, S3 the number of input genes in the inferred network models D1, D2, D3. The values S1, S2, S3 are used as statistics. To estimate the significance of the inferred model D1, we compare the value S1 with a distribution R1j. In the same way, we estimate the significance of the inferred models D2, D3 by comparing the values S2, S3 with distributions R2j, R3j, respectively.
The distributions R1j, R2j, R3j are computed by a random simulation procedure. To generate the background distributions R1j, R2j, R3j, we repeat the following simulation procedure k times. Index j = 1..k specifies the random simulation. Each time a random gene list Bj of size N (equal to the size of the input list) is generated. The network inference procedure described above is applied to the list Bj and the network models D1j, D2j, D3j are inferred. Let us denote the number of genes from the random list Bj in the inferred network models D1j, D2j, D3j as R1j, R2j, R3j. Thus, after repeating the simulation procedure k times, we get the background distribution R1j (j = 1..k) for model D1, the background distribution R2j (j = 1..k) for model D2, and the background distribution R3j (j = 1..k) for model D3.
To estimate the significance of the inferred network model D1 for the input gene list, the value S1 is compared to the distribution R1j. Let n be the number of values from the distribution R1j that are equal to or greater than S1. The estimate of the p-value p of the inferred network model D1 is computed as p = (n + 1)/k. In the same way, the p-values for models D2 and D3 are computed using values S2 and S3 and background distributions R2j and R3j. In other words, the p-value is estimated as the share of random simulations in which the size of the inferred model for a random gene list (size N) is equal to or greater than the size (S1, S2, S3) of the corresponding inferred model for the input gene list (size N).
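The simulation can be sketched as follows. The function `monte_carlo_p_value` and its parameters are hypothetical; `model_size` stands for any function that returns the size of the inferred model for a gene list, and the toy version used here (counting genes that fall into a fixed cluster) is only a stand-in for the network inference procedure. The estimate follows the formula p = (n + 1)/k given above.

```python
import random


def monte_carlo_p_value(observed_size, universe, list_size, model_size,
                        k=1000, seed=0):
    """Estimate p = (n + 1) / k from k random gene lists of size list_size.

    observed_size: model size S for the input gene list
    universe: genes that random lists are drawn from
    model_size: callable returning the model size for a gene list
    """
    rng = random.Random(seed)   # fixed seed for reproducibility
    universe = sorted(universe)
    n = 0
    for _ in range(k):
        random_list = rng.sample(universe, list_size)  # random list B_j
        if model_size(random_list) >= observed_size:   # R_j >= S
            n += 1
    return (n + 1) / k


# Toy stand-in: "model size" is the number of genes from a fixed cluster.
all_genes = [f"g{i}" for i in range(100)]
cluster = set(all_genes[:10])


def toy_model_size(genes):
    return sum(1 for g in genes if g in cluster)


p = monte_carlo_p_value(observed_size=5, universe=all_genes, list_size=8,
                        model_size=toy_model_size, k=500)
print(p)
```

With this estimator the smallest reportable p-value is 1/k, so k bounds the resolution of the test. A commonly used variant divides by k + 1 instead, which keeps the estimate at or below 1.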
Additional data files
The following additional data files are available with the online version of this paper. Additional data file 1 is a full comparison of KEGG spider to KEGG atlas.
KEGG: Kyoto Encyclopedia of Genes and Genomes.
We thank Philip Wong for helpful discussions.
- Shi Q, Bao S, Song L, Wu Q, Bigner DD, Hjelmeland AB, Rich JN: Targeting SPARC expression decreases glioma cellular survival and invasion associated with reduced activities of FAK and ILK kinases. Oncogene. 2007, 26: 4084-4094. 10.1038/sj.onc.1210181.
- Perroud B, Lee J, Valkova N, Dhirapong A, Lin PY, Fiehn O, Kultz D, Weiss RH: Pathway analysis of kidney cancer using proteomics and metabolic profiling. Mol Cancer. 2006, 5: 64. 10.1186/1476-4598-5-64.
- Marquez RT, Baggerly KA, Patterson AP, Liu J, Broaddus R, Frumovitz M, Atkinson EN, Smith DI, Hartmann L, Fishman D, Berchuck A, Whitaker R, Gershenson DM, Mills GB, Bast RC, Lu KH: Patterns of gene expression in different histotypes of epithelial ovarian cancer correlate with those in normal fallopian tube, endometrium, and colon. Clin Cancer Res. 2005, 11: 6116-6126. 10.1158/1078-0432.CCR-04-2509.
- Loscalzo J, Kohane I, Barabasi AL: Human disease classification in the postgenomic era: a complex systems approach to human pathobiology. Mol Syst Biol. 2007, 3: 124. 10.1038/msb4100163.
- Liu N, Song W, Wang P, Lee K, Chan W, Chen H, Cai Z: Proteomics analysis of differential expression of cellular proteins in response to avian H9N2 virus infection in human cells. Proteomics. 2008, 8: 1851-1858. 10.1002/pmic.200700757.
- Beer DG, Kardia SL, Huang CC, Giordano TJ, Levin AM, Misek DE, Lin L, Chen G, Gharib TG, Thomas DG, Lizyness ML, Kuick R, Hayasaka S, Taylor JM, Iannettoni MD, Orringer MB, Hanash S: Gene-expression profiles predict survival of patients with lung adenocarcinoma. Nat Med. 2002, 8: 816-824.
- Antonov AV, Mewes HW: Complex functionality of gene groups identified from high-throughput data. J Mol Biol. 2006, 363: 289-296. 10.1016/j.jmb.2006.07.062.
- Antonov AV, Schmidt T, Wang Y, Mewes HW: ProfCom: a web tool for profiling the complex functionality of gene groups identified from high-throughput data. Nucleic Acids Res. 2008, 36 (Web Server issue): W347-W351. 10.1093/nar/gkn239.
- Khatri P, Bhavsar P, Bawa G, Draghici S: Onto-Tools: an ensemble of web-accessible, ontology-based tools for the functional design and interpretation of high-throughput gene expression experiments. Nucleic Acids Res. 2004, 32: W449-W456. 10.1093/nar/gkh409.
- Khatri P, Draghici S: Ontological analysis of gene expression data: current tools, limitations, and open problems. Bioinformatics. 2005, 21: 3587-3595. 10.1093/bioinformatics/bti565.
- Khatri P, Sellamuthu S, Malhotra P, Amin K, Done A, Draghici S: Recent additions and improvements to the Onto-Tools. Nucleic Acids Res. 2005, 33: W762-W765. 10.1093/nar/gki472.
- Martin D, Brun C, Remy E, Mouren P, Thieffry D, Jacq B: GOToolBox: functional analysis of gene datasets based on Gene Ontology. Genome Biol. 2004, 5: R101. 10.1186/gb-2004-5-12-r101.
- Masseroli M, Martucci D, Pinciroli F: GFINDer: Genome Function INtegrated Discoverer through dynamic annotation, statistical analysis, and mining. Nucleic Acids Res. 2004, 32: W293-W300. 10.1093/nar/gkh432.
- Reimand J, Kull M, Peterson H, Hansen J, Vilo J: g:Profiler - a web-based toolset for functional profiling of gene lists from large-scale experiments. Nucleic Acids Res. 2007, 35: W193-W200. 10.1093/nar/gkm226.
- Berriz GF, King OD, Bryant B, Sander C, Roth FP: Characterizing gene sets with FuncAssociate. Bioinformatics. 2003, 19: 2502-2504. 10.1093/bioinformatics/btg363.
- Antonov AV, Mewes HW: Complex phylogenetic profiling reveals fundamental genotype-phenotype associations. Comput Biol Chem. 2008, 32: 412-416. 10.1016/j.compbiolchem.2008.07.003.
- Doniger SW, Salomonis N, Dahlquist KD, Vranizan K, Lawlor SC, Conklin BR: MAPPFinder: using Gene Ontology and GenMAPP to create a global gene-expression profile from microarray data. Genome Biol. 2003, 4: R7. 10.1186/gb-2003-4-1-r7.
- Pandey R, Guru RK, Mount DW: Pathway Miner: extracting gene association networks from molecular pathways for predicting the biological significance of gene expression microarray data. Bioinformatics. 2004, 20: 2156-2158. 10.1093/bioinformatics/bth215.
- Goffard N, Weiller G: PathExpress: a web-based tool to identify relevant pathways in gene expression data. Nucleic Acids Res. 2007, 35: W176-W181. 10.1093/nar/gkm261.
- Adler P, Reimand J, Janes J, Kolde R, Peterson H, Vilo J: KEGGanim: pathway animations for high-throughput data. Bioinformatics. 2008, 24: 588-590. 10.1093/bioinformatics/btm581.
- Reimand J, Tooming L, Peterson H, Adler P, Vilo J: GraphWeb: mining heterogeneous biological networks for gene modules with functional significance. Nucleic Acids Res. 2008, 36 (Web Server issue): W452-W459. 10.1093/nar/gkn230.
- Letunic I, Yamada T, Kanehisa M, Bork P: iPath: interactive exploration of biochemical pathways and networks. Trends Biochem Sci. 2008, 33: 101-103. 10.1016/j.tibs.2008.01.001.
- Okuda S, Yamada T, Hamajima M, Itoh M, Katayama T, Bork P, Goto S, Kanehisa M: KEGG Atlas mapping for global analysis of metabolic pathways. Nucleic Acids Res. 2008, 36: W423-W426. 10.1093/nar/gkn282.
- Rahnenfuhrer J, Domingues FS, Maydt J, Lengauer T: Calculating the statistical significance of changes in pathway activity from gene expression data. Stat Appl Genet Mol Biol. 2004, 3: Article16.
- Draghici S, Khatri P, Tarca AL, Amin K, Done A, Voichita C, Georgescu C, Romero R: A systems biology approach for pathway level analysis. Genome Res. 2007, 17: 1537-1545. 10.1101/gr.6202607.
- KEGG Spider. [http://mips.gsf.de/proj/keggspider]
- Wheeler DL, Barrett T, Benson DA, Bryant SH, Canese K, Chetvernin V, Church DM, DiCuccio M, Edgar R, Federhen S, Geer LY, Helmberg W, Kapustin Y, Kenton DL, Khovayko O, Lipman DJ, Madden TL, Maglott DR, Ostell J, Pruitt KD, Schuler GD, Schriml LM, Sequeira E, Sherry ST, Sirotkin K, Souvorov A, Starchenko G, Suzek TO, Tatusov R, Tatusova TA, Wagner L, Yaschenko E: Database resources of the National Center for Biotechnology Information. Nucleic Acids Res. 2006, 34: D173-D180. 10.1093/nar/gkj158.
- Wheeler DL, Barrett T, Benson DA, Bryant SH, Canese K, Chetvernin V, Church DM, DiCuccio M, Edgar R, Federhen S, Geer LY, Kapustin Y, Khovayko O, Landsman D, Lipman DJ, Madden TL, Maglott DR, Ostell J, Miller V, Pruitt KD, Schuler GD, Sequeira E, Sherry ST, Sirotkin K, Souvorov A, Starchenko G, Tatusov RL, Tatusova TA, Wagner L, Yaschenko E: Database resources of the National Center for Biotechnology Information. Nucleic Acids Res. 2007, 35: D5-12. 10.1093/nar/gkl1031.
- Birney E, Andrews D, Caccamo M, Chen Y, Clarke L, Coates G, Cox T, Cunningham F, Curwen V, Cutts T, Down T, Durbin R, Fernandez-Suarez XM, Flicek P, Graf S, Hammond M, Herrero J, Howe K, Iyer V, Jekosch K, Kahari A, Kasprzyk A, Keefe D, Kokocinski F, Kulesha E, London D, Longden I, Melsopp C, Meidl P, Overduin B, et al: Ensembl 2006. Nucleic Acids Res. 2006, 34: D556-D561. 10.1093/nar/gkj133.
- Pruitt KD, Tatusova T, Maglott DR: NCBI reference sequences (RefSeq): a curated non-redundant sequence database of genomes, transcripts and proteins. Nucleic Acids Res. 2007, 35: D61-D65. 10.1093/nar/gkl842.
- Liu G, Loraine AE, Shigeta R, Cline M, Cheng J, Valmeekam V, Sun S, Kulp D, Siani-Rose MA: NetAffx: Affymetrix probesets and annotations. Nucleic Acids Res. 2003, 31: 82-86. 10.1093/nar/gkg121.
- Hooper SD, Bork P: Medusa: a simple tool for interaction graph analysis. Bioinformatics. 2005, 21: 4432-4433. 10.1093/bioinformatics/bti696.
- Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, Amin N, Schwikowski B, Ideker T: Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res. 2003, 13: 2498-2504. 10.1101/gr.1239303.
- Carmona-Saez P, Chagoyen M, Tirado F, Carazo JM, Pascual-Montano A: GENECODIS: a web-based tool for finding significant concurrent annotations in gene lists. Genome Biol. 2007, 8: R3. 10.1186/gb-2007-8-1-r3.
- Jinawath N, Furukawa Y, Hasegawa S, Li M, Tsunoda T, Satoh S, Yamaguchi T, Imamura H, Inoue M, Shiozaki H, Nakamura Y: Comparison of gene-expression profiles between diffuse- and intestinal-type gastric cancers using a genome-wide cDNA microarray. Oncogene. 2004, 23: 6830-6844. 10.1038/sj.onc.1207886.
- Nabetani T, Tabuse Y, Tsugita A, Shoda J: Proteomic analysis of livers of patients with primary hepatolithiasis. Proteomics. 2005, 5: 1043-1061. 10.1002/pmic.200401039.
- Examples, KEGG spider. [http://mips.gsf.de/proj/keggspider/example.KEGG.html]
- Ingenuity Pathway Analysis Software. [http://www.ingenuity.com/products/pathways_analysis.html]
- Westfall PN, Young SS: Resampling-based Multiple Testing: Examples and Methods for p-Value Adjustment. 1993, New York: John Wiley & Sons.
- Craven RA, Stanley AJ, Hanrahan S, Dods J, Unwin R, Totty N, Harnden P, Eardley I, Selby PJ, Banks RE: Proteomic analysis of primary cell lines identifies protein changes present in renal cell carcinoma. Proteomics. 2006, 6: 2853-2864. 10.1002/pmic.200500549.
- Cussac D, Pichereaux C, Colomba A, Capilla F, Pont F, Gaits-Iacovoni F, Lamant L, Espinos E, Burlet-Schiltz O, Monsarrat B, Delsol G, Payrastre B: Proteomic analysis of anaplastic lymphoma cell lines: identification of potential tumour markers. Proteomics. 2006, 6: 3210-3222. 10.1002/pmic.200500647.
- Lu Z, Hu L, Evers S, Chen J, Shen Y: Differential expression profiling of human pancreatic adenocarcinoma and healthy pancreatic tissue. Proteomics. 2004, 4: 3975-3988. 10.1002/pmic.200300863.
- Morita A, Miyagi E, Yasumitsu H, Kawasaki H, Hirano H, Hirahara F: Proteomic search for potential diagnostic markers and therapeutic targets for ovarian clear cell adenocarcinoma. Proteomics. 2006, 6: 5880-5890. 10.1002/pmic.200500708.
- Okamura N, Masuda T, Gotoh A, Shirakawa T, Terao S, Kaneko N, Suganuma K, Watanabe M, Matsubara T, Seto R, Matsumoto J, Kawakami M, Yamamori M, Nakamura T, Yagami T, Sakaeda T, Fujisawa M, Nishimura O, Okumura K: Quantitative proteomic analysis to discover potential diagnostic markers and therapeutic targets in human renal cell carcinoma. Proteomics. 2008, 8: 3194-3203. 10.1002/pmic.200700619.
- Bianchi L, Canton C, Bini L, Orlandi R, Menard S, Armini A, Cattaneo M, Pallini V, Bernardi LR, Biunno I: Protein profile changes in the human breast cancer cell line MCF-7 in response to SEL1L gene induction. Proteomics. 2005, 5: 2433-2442. 10.1002/pmic.200401283.
- Shin JH, Gulesserian T, Verger E, Delabar JM, Lubec G: Protein dysregulation in mouse hippocampus polytransgenic for chromosome 21 structures in the Down syndrome critical region. J Proteome Res. 2006, 5: 44-53. 10.1021/pr050235f.
- Fillet M, Cren-Olive C, Renert AF, Piette J, Vandermoere F, Rolando C, Merville MP: Differential expression of proteins in response to ceramide-mediated stress signal in colon cancer cells by 2-D gel electrophoresis and MALDI-TOF-MS. J Proteome Res. 2005, 4: 870-880. 10.1021/pr050006t.
- Yu Y, Wang LS, Shen SM, Xia L, Zhang L, Zhu YS, Chen GQ: Subcellular proteome analysis of camptothecin analogue NSC606985-treated acute myeloid leukemic cells. J Proteome Res. 2007, 6: 3808-3818. 10.1021/pr0700100.
- Kang JH, Park KK, Lee IS, Magae J, Ando K, Kim CH, Chang YC: Proteome analysis of responses to ascochlorin in a human osteosarcoma cell line by 2-D gel electrophoresis and MALDI-TOF MS. J Proteome Res. 2006, 5: 2620-2631. 10.1021/pr060111i.
- Chen YR, Juan HF, Huang HC, Huang HH, Lee YJ, Liao MY, Tseng CW, Lin LL, Chen JY, Wang MJ, Chen JH, Chen YJ: Quantitative proteomic and genomic profiling reveals metastasis-related protein expression patterns in gastric cancer cells. J Proteome Res. 2006, 5: 2727-2742. 10.1021/pr060212g.
- Gonzalez-Santiago L, Alfonso P, Suarez Y, Nunez A, Garcia-Fernandez LF, Alvarez E, Munoz A, Casal JI: Proteomic analysis of the resistance to aplidin in human cancer cells. J Proteome Res. 2007, 6: 1286-1294. 10.1021/pr060430+.
- Melle C, Ernst G, Scheibner O, Kaufmann R, Schimmel B, Bleul A, Settmacher U, Hommann M, Claussen U, von EF: Identification of specific protein markers in microdissected hepatocellular carcinoma. J Proteome Res. 2007, 6: 306-315. 10.1021/pr060439b.
- Santos AD, Demaugre F: Contribution of laser microdissection-based technology to proteomic analysis in hepatocellular carcinoma developing on cirrhosis. Proteomics Clin Appl. 2007, 1: 545-554. 10.1002/prca.200600474.
- Hoelzl C, Lorenz O, Haudek V, Gundacker N, Knasmüller S, Gerner C: Proteome alterations induced in human white blood cells by consumption of Brussels sprouts: Results of a pilot intervention study. Proteomics Clin Appl. 2008, 108-117. 10.1002/prca.200780100.
- KEGG Atlas. [http://www.genome.jp/kegg/atlas]
This article is published under license to BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.