RedeR: R/Bioconductor package for representing modular structures, nested networks and multiple levels of hierarchical associations
© Castro et al.; licensee BioMed Central Ltd. 2012
Received: 6 February 2012
Accepted: 24 April 2012
Published: 24 April 2012
Visualization and analysis of molecular networks are both central to systems biology. However, there still exists a large technological gap between them, especially when assessing multiple network levels or hierarchies. Here we present RedeR, an R/Bioconductor package combined with a Java core engine for representing modular networks. The functionality of RedeR is demonstrated in two different scenarios: hierarchical and modular organization in gene co-expression networks and nested structures in time-course gene expression subnetworks. Our results demonstrate RedeR as a new framework to deal with the multiple network levels that are inherent to complex biological systems. RedeR is available from http://bioconductor.org/packages/release/bioc/html/RedeR.html.
Biological networks contain modules of genes or proteins that may function in the same pathway. As genes or proteins inside a module can be co-regulated, they are often represented by one single node in the network. Such modules can be inferred by a number of statistical methods and the results are usually represented in graphs [3, 4]. Given the complex associations that can take place in these graphs, it is a challenge to infer and visualize multiple levels or hierarchies within and between subnetwork structures.
Popular software like Cytoscape provides a general framework to deal with part of this complexity by providing software plugins and visualizing networks in flat topologies. Flat networks are largely adequate to deal with different graph elements, as long as the number of network levels stays small. However, when describing and defining functional modules, a hierarchical data structure is more appropriate because it enables the construction of graph elements within modules in a scalable system (for example, chains of nested networks). Herein we present RedeR, an R package combined with a Java core engine to cope with hierarchical and nested network structures.
RedeR is designed to deal with three key challenges in network analysis. Firstly, biological networks are modular and hierarchical, so network visualization needs to take advantage of such structural features to avoid cluttered and uninformative 'hairballs'. Secondly, network analysis relies on statistical methods, many of which are already available in resources like CRAN or Bioconductor. However, the missing link between advanced visualization and statistical computing makes it hard to take full advantage of R packages for network analysis. Thirdly, in larger networks user input is needed to focus the view of the network on the biologically relevant parts, rather than relying on an automatic layout function.
RedeR is designed to address these challenges: (i) we implement modular objects for subnetworks that make it easy to lay out and analyze network modules and their connections; (ii) the software is tightly integrated with R - while RedeR visualizes R outputs, its results can be directly fed back into R for further statistical analyses, which makes the power of R available to users primarily interested in visualization rather than statistical computing; and (iii) we implement a dynamic layout that directly reflects user input.
We exemplify RedeR's visualization and analysis capabilities in a case study based on the re-analysis of gene expression and chromatin immunoprecipitation (ChIP)-on-chip data from an estrogen receptor (ER) study in the MCF-7 breast cancer cell line. We anticipate that RedeR will be useful for integrative analyses and deriving gene expression networks that demand complex data abstraction and multiple network levels.
Overview of the software
RedeR is distributed as an R/Bioconductor package. It is implemented with S4 classes in R combined with a Java graphical user interface. Standard Java Swing components and the NetBeans IDE 6.9 development environment were extensively used to implement the graphical interface, which operates in conjunction with R libraries. In what follows we describe the implementation of the main features of the software.
User-friendly interface in R
# Load the package, set the server port and invoke the Java app
> library(RedeR)
> rdp <- RedPort()
> calld(rdp)
This method sets the environment and all paths required to start the callback engine, after which the software can either interact with R or run as a stand-alone application. The graphic interface is extensively controllable from the R command line and provides several menus that allow basic actions, such as selecting nodes and changing their appearance. In order to maintain a high level of compatibility, all methods in the R interface use igraph objects as prototype data format.
Unique data structure for hierarchical networks
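# Generate an igraph object (a toy example with modular structures)
> g <- gtoy.rm(m = 5, nmax = 30)
# Compute a hierarchical clustering using standard R functions
# (clustering on the adjacency matrix is assumed here)
> hc <- hclust(dist(as.matrix(get.adjacency(g))))
# Add graph to RedeR
> addGraph(rdp, g)
# Map the hierarchy onto the network as nested containers
> nesthc(rdp, hc, metric="rootdist", cutlevel = 3, nlev = 1)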
This toy example maps one level of the hierarchy onto the network topology. Additional levels and different sections of the hierarchy can be mapped using the same function (for further details, please see 'nesthc' documentation in the R package).
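The idea of mapping one level of a hierarchy onto groups of nodes can be illustrated without the Java application. The following is a minimal sketch in base R only; the toy adjacency matrix and the use of 'cutree' are assumptions made for illustration and do not reproduce what 'nesthc' does internally.

```r
# Toy adjacency matrix with two dense modules (hypothetical data).
ids <- paste0("n", 1:6)
adj <- matrix(0, 6, 6, dimnames = list(ids, ids))
adj[1:3, 1:3] <- 1   # module A
adj[4:6, 4:6] <- 1   # module B
diag(adj) <- 0

# Hierarchical clustering on distances between adjacency profiles.
hc <- hclust(dist(adj))

# Cut the tree at one level; each cluster is one candidate nest.
membership <- cutree(hc, k = 2)
nests <- split(names(membership), membership)
print(nests)
```

Cutting the same tree at a larger 'k' would give a finer level of the hierarchy, which corresponds to mapping additional levels onto the network.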
Dynamic layout modeling
One of the most versatile features of the software is the ability to deal with nested network objects using dynamic modeling, which makes it possible to represent, for example, subnetworks and time-series in the same graph in a user-friendly routine. The layout uses force-directed algorithms as described elsewhere [13, 14]. Here we adapted the method to deal with nested networks. In force-directed graphs, each edge can be regarded as a spring - with a given target length - that can exert either a repulsive or an attractive force on the connected nodes, while nodes are analogous to mutually repulsive charged particles that move according to the applied forces. In RedeR, the layout is additionally constrained by the hierarchical structure. For example, a nested node is constrained to its parent node by opposing forces applied by the nest, which is regarded as a special node whose nested objects can reach a local equilibrium independently from other network levels. The layout is adjusted by global options and evolves iteratively (and interactively) until the system reaches the equilibrium state. It can be started via either the graphical user interface or the R command line, as for example:
# Start dynamic layout
> relax(rdp, ps=TRUE)
# Reset graph
> resetd(rdp)
The user can observe all steps of the layout optimization process and, at any particular time, the process can be driven interactively. In this sense, 'dynamic' refers not only to the iteration steps required to lay out a graph by the force-directed algorithm but also to the user's interaction. This option is particularly useful for additional control over containers and nested nodes in hierarchical structures. We also added to the Java core some popular static layout algorithms from open source libraries as a complementary option to the list of all static layouts that can either be found in the R package collections or, as usual, customized in R by the user.
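The spring-and-charge model with a nest constraint can be sketched in a few lines of base R. This is an illustrative simplification under stated assumptions, not the algorithm implemented in the Java core: the force constants, the circular nest, the step cap and the projection back onto the nest boundary are all hypothetical choices.

```r
set.seed(1)

# Toy graph: four nodes in a ring (hypothetical example).
edges <- rbind(c(1, 2), c(2, 3), c(3, 4), c(4, 1))
n <- 4
pos <- matrix(runif(n * 2, -1, 1), ncol = 2)

# Assumed parameters: a circular nest and simple force constants.
nest_centre <- c(0, 0)
nest_radius <- 2
target_len  <- 1     # spring rest length
k_spring    <- 0.1   # spring stiffness
k_repel     <- 0.05  # repulsion strength
max_step    <- 0.1   # cap on the move per iteration

relax_step <- function(pos) {
  disp <- matrix(0, nrow(pos), 2)
  # Pairwise repulsion: nodes behave like mutually repulsive charges.
  for (i in seq_len(n)) {
    for (j in seq_len(n)) {
      if (i != j) {
        d <- pos[i, ] - pos[j, ]
        r <- max(sqrt(sum(d^2)), 1e-6)
        disp[i, ] <- disp[i, ] + k_repel * d / r^3
      }
    }
  }
  # Springs along edges: attract when stretched, repel when compressed.
  for (e in seq_len(nrow(edges))) {
    i <- edges[e, 1]
    j <- edges[e, 2]
    d <- pos[j, ] - pos[i, ]
    r <- max(sqrt(sum(d^2)), 1e-6)
    f <- k_spring * (r - target_len) * d / r
    disp[i, ] <- disp[i, ] + f
    disp[j, ] <- disp[j, ] - f
  }
  # Limit each move so the iteration stays stable.
  step <- sqrt(rowSums(disp^2))
  disp <- disp * pmin(1, max_step / pmax(step, 1e-12))
  new_pos <- pos + disp
  # Nest constraint: project any escaped node back onto the boundary.
  for (i in seq_len(n)) {
    off <- new_pos[i, ] - nest_centre
    r <- sqrt(sum(off^2))
    if (r > nest_radius) new_pos[i, ] <- nest_centre + off * nest_radius / r
  }
  new_pos
}

# Iterate towards equilibrium.
for (iter in 1:500) pos <- relax_step(pos)
print(round(pos, 3))
```

In an interactive setting the loop would be interrupted by user input, for example by moving a node and letting the remaining nodes relax again from the new positions.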
R code deployment
R developers can deploy R code to RedeR by using the 'PluginBuilder' method. This feature provides a direct way to extend existing R packages to the Java interface. The combination of R and Java code in a mark-up construct gives rise to this extensible feature. The idea is based on the successful framework used by the Sweave package, which mixes LaTeX syntax and R code in order to parse R text chunks within LaTeX documents. In RedeR, the plugins are exported to the Java core by the 'submitPlugin' function. On the other side of the interface, the software receives the request, stores the new method in an XML document and mounts the plugin in the application, including submenus in the main panel. RedeR plugins have two main sections: methods and add-ons. The 'methods' section can be regarded as the plugin trigger. When installed in the Java app, this trigger starts a given analysis by unfolding R expressions wrapped in the methods. Add-ons use the same strategy, but remain hidden in the app and can either load formal functions or pass additional arguments to R (a code sample is provided in the RedeR vignette, plugin builder tutorial).
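The general principle of wrapping R expressions in a mark-up construct and unfolding them on request can be sketched as follows. The chunk delimiters are borrowed from Sweave, and the chunk names 'method' and 'addon' as well as the helper 'extract_chunks' are hypothetical; this does not show the actual format produced by 'PluginBuilder' or consumed by 'submitPlugin'.

```r
# Hypothetical mark-up: Sweave-style chunk delimiters around R expressions.
plugin_text <- c(
  "<<method>>=",
  "deg <- rowSums(adj)",
  "@",
  "<<addon>>=",
  "adj <- matrix(c(0, 1, 1, 1, 0, 0, 1, 0, 0), 3, 3)",
  "@"
)

# Split the text into named chunks of R code.
extract_chunks <- function(lines) {
  starts <- grep("^<<.*>>=$", lines)
  ends <- grep("^@$", lines)
  stopifnot(length(starts) == length(ends), all(starts < ends))
  chunks <- lapply(seq_along(starts), function(i) {
    if (ends[i] - starts[i] < 2) {
      character(0)
    } else {
      lines[(starts[i] + 1):(ends[i] - 1)]
    }
  })
  names(chunks) <- sub("^<<(.*)>>=$", "\\1", lines[starts])
  chunks
}

chunks <- extract_chunks(plugin_text)

# Evaluate the add-on first (definitions), then the method (the trigger),
# inside a private environment so the global workspace is untouched.
env <- new.env()
eval(parse(text = chunks$addon), envir = env)
eval(parse(text = chunks$method), envir = env)
print(get("deg", envir = env))
```

Keeping the add-on separate from the method mirrors the division described above: the add-on supplies supporting objects, while the method holds the expressions that start the analysis.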
Pre-processed data and documentation
The pre-processed data used in the case study were obtained by the LIMMA package. An R script that reproduces the analysis is available in the supplements. Additionally, the R package provides extensive documentation for all methods available in the software, including description of the data objects, examples, and a tutorial introducing the main functionalities.
In this section we demonstrate some essential features of the software in two examples based on the re-analysis of ChIP-on-chip and gene expression data from a genome-wide study describing ER binding sites in the MCF-7 breast cancer cell line. The ChIP-on-chip dataset consists of a Bed file containing the genome position of 3,665 unique ER binding sites, while the gene expression data consist of 12 time-course Affymetrix U133Plus2.0 microarrays from MCF-7 cells stimulated with estrogen for 0, 3, 6 and 12 h (all arrays in triplicate).
The purpose of the study by Carroll et al. is the identification of new authentic cis ER binding sites and ER target genes in breast cancer cells. One of the challenges faced by the authors was that only a small fraction (4%) of the ER binding sites mapped to promoter-proximal regions, within 1 kb of the transcription start sites. More frequently, ER binding sites are found at considerable distance from the regulated gene, and only one-third of early estrogen up-regulated genes contain ER binding sites within 50 kb of the transcription start site. This finding has made it difficult to validate ER-regulated candidate genes, both because there may be multiple genes within the 100 kb interval of the ER binding site and because the usual association of transcription factor binding sites and promoter regions occurs in only a minority of cases.
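The distance windows mentioned above can be made concrete with a small base R sketch. All coordinates and names below are invented toy values on a single chromosome, and the 100 kb interval is read here as 50 kb on either side of a binding site, which is an assumption.

```r
# Toy coordinates on one chromosome (hypothetical values, in base pairs).
tss   <- c(geneA = 100000, geneB = 400000, geneC = 900000)
sites <- c(site1 = 100600, site2 = 130000, site3 = 620000)

# For each gene, distance from its TSS to the nearest binding site.
nearest_dist <- sapply(tss, function(p) min(abs(sites - p)))

# Classify genes by window: promoter-proximal (1 kb) or within 50 kb.
within_1kb  <- nearest_dist <= 1000
within_50kb <- nearest_dist <= 50000
print(data.frame(nearest_dist, within_1kb, within_50kb))

# Number of genes inside the 100 kb interval centred on each site,
# that is, within 50 kb on either side.
genes_per_site <- sapply(sites, function(s) sum(abs(tss - s) <= 50000))
print(genes_per_site)
```

With real data, a site whose interval contains several genes is ambiguous on position alone, which is where the expression data help to narrow down the candidates.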
Hierarchical and modular organization in gene co-expression networks
Taken together, this case study not only illustrates how to constrain the network topology by a hierarchical structure, but also raises an interesting biological observation. The identification of co-regulated gene modules is one of the key steps towards understanding genetic regulatory networks. However, similar patterns in gene expression modules are not directly associated with a common mechanism of gene regulation. The identification of co-regulated modules is far from trivial and this case study provides a simple workflow to inspect in detail potentially co-expressed gene modules that share binding sites for the same transcription factor. The software permits visualizing these individual gene modules, displaying each individual component and the connections between them, as well as the hierarchical associations between modules and genes.
Nested structures in time-course gene expression subnetworks
RedeR in the context of gold-standard software in the same field
RedeR in the context of gold-standard network visualization software and R
Entries recovered from the comparison table: Hierarchical data structure; R <-> Java; R -> Java; R <- DOT; R <- C; Deployment to R; Plugin coding language; DOT language; Scalability on nested networks; Interactive graph handling; Comparison across multiple nested networks.
Another option is the package RCytoscape. This R package implements, via CytoscapeRPC, an interface to Cytoscape, which can be regarded as gold-standard software for network visualization. Although robust and easy to use, Cytoscape is designed mainly to deal with flat network topology, which does not accommodate increasing amounts of nested objects. For example, using flat topology to represent a chain of nested networks, the number of graphs would increase proportionally with the network levels. Using RedeR, the job can be performed in just one graph (Figure 2b, data structure section). In this sense, RedeR constitutes a new option to assess networks with multiple levels or hierarchies, which is a surprisingly common situation in biological networks.
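One way to picture a chain of nested networks held in a single structure is a vector of parent pointers, sketched below in base R. The object names and the parent-pointer representation are assumptions for illustration only and are not RedeR's internal data structure.

```r
# Each object records its enclosing container (NA = top level).
# Names are hypothetical: N1..N3 are nests, geneX..geneZ are nodes.
parent <- c(N1 = NA, N2 = "N1", N3 = "N2",
            geneX = "N3", geneY = "N3", geneZ = "N1")

# Depth of an object = number of containers that enclose it.
depth <- function(x) {
  d <- 0
  while (!is.na(parent[[x]])) {
    x <- parent[[x]]
    d <- d + 1
  }
  d
}

depths <- sapply(names(parent), depth)
print(depths)

# A flat representation would need one graph per level;
# here all levels live in the same object.
print(max(depths) + 1)
```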
In this work we introduced RedeR, software designed for the representation of nested and hierarchical biological networks. The ability to perform advanced visualization tightly integrated with R allows RedeR to take full advantage of R packages for network analysis and statistical computing. Likewise, RedeR is an ongoing project that provides a comprehensive and entirely new framework to read, write and manipulate R code mixed with a Java data structure. Its architecture allows the creation of R-based plugins with minimum effort, potentially extending the existing R packages to different communities of users interested in studying biological networks.
Rather than analyzing a single network, current research focuses on differences in networks, re-wiring events, as well as higher-level, modular characteristics of networks. These can be hard to visualize in standard tools. RedeR implements a framework for network comparison and module representation by introducing a hierarchy of 'containers' in which many networks and their connections can be visualized at the same time. We anticipate that our software will be particularly useful to assess datasets that demand detailed analysis of inter- and intra-modular associations.
R (version>=2.14) and Java Runtime Environment (version>=5). Available since Bioconductor 2.9.
DAG: directed acyclic graph
We thank Professor Sir Bruce Ponder for his support. We also thank all FM lab members that kindly contributed with suggestions during the development of the R package. We acknowledge the support of The University of Cambridge, Cancer Research UK and Hutchison Whampoa Limited. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
- Luo F, Yang Y, Chen CF, Chang R, Zhou J, Scheuermann RH: Modular organization of protein interaction networks. Bioinformatics. 2007, 23: 207-214. 10.1093/bioinformatics/btl562.
- Han JD: Understanding biological functions through molecular networks. Cell Res. 2008, 18: 224-237. 10.1038/cr.2008.16.
- Aittokallio T, Schwikowski B: Graph-based methods for analysing networks in cell biology. Brief Bioinform. 2006, 7: 243-255. 10.1093/bib/bbl022.
- Barabasi AL, Gulbahce N, Loscalzo J: Network medicine: a network-based approach to human disease. Nat Rev Genet. 2011, 12: 56-68. 10.1038/nrg2918.
- Smoot ME, Ono K, Ruscheinski J, Wang PL, Ideker T: Cytoscape 2.8: new features for data integration and network visualization. Bioinformatics. 2011, 27: 431-432. 10.1093/bioinformatics/btq675.
- Carroll JS, Meyer CA, Song J, Li W, Geistlinger TR, Eeckhoute J, Brodsky AS, Keeton EK, Fertuck KC, Hall GF, Wang Q, Bekiranov S, Sementchenko V, Fox EA, Silver PA, Gingeras TR, Liu XS, Brown M: Genome-wide analysis of estrogen receptor binding sites. Nat Genet. 2006, 38: 1289-1297. 10.1038/ng1901.
- R Development Core Team: R: A Language and Environment for Statistical Computing. 2011, Vienna: R Foundation for Statistical Computing
- NetBeans IDE 6.9 Development Environment. [http://netbeans.org/]
- Apache xmlrpc Webserver. [http://ws.apache.org/xmlrpc/]
- JRI Library Interface. [http://www.rforge.net/JRI/]
- Lang DT: XMLRPC: Remote Procedure Call (RPC) via XML in R. [http://www.omegahat.org/XMLRPC/]
- Urbanek S: rJava: Low-level R to Java interface. [http://www.rforge.net/rJava/]
- Brandes U: Drawing graphs: methods and models. Lecture Notes in Computer Science. Edited by: Kaufmann M, Wagner D. 2001, Heidelberg: Springer, 2025: 71-86. 10.1007/3-540-44969-8_4.
- Fruchterman TMJ, Reingold EM: Graph drawing by force-directed placement. Software Practice Experience. 1991, 21: 1129-1164. 10.1002/spe.4380211102.
- Java Universal Network/Graph Framework. [http://sourceforge.net/projects/jung/]
- Smyth GK: Linear models and empirical bayes methods for assessing differential expression in microarray experiments. Stat Appl Genet Mol Biol. 2004, 3: Article3.
- Prasad TS, Kandasamy K, Pandey A: Human Protein Reference Database and Human Proteinpedia as discovery tools for systems biology. Methods Mol Biol. 2009, 577: 67-79. 10.1007/978-1-60761-232-2_6.
- Le Meur N, Gentleman R: Analyzing biological data using R: methods for graphs and networks. Methods Mol Biol. 2012, 804: 343-373. 10.1007/978-1-61779-361-5_19.
- Csardi G, Nepusz T: The igraph software package for complex network research. R package version 0.5.5-2. [http://cran.r-project.org/web/packages/igraph/index.html]
- Gentleman R, Whalen E, Huber W, Falcon S: graph: a package to handle graph data structures. R package version 1.30.30. [http://bioconductor.org/packages/release/bioc/html/graph.html]
- Gentry J, Long L, Gentleman R, Falcon S, Hahne F, Sarkar D, Hansen K: Rgraphviz: Provides plotting capabilities for R graph objects. R package version 1.30.31. [http://bioconductor.org/packages/release/bioc/html/Rgraphviz.html]
- Shannon P: RCytoscape. R package version 1.3.0. [http://bioconductor.org/packages/release/bioc/html/RCytoscape.html]
- Bot JJ, Reinders MJ: CytoscapeRPC: a plugin to create, modify and query Cytoscape networks from scripting languages. Bioinformatics. 2011, 27: 2451-2452. 10.1093/bioinformatics/btr388.
- Norel R, Rice JJ, Stolovitzky G: The self-assessment trap: can we all be better than average?. Mol Syst Biol. 2011, 7: 537.
This article is published under license to BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.