The RBPome: where the brains meet the brawn
© BioMed Central Ltd 2014
Published: 31 January 2014
Recent celebrations of the double helix, on the 60th anniversary of its discovery, heralded DNA as the key to life: the universal set of instructions for all organisms, the ubiquitous stuff of inheritance, without which our planet would be sterile. Although Genome Biology was among the culprits of this DNA-centric effervescence [1, 2], our focus on DNA within the context of life as we know it belied the widely supported hypothesis that DNA has not always been integral to life, and that life’s earliest forms existed in a pre-DNA world. Francis Crick, co-discoverer of DNA’s double helical structure, devised a schema - with which you will undoubtedly be familiar - known as the Central Dogma, in which information flows from DNA to RNA to protein (but see  for an updated version). Of these three molecules, it is very likely that DNA is the new kid on the block, a crouton freshly added to the primordial soup. By contrast, the 'RNA world’ hypothesis holds that RNA is the oldest extant genetically encoded macromolecule, and that it coexisted with proteins - and, prior to that, amino acids and peptides - before DNA arrived on the scene.
And so there we have it: RNA and proteins are old friends, who have learned to live intertwined with one another for longer than either has with DNA. By the nature of evolution, the co-dependency of these two molecules at the very early stages of life means that many of their interactions have become embedded in the fabric of the cell’s most critical processes, inherited by the DNA world from its RNA predecessor. Famous examples of course include protein translation, through the ribosome, and RNA splicing, through the spliceosome (but see  for a dissenting, or at least more nuanced, view).
For such old friends, however, the interface where RNA and RNA-binding proteins (RBPs) meet - 'the RBPome’ - remains remarkably unexplored, even though the more we search for RNA–protein interactions, the more we find them. This thematic issue of Genome Biology, dedicated to the RBPome, humbly sets out to take a small step toward bringing this neglected component of Crick’s Central Dogma out into the light. We are assisted in this endeavor by Jernej Ule (University College London, UK) and John Rinn (Broad Institute, USA), the RBPome issue’s talented, dedicated and altogether delightful Guest Editors.
The neglect of the RBPome is not so much due to the disfavor of scientists as it is due to the paucity of methods available for studying RNA–protein interactions. However, a number of assays have been developed in recent years that are beginning to make the study of the RBPome more attractive, as reviewed in this issue by Mitchell Guttman and colleagues .
Guest Editor Jernej Ule has led the way in RBPome assay development with CLIP (sometimes known as HITS-CLIP when combined with high-throughput sequencing) and iCLIP (compared in ), which are immunoprecipitation-based methods for transcriptome-wide profiling of RNA-binding sites for a given RNA-binding protein. Another popular, and related, method is PAR-CLIP , published by the groups of Mihaela Zavolan and Thomas Tuschl, who both contribute to this issue (  and [9, 10], respectively).
The emergence of these high-throughput RBPome-mapping methods awaits a bioinformatics catch-up, with few tools available for data analysis. In this issue, three new user-friendly CLIP bioinformatics methods are described: the comparative tool dCLIP , and the two Galaxy-based tools pyCRAC  and PIPE-CLIP . One of the most popular existing tools for PAR-CLIP data analysis is PARalyzer , from the groups of Jack Keene and Uwe Ohler, who both contribute to this issue (  and , respectively).
An important but specialized application of CLIP and related methods is the study of microRNA (miRNA) binding sites, through the immunoprecipitation of Argonaute 2, a protein that forms complexes with miRNAs and their RNA targets. For example, such an approach is used in this issue to guide the examination of miRNA targets in breast cancer subtypes . The use of RBPome methods to map small RNA binding to the transcriptome is reviewed by Mihaela Zavolan and Nitish Mittal .
As immunoprecipitation-based assays, the CLIP suite of methods might be expected to suffer from the technical limitation of background binding. A study from Matthew Friedersdorf and Jack Keene clearly illustrates the potential for a background problem in PAR-CLIP data, but also shows how correcting for it can enhance the power of the assay to accurately map the RBPome .
A second approach for studying the RBPome looks at the global RBP footprint on the transcriptome; that is, the totality of RNA sequences bound by RBPs, rather than the binding profile of a single RBP. This strategy was pioneered in mammalian cells by the groups of Markus Landthaler, Christoph Dieterich, Matthias Hentze and Jeroen Krijgsveld ( [16, 17]; reviewed in ), and was explored last year in Genome Biology in a yeast setting . In this issue, Landthaler and Dieterich extend their previous study by assaying a second mammalian cell line and developing a bioinformatics approach, POPPI, for analyzing these datasets and the differences between them . In addition, Brian Gregory and Guest Editor John Rinn describe a new method, PIP-seq, that uses RNase digestion to globally profile the RBP footprint and that has the power to identify RBP-binding motifs .
Although all RNAs are relevant to the RBPome, the interaction of RBPs with mRNAs is of most interest for the large portion of the genomics community concerned with mechanisms of gene regulation. Armed with improved methods, many researchers are now trying to understand more about exactly how these binding events enable RBPs to regulate the mRNA component of the transcriptome. The many mechanisms by which they do so include promoting or repressing splicing, stabilizing or degrading mRNA molecules, and editing or modifying individual RNA bases. In this way, the muscle of the proteome bashes into shape the intelligentsia of the cell (in the form of the transcriptome). One might say that the mRNA RBPome is very much where the RNA brain meets the RBP brawn.
Smaug is a Drosophila RBP that was previously known to regulate two target RNAs through two different mechanisms: in one case it promoted destabilization, while in the other it repressed translation. In this issue, Craig Smibert, Howard Lipshitz and colleagues use RIP-chip to profile transcriptome-wide Smaug binding targets in Drosophila early embryos, and perform experiments to identify mRNAs whose translation is repressed or whose stability is reduced by Smaug . They show that Smaug employs a two-pronged approach for many of its targets, by using both its mechanisms to achieve downregulation. Smaug binds a restricted set of mRNAs, with genes targeted including those functioning in the control of protein folding and degradation, lipid droplets and metabolism. Elmar Wahle and Michael Götze’s Research Highlight discusses the study in more detail .
An instructive example of the ancient nature of the RBPome is the protein family made up of the Sm and Sm-like RBPs, which are conserved throughout all three domains of life. These proteins form small nuclear ribonucleoproteins (snRNPs) best known for their function in RNA splicing. In this issue, Gregory Matera and colleagues profile the RNAs bound to a number of Sm and Sm-like proteins in two very different eukaryotic settings: Drosophila ovaries and HeLa cells . The detection of mature mRNAs in RIP-seq experiments suggests splicing-independent functions for these snRNPs.
ZFP36 is an RBP known to interact with AU-rich sequences and to have an antagonistic function to the better characterized transcript-stabilizing RBP ELAVL1. In this issue, Neelanjan Mukherjee, Uwe Ohler and colleagues use PAR-CLIP and overexpression experiments to further probe the role of ZFP36 in negatively regulating the transcriptome, finding that among its targets are transcripts related to immune function and cancer, as well as those encoding other RBPs . A comparison with ELAVL1 data makes clear the large number of overlapping binding sites between the two RBPs, and points toward the similar but non-identical motif preferences underlying their partial overlap.
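To make the idea of overlapping binding sites concrete, the following is a minimal sketch of how the shared fraction of sites between two RBPs might be computed from CLIP-style intervals. It is not the analysis used in the study. The tuple layout `(chromosome, start, end)`, the half-open coordinates and the function names `overlaps` and `shared_fraction` are all assumptions made for illustration, and strand is ignored for brevity.

```python
def overlaps(a, b):
    """Return True if two (chrom, start, end) half-open intervals overlap."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def shared_fraction(sites_a, sites_b):
    """Fraction of sites in sites_a that overlap at least one site in sites_b.

    Simple all-against-all comparison; adequate for small illustrative
    inputs, whereas real datasets would call for an interval index.
    """
    if not sites_a:
        return 0.0
    hits = sum(any(overlaps(a, b) for b in sites_b) for a in sites_a)
    return hits / len(sites_a)


# Hypothetical binding sites for two RBPs (coordinates are invented).
rbp_one = [("chr1", 100, 120), ("chr1", 500, 520), ("chr2", 100, 120)]
rbp_two = [("chr1", 110, 130), ("chr2", 300, 320)]

print(shared_fraction(rbp_one, rbp_two))
```

Note that the measure is asymmetric: the fraction of the first RBP's sites shared with the second generally differs from the reverse, which is one way a partial overlap between two RBPs can show up in the data.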
Rather than directly probing binding sites, RBPs that regulate splicing can instead be investigated through experiments that perturb or enhance their function. In this issue, such an approach is used both to study hnRNPLL in T lymphocytes  and the Sm-like protein LSm5 in salt-stressed Arabidopsis. Both RBPs are found to be linked to the particular form of splicing known as intron retention, but in opposite directions.
Whereas alternative splicing is popularly thought to amount to either a different choice in exons or a shift in exon boundaries, more and more reports are emerging of alternative splicing events where a transcript’s exons are unchanged but an intron is retained. The purpose of intron retention is not altogether clear - in fact, whether or not it is even a deliberate regulatory process, rather than leaky RNA metabolism, is itself a matter of debate. For this reason, studies providing additional data on how RBPs relate to intron retention are very welcome.
Depletion of LSm5 in Arabidopsis, for example, increases intron retention, suggesting that LSm5 functions to promote splicing fidelity, and that intron retention in these instances is an unwanted result of cellular error . However, another plant study in this issue, which examines light-responsive global splicing changes in the moss Physcomitrella patens - the first report to our knowledge of light-regulated splicing in plants - contains results consistent with intentional intron retention, in that these events are restricted to transcripts with a narrow set of functions, including light signaling and splicing . Similarly, and contrary to the effect of LSm5, the hnRNPLL study finds that this developmentally regulated RBP actually induces intron retention, and that it does so in specific transcripts . Moreover, the specificity of intron retention by hnRNPLL is not limited to target transcripts, but is also apparent in the location of the affected introns within these transcripts, with a preference for those flanking alternative exons.
Bioinformatics methods for studying the RBPome, including those described in this issue, are not limited to tools designed for the analysis of CLIP data. Gene expression and other omics data can be exploited to make indirect inferences about RBPs, perhaps with even greater functional clarity than CLIP analysis, as can be seen in a number of research studies presented in this issue [10, 22, 25–29]. Two methods for making such inferences about RBPs are also published in the issue, and both relate to splice site analysis.
Matteo Cereda, Guest Editor Jernej Ule and colleagues present RNAmotifs, a bioinformatics method that identifies alternative splicing-related tetramer motifs from gene expression data . At the heart of RNAmotifs is the principle that the binding of splice-regulatory RBPs to RNA is position dependent, which enables maps of enriched motifs to be generated where data are available for an RBP’s differentially regulated transcriptome. Applying RNAmotifs maps to a number of RBPs shows that the positions at which they operate within transcripts tend to be very similar, but that the effect on splicing varies according to the RBP.
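The position-dependent principle can be illustrated with a small sketch that counts tetramers within a fixed window of each sequence and compares regulated with control sets. This is a simplified stand-in and not the RNAmotifs algorithm itself: the window coordinates, the pseudocount, the simple frequency ratio and the function names `tetramer_counts` and `enrichment` are assumptions chosen for illustration, and no statistical testing is performed.

```python
from collections import Counter


def tetramer_counts(seqs, start, end):
    """Count, for each tetramer, how many sequences contain it in [start, end)."""
    counts = Counter()
    for seq in seqs:
        window = seq[start:end]
        # Use a set so each tetramer is counted at most once per sequence.
        seen = {window[i:i + 4] for i in range(len(window) - 3)}
        counts.update(seen)
    return counts


def enrichment(regulated, control, start, end, pseudo=1.0):
    """Ratio of tetramer frequency in regulated versus control sequences."""
    reg = tetramer_counts(regulated, start, end)
    ctl = tetramer_counts(control, start, end)
    ratios = {}
    for tetramer in set(reg) | set(ctl):
        f_reg = (reg[tetramer] + pseudo) / (len(regulated) + pseudo)
        f_ctl = (ctl[tetramer] + pseudo) / (len(control) + pseudo)
        ratios[tetramer] = f_reg / f_ctl
    return ratios


# Invented sequences standing in for a region near regulated and control exons.
regulated = ["UGCAUGAAAA", "CCUGCAUGCC"]
control = ["AAAAAAAAAA", "CCCCCCCCCC"]

ratios = enrichment(regulated, control, 0, 10)
top = sorted(ratios, key=ratios.get, reverse=True)[:3]
print(top)
```

Repeating the calculation over successive windows relative to a splice site would give a crude positional map of where each tetramer is enriched.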
As reviewed by Jeremy Sanford and Timothy Sterne-Weiler , aberrant splicing has been linked to a number of human diseases. Sanford contributes to a Software article in this issue, from Matthew Mort and Sean Mooney, that describes MutPred Splice, a machine-learning tool for predicting those disease-associated variants that are likely to disrupt splicing . Analyzing a dataset of disease-associated exonic variants with MutPred Splice suggests that different types of splicing defects are more common in inherited disease and cancer, respectively.
In parallel to the progress made in methods for experimental mapping of RNA-protein interactions, the development of bioinformatics tools has improved the in silico prediction of these interactions. One such tool, catRAPID , is deployed in this issue by Gian Gaetano Tartaglia and colleagues in an effort to relate RBPome interactions to gene expression. Using knockdown data for two RBPs, this approach finds that the propensity for a target RNA and an RBP to interact is predictive of the strength of target RNA expression change upon RBP depletion . Discussed in Bojan Zagrovic’s Research Highlight  is the subsequent global analysis, which uses RBPome-wide data to show that correlation with target RNA expression is a general trend for RBPs, and that this correlation can be either positive or negative. Target transcripts with positive correlations are associated with different cellular functions to those with negative correlations.
The abundance of those mRNAs encoding RBPs themselves is the focus of a study from the Janga laboratory , who find that these transcripts are more likely to be highly expressed in cancers than are other classes of genes. A network analysis considers what the consequences of this high expression might be for the RBPome.
In addition to regulating splicing and mRNA abundance, RBPs can also modulate the transcriptome through editing and modifying bases. The epitranscriptome is not addressed in this issue, but has been elsewhere in Genome Biology [3, 35, 36]. However, RNA editing - a function of the ADAR family of RBPs  - is the subject of a comparative genomics computational study by Erez Levanon and colleagues . An analysis of public human and mouse data, together with zebrafish as an outlier, shows that only a very small number of A-to-I editing sites are conserved in mammalian transcriptomes - just 59.
Levanon and colleagues note that conserved mammalian RNA editing sites tend to be located in transcripts with a synaptic function, suggesting that synapses may have a particular tendency to use RNA editing as a regulatory mechanism. An accompanying Research Highlight by Robert Reenan and Yiannis Savva discusses the study’s findings in the wider context of the RNA editing field .
Whereas transcription factors largely bind to the genome by forming contacts specific to the primary DNA sequence, RBPs have much more scope to achieve specificity through secondary structure, thanks to the large number of intramolecular bonds that twist RNA molecules into hairpins, stem-loops and various other bumps and bulges.
Rolf Backofen and colleagues use both structure and sequence features of RNA to develop a machine learning-based approach for predicting RBP binding sites from CLIP data . In doing so, they demonstrate the importance of RNA secondary structure for conferring binding specificity in a subset of RBPs, while noting that structure does not make a strong impact on binding preference for some other RBPs.
A second structure-centered method to appear in the issue is that described by Hisanori Kiryu and colleagues, termed CapR . Able to operate at high-throughput, CapR uses energy calculations to determine the probability of secondary structure throughout an RNA molecule. When applied to CLIP data, the software can be used to consider the secondary structure preferences of various RBPs, by calculating probability values at CLIP-determined binding sites. For example, the human RBP Pumilio-2 is shown by CapR to have a preference for hairpin loop structures.
While Genome Biology’s RBPome issue focuses on RNA–protein interactions in which proteins are thought to regulate RNAs, the opposite can also be true, such as when long non-coding RNAs (lncRNAs) act as scaffolds for complex assembly. For the most part, lncRNA regulation of proteins is thought to occur in the nucleus , where the cell’s lncRNA population is believed to be concentrated [43, 44], although ribosomal profiling studies have also observed ribosome-associated lncRNAs [45, 46].
Edwin Cuppen and colleagues take a new approach to studying lncRNA subcellular localization by sequencing transcriptome samples subjected to ribosomal fractionation (which yields separate nuclear, cytosolic, monosomal and polysomal fractions) . The results of this experiment suggest that, although a small number of lncRNAs are indeed enriched in the nucleus, the majority are not - contrary to expectations. Instead, many lncRNAs appear to reside outside of the nucleus, including a large number that are associated with the proteins that make up ribosomes (both in monosomes and polysomes).
The finding that lncRNA association with ribosomes is commonplace raises the question: what purpose do untranslatable RNAs have in binding to the translation machinery? One possibility is that lncRNAs regulate ribosomes in some way, just as they have been shown to do for the various other cellular components that enact Crick’s Central Dogma.
Scaling the heights of the RBPome would have been an impossible task without the contributions of many scientists from the genomics and wider research communities. The editors of Genome Biology are very grateful to everyone who supported this project by submitting a manuscript and to the many uncredited referees who were very generous with their time. Most of all, we are grateful for the invaluable and very significant assistance provided by our hands-on Guest Editors, Jernej Ule and John Rinn, who span both the Atlantic and the broad scope of the RBPome field. Please do read the Guest Editors’ Editorial , in which their thoughts on the past, present and future of the RBPome field are conveyed.
We hope that, together, the authors, reviewers and Guest Editors of the issue, with a little help from Genome Biology, have put the brains, the brawn and the beauty of the RBPome firmly on the genomics map.
Abbreviations
CLIP: Crosslinking and immunoprecipitation
HITS-CLIP: High-throughput sequencing CLIP
hnRNP: Heterogeneous nuclear ribonucleoprotein
iCLIP: Individual-nucleotide resolution CLIP
lncRNA: Long non-coding RNA
qPCR: Quantitative polymerase chain reaction
RIP-seq: High-throughput sequencing RIP