- Open Access
Patterns of ribosomal protein expression specify normal and malignant human cells
Genome Biology volume 17, Article number: 236 (2016)
Ribosomes are highly conserved molecular machines whose core composition has traditionally been regarded as invariant. However, recent studies have reported intriguing differences in the expression of some ribosomal proteins (RPs) across tissues and highly specific effects on the translation of individual mRNAs.
To determine whether RPs are more generally linked to cell identity, we analyze the heterogeneity of RP expression in a large set of human tissues, primary cells, and tumors. We find that about a quarter of human RPs exhibit tissue-specific expression and that primary hematopoietic cells display the most complex patterns of RP expression, likely shaped by context-restricted transcriptional regulators. Strikingly, we uncover patterns of dysregulated expression of individual RPs across cancer types that arise through copy number variations and are predictive for disease progression.
Our study reveals an unanticipated plasticity of RP expression across normal and malignant human cell types and provides a foundation for future characterization of cellular behaviors that are orchestrated by specific RPs.
Protein synthesis is at the core of cellular life. It is carried out by the ribosome, a highly conserved molecular machine with the same basic architecture in all free-living organisms [1–3]. In humans, the ribosome is composed of four ribosomal RNAs (rRNAs) and 80 ribosomal proteins (RPs), and its structure is believed to be largely invariant. However, recent studies have started to uncover some degree of variability in ribosomal components, such as at the level of rRNA modifications and RP expression. This variability has also been linked to both ribosomal function and the physiological state of cells (reviewed in [5, 6]).
Variability in ribosomal components could lead to a vast number of ribosome variants. Alternatively, the different components may have extra-ribosomal functions, as some ribosomal proteins do. Since synthesis of translational machinery components represents a large part of the energetic cost of cellular life, the abundance of ribosomal proteins is expected to be under tight control. Indeed, many feedback mechanisms have been discovered that link the production of different ribosomal components to maintain an appropriate stoichiometry ; a number of RPs act in negative feedback loops to control their own expression as well as the expression of other RPs, either at the level of splicing [8, 9] or at the level of mRNA decay . In bacteria, RPs frequently regulate, in negative feedback, the translation of entire RP operons [11, 12]. In eukaryotes, imbalanced RP levels frequently engage the p53 pathway [13–15] to cause cell cycle arrest and apoptosis. Aside from the feedback on ribosome biogenesis, perturbed expression of distinct RPs elicits a broad spectrum of phenotypes, from developmental defects to diseases [5, 6].
Interestingly, analysis of mRNA abundance revealed considerable differences in RP expression across human tissues , in mouse development , and in cancers [18–24]. However, the functional significance of such variation, as well as the underlying RP-dependent regulatory mechanisms, has remained insufficiently studied. Insights into the physiological roles of specific RPs have mostly come from naturally occurring phenotypes or diseases associated with RP loss of function. For instance, Rpl38, a component of the large ribosomal subunit, is essential for the appropriate axial skeleton formation in mouse embryonic development . The 5′ untranslated regions (5′UTRs) of Hox mRNAs contain sequence elements that prevent translation of the corresponding transcripts unless the Rpl38 protein enables the recognition of their internal ribosome entry site (IRES)-like elements . Furthermore, a striking number of human hematological disorders such as Diamond-Blackfan anemia (DBA) , T-cell acute lymphoblastic leukemia , and the 5q- syndrome  have been linked to mutations or chromosomal deletions which cause RP deficiencies. The consequences can be remarkably circumscribed, as in the case of RPS19, an RP frequently mutated in DBA patients, whose haploinsufficiency leads to reduced translation of GATA1 mRNA  and subsequent defects in erythrocyte maturation.
Cancer cells have a remarkable ability to evade anti-tumorigenic signals that control normal tissue architecture and progress into a chronic proliferation program . The high demand for protein synthesis in rapidly dividing malignant cells leads to increased ribosome biogenesis . Surprisingly, however, dysregulation of specific RPs has been observed in both cancer cell lines and patient samples [18–24]. The roles of individual RPs seem rather difficult to predict. For example, by binding to the 5′UTR of p53 mRNA and enhancing its translation, RPL26 can trigger programmed cell death . The ectopically overexpressed RPL36A has an entirely different function, localizing to nucleoli and increasing colony formation and cell growth of hepatocellular carcinoma lines, presumably through a more rapid cell cycling program . Thus, some RPs can act as tumor suppressors, whereas others promote tumorigenesis.
Despite the increasing body of evidence that individual RPs have cell-type-specific functions, a comprehensive study of RP expression heterogeneity across human cells has not been carried out so far. Furthermore, the factors that drive differential RP expression in distinct cellular contexts remain largely unknown. Through a comprehensive analysis of human RP expression patterns across 28 tissues, more than 300 primary cells, and 16 tumor types, we estimate here that about a quarter of RP genes exhibit tissue-specific expression. We find particularly high RP expression heterogeneity in the hematopoietic system, where a small number of RP genes unequivocally discriminate cells of distinct lineages and developmental stages. Our analysis of transcription regulatory elements located in the promoters of RP genes indicates that key hematopoietic transcription factors could orchestrate the observed patterns of RP expression. Strikingly, we uncover a consistent dysregulated expression of individual RPs across cancers, which can be partially explained by copy number alterations. Our analysis suggests prominent roles of specific RPs in health and disease.
RP genes are differentially expressed across tissues
We used the promoter-level expression atlas generated by the FANTOM Consortium  to evaluate the expression of 90 distinct RP genes, including 19 paralogs, across adult human tissues (Fig. 1). Although the mRNA levels of RPs within a tissue spanned a wide range, two (brain) to three (blood) orders of magnitude (Additional file 1: Figure S1a), they were highly correlated between tissues (median Pearson correlation coefficient R = 0.96), consistent with a strongly conserved stoichiometry of ribosomal components across tissues. The total RP expression was strongly correlated with the proliferation index of the tissue (R = 0.81, P < 0.0001, Additional file 1: Figure S1b), indicating that the proliferative state of cells explains most of the variability in global RP expression across tissues.
Once the global tissue-dependent effect on expression levels was removed, the standard score-normalized expression of each RP was very consistent across tissues, with some notable exceptions (Fig. 2a). Tissue-specific patterns of RP expression are also evident at the protein level (Additional file 1: Figure S1c–f). For example, the expression of RPL3L, a paralog of the canonical ribosomal protein RPL3, is markedly higher in skeletal muscle, where it was found to regulate growth, than in other tissues. Similarly, the testis-specific RPL39L has been shown to be present in ribosomes isolated from testis but not from other rodent tissues. The conservation of these expression patterns across different vertebrates further attests to their physiological relevance (Fig. 3).
To systematically evaluate the tissue specificity of RP expression, we computed “specificity scores,” defined as the deviation of RP expression levels in each particular tissue from the average across tissues (Fig. 1). Positive and negative scores indicate a higher and lower than expected expression level in a specific tissue, respectively. At a specificity score threshold corresponding to 2.5 standard deviations (SDs), 24 of the 90 (~27%) human RP genes exhibit tissue specificity of expression (Fig. 2b, Additional file 2: Table S1). Importantly, specificity scores for the different human tissues can be reproduced using a different tissue expression atlas (Additional file 1: Figure S1g). We identified both relatively up- (~66%) and down-regulated (~33%) RP genes. Interestingly, paralogs were particularly enriched among RP genes with tissue-dependent expression (P < 0.05, Fisher’s exact test), as may be expected if they underwent functional specialization. RP paralogs are highly similar but generally not identical to their canonical counterparts (Additional file 1: Figure S1h). Although only about a quarter of all RPs showed evidence of tissue-specific expression in human, for about half of the investigated tissues the specificity scores of all RPs were significantly correlated between different vertebrates (Additional file 1: Figure S2).
Extensive RP expression heterogeneity in the hematopoietic system
Profiling gene expression at the level of tissues, which are complex environments populated by a myriad of cell types, likely obscures differences between distinct cell types. Therefore, we further analyzed the expression of RP genes in more than 300 samples from human primary cells originating from all three germ layers. Principal component analysis (PCA) revealed two main clusters (Fig. 4a), one very broad and almost entirely composed of hematopoietic cell types, and the other more compact, containing all remaining cell types. Distinct primary hematopoietic cells were also clearly distinguishable in the PCA (Fig. 4b): lymphoid cell types (natural killer, T, and B cells) formed a compact cluster, whereas myeloid cell types were more scattered. Samples from hematopoietic precursors also clustered together and away from differentiated cells. Finally, pooled blood samples were highly similar and clearly distinguishable from individual cell types, underscoring the importance of analyzing primary cells to grasp the degree of RP expression heterogeneity across elementary cell types.
Aggregating hematopoietic samples by cell type and performing PCA of the specificity scores of each RP in each cell type relative to hematopoietic stem cells revealed two distinct clusters corresponding to the lymphoid and myeloid lineages (Fig. 4c). Hierarchical clustering showed that multiple RPs appear to be coordinately regulated in subgroups of related cell types (Fig. 4d). For instance, RPS29 and RPS27L exhibit antagonistic expression patterns in lymphoid and myeloid cell types, whereas RPL36A and RPS3A are more specific for mature erythrocytes. Interestingly, precisely the opposite pattern is observed when comparing RPS29 and RPS27L expression in cell-line models of lymphoma/lymphoid leukemia, which represent intermediate developmental stages, and mature lymphocytes (Additional file 1: Figure S3). This strengthens the hypothesis of a connection between RP expression regulation and the hematopoietic developmental program. Similarly, RP genes with myeloid-lineage specificity in our analysis, such as RPS27L , RPS15 , and RPS24 , have been previously implicated in bone marrow deficiencies.
Promoters of hematopoietic lineage-specific RPs display distinct regulatory signatures
The intriguing observation that RPs exhibit tissue-specific and cell-type-specific variations in expression raises the question of what transcriptional regulators are responsible for these changes. Recent work revealed the complexity of transcriptional regulation of human genes, most of which have multiple promoters that are selectively activated in a tissue-specific manner . The promoter-level expression atlas allowed us to examine the usage of individual RP gene promoters (1 to 16 per RP, with a median of 3) across tissues and primary hematopoietic cells. Surprisingly, although cases of single and multiple active promoters for a given RP are represented in the data (Additional file 1: Figure S4a,b,c), the tissue and hematopoietic cell-type-specific RP genes do not show an increased usage of alternative promoters with respect to non-specific RPs (Additional file 1: Figure S4d,e,f).
We next examined the predicted binding sites of 495 transcription factors (TFs)  and found a surprising degree of heterogeneity in the number of RP promoters predicted to be targeted by individual TFs (Additional file 1: Figure S5a). This suggests that TFs may provide context-specific regulation of RP genes. To test this hypothesis, we first determined the TFs that are most specifically expressed in the different hematopoietic lineages. As expected, previously described lineage-specific TFs showed a strong expression bias toward the corresponding cell types: STAT4 for the lymphoid  (Fig. 5a) and GATA1 for the erythroid  (Fig. 5b) lineages. Other well-known cell-type-specific TFs such as CREB1, which is active in lymphoid cells [41, 42], did not show particularly strong expression bias (Fig. 5c), possibly because it is its activity, rather than its expression level, that changes between cell types. Indeed, if we estimate the activity of TFs by modeling the expression level of their targets , we can identify TFs whose activity changes in a cell-type-specific manner (Fig. 5d–f), even when the corresponding expression levels remain invariant (Fig. 5c and f).
We then asked whether binding motifs of TFs with lineage-specific activities are enriched in the promoters of RP genes that exhibit a matching lineage specificity of expression. Remarkably, as shown for three RP genes with the largest variance in specificity scores across developmental lineages: the lymphoid-specific RPS29, myeloid-specific RPS27L, and erythrocyte-specific RPS3A (Fig. 4d), their promoters are predicted to be preferentially targeted by TFs with a matching lineage-specific activity (Fig. 5g,h and Additional file 1: Figure S5b). Although data pertaining to the direct interaction of TFs with RP promoters in individual hematopoietic lineages are lacking, chromatin immunoprecipitation studies carried out by the Encyclopedia of DNA Elements (ENCODE) consortium  confirm that the TF-RP promoter interactions that we predicted here do occur at least in some cell lines in which these TFs are expressed (Fig. 5i). Generalizing the analysis to comprehensive sets of RPs that exhibit lymphoid, myeloid, or erythroid specificity of expression leads to a similar result (Additional file 1: Figure S5c–n). Altogether, our analysis indicates that cell-type-specific patterns of RP expression could be explained by the binding of multiple TFs with lineage-restricted activity to individual RP promoters.
Dysregulated expression of specific RP genes in cancers
We next used The Cancer Genome Atlas (TCGA) data to compare RP expression levels in matched normal and malignant tissue samples. Notably, we observed a median increase of ~30% in the median RP expression levels of tumors as compared to the corresponding normal tissue samples (Additional file 1: Figure S6a). Some cancers, such as bladder and breast carcinomas, did not exhibit an overall increase in RP expression, a possible reason being that the cancer induces alterations in the adjacent tissue, which still looks “normal” at the histological level. We then estimated dysregulation scores for each individual RP by comparing the expression levels in matched normal and tumor tissues and found an intriguing consistency across cancers (Fig. 6a). The variation in RP expression could be partially explained by copy number variation (Fig. 6b–d), indicating that the dysregulation of RP genes in cancer is driven at least in part by genomic alterations. Several RP genes, some of which had been previously associated with p53 activation, consistently exhibited negative dysregulation scores across cancers (Fig. 6a, blue line). These RPs may act as tumor suppressors. In the gene cluster exhibiting a consistent positive dysregulation score in distinct cancer types (Fig. 6a, red line) we identified proteins that have been previously associated with increased proliferation phenotypes, such as RPL36A and RPS2. Remarkably, the average RP dysregulation scores estimated across cancers were significantly anti-correlated with the enrichment of the corresponding single guide RNA (sgRNA) in a CRISPR screen for cell viability carried out in a melanoma cell line (Additional file 1: Figure S6b). This provides independent evidence for the role of specific RPs in the viability of cancer cells. RPL39L, an RP gene paralog with testis-specific expression (Fig. 2a), which was previously found up-regulated in hepatocellular carcinoma and in some cancer cell lines [24, 46], had the most striking bimodal pattern of expression across carcinoma types. RPL39L is also consistently up-regulated in several cell-line models of breast and lung carcinoma, indicating that these cell lines could be used to study the function of this protein in tumorigenesis (Additional file 1: Figure S6c).
Our systematic analysis further revealed that numerous RPs exhibited strong dysregulation only in particular cancer types (Fig. 6a). Among these, RPL26L1 and RPS27L were exclusively up-regulated in breast and thyroid carcinomas, respectively, whereas RPL21 had decreased expression in breast and uterine cancers. Strikingly, some RP gene knockouts, including RPL21, have been positively selected in a CRISPR-based viability screen carried out in a melanoma cancer cell line (Additional file 1: Figure S6b) . This indicates that RP gene loss is not always detrimental for cellular fitness. To further evaluate the relevance of RP expression dysregulation, we examined the relationship between patients’ relapse-free survival and the expression levels of the RP genes that we inferred to be relevant for breast carcinoma. We found that high levels of RPL39L and RPL26L1 and a low level of RPL21 are associated with significantly lower patient relapse-free survival rates (Fig. 6e), whereas the expression levels of other RPs, not found to be dysregulated in patients, showed a much milder or insignificant association with prognosis (Fig. 6f). A classifier combining the expression signature of RPL39L, RPL26L1, and RPL21 has an even greater predictive power, comparable to that of genes that are most predictive  for the relapse-free survival of breast cancer patients (Fig. 6g). Our data highlight the potential of using RP expression signatures for predicting disease progression.
In view of the strong selection pressure for translation accuracy, the ribosome has largely been viewed as an invariant, highly optimized molecular machine. Its components exhibit relatively little expression variability in populations of identical cells . Therefore, the finding that a myriad of phenotypes can emerge from modulating expression of ribosomal components in Saccharomyces cerevisiae came as a surprise . In support of the functional specialization of different RPs, gene expression analyses have reported the differential expression of RPs at the organ level both in humans and mice [16, 17]. Additionally, a few studies have started to reveal an association between individual RPs and tumorigenic phenotypes [18–24]. However, the full extent of RP expression heterogeneity can only be appreciated by systematically analyzing individual cell types and distinct biological contexts. Here we undertook this characterization in a large set of human tissues, primary cells, and cancer samples (Fig. 1).
As hinted in previous reports [16, 50], we found that human tissues display a notable level of RP expression heterogeneity (Fig. 2), with around one quarter of all RP genes exhibiting tissue-specific expression. Attesting to their functional relevance, these expression patterns are also conserved between vertebrate species that are separated by hundreds of millions of years of evolution (Fig. 3). As may be expected, RPs exhibiting tissue-specific expression tend to be paralogs of canonical RPs, to which they are highly similar in sequence. Although it is tempting to speculate that these duplicated genes were co-opted for extra-ribosomal functions in the tissues in which they are expressed, at least in some cases these RP paralogs are found in ribosomes, strongly suggesting that they actively engage in translating complexes. An intriguing possibility is that up-regulated non-canonical RP gene paralogs compete with their canonical counterparts during ribosome biogenesis to yield “specialized ribosomes” that enable selective translation of mRNAs [5, 51].
Profiling gene expression in complex tissues/organs has likely masked differences between individual cell types. Indeed, analyzing the data from more than 300 human primary cells (Fig. 4), we found a high degree of RP expression heterogeneity in hematopoietic cells, where a small subset of RPs can discriminate cell types belonging to different hematopoietic lineages. Consistently with our results, ribosomopathies, which are disorders resulting from defective ribosome function, often lead to bone marrow failure phenotypes (reviewed in ). In remarkable agreement with our predictions, the depletion of some of the RPs found here to exhibit lineage-biased expression, such as RPS15 , RPS24 , and RPS27L , has been associated with malignancy and abnormalities in the maturation process of the respective cell types. Nonetheless, some of the RPs whose disruption was previously linked to erythroid differentiation defects, such as RPS19 in DBA  and RPS14 in 5q- syndrome , did not display any noticeable expression specificity in our analysis, at least in the mature cell types analyzed. The reason for this discrepancy is unclear, one possibility being that these proteins exert their specific functions at stages of erythropoietic development that were not captured in the promoter atlas. Nevertheless, our analysis strongly suggests that some RPs are involved in the specification of hematopoietic lineages through mechanisms that largely remain to be uncovered. A first hint was offered by a recent study which showed that haploinsufficiency of RPS19 leads to specific translation defects in the GATA1 mRNA , a master transcriptional regulator of erythropoiesis. Understanding the basis of this strong translation selectivity remains a challenge. Crosslinking and immunoprecipitation  of individual RPs may enable the discovery of their specific targets and the inference of interaction determinants.
An outstanding question still is what upstream regulators modulate the expression of RPs in different cell types. We found little evidence for differential promoter utilization in the modulation of RP gene expression across different tissues and cell types. Rather, we observed that the promoters of RPs that are differentially expressed in primary hematopoietic cells display unique regulatory fingerprints (Fig. 5). Interestingly, a recent study has shown that GATA1 TF binds to the promoter of RPS19 in primary human erythroid cells , which in turn has been shown to be essential for the efficient translation of the GATA1 mRNA . Our results thus suggest that key transcriptional regulators orchestrate the production of cell-type-specific transcripts, including those encoding ribosomal proteins.
We observed intriguing patterns of RP expression in cancers. Several RP genes, some of which had been previously associated with p53 activation, consistently exhibited negative dysregulation scores across cancers (Fig. 6). These RPs may thereby act, directly or indirectly, as tumor suppressors. In some cases, the underlying molecular mechanisms have been described. For example, RPL5  and RPL11  have the capacity to translocate to the nucleoplasm, where they bind to H/MDM2, preventing p53 ubiquitination and degradation, whereas RPL26 enhances the translation efficiency of the p53 mRNA . In the cluster of RPs that have positive dysregulation scores across distinct cancers we identified several proteins, some of which had been previously associated with increased proliferation phenotypes, such as RPL36A  and RPS2 . Surprisingly, we found that RPL39L, an RP gene paralog exhibiting testis-specific expression (Fig. 2), is consistently up-regulated in several carcinomas. The function of RPL39L, which differs from its paralog by only four amino acids, remains to be characterized. Although it is not entirely clear why increased expression of these particular RPs would confer a selective advantage to malignant cells, one could speculate that they are directly involved in selective activation of oncogenes  or inhibition of tumor suppressors . Most unexpected was that a number of RPs exhibited strong dysregulation only in particular cancers, which suggests idiosyncratic responses of RPs to specific cellular microenvironments. For instance, although RPL26 was reported to bind the 5′UTR of p53 mRNA to enhance its translation in a breast cancer cell line , we here find that its paralog, RPL26L1, is specifically up-regulated in breast carcinoma. This suggests the possibility that RPL26L1 competes with RPL26 and counteracts its effect on p53 mRNA translation, thereby providing a permissive environment for cellular transformation. 
A similar behavior has been observed in zebrafish embryos, where the RP paralogs Rpl22 and Rpl22l1 exert antagonistic effects on smad1 mRNA translation, thereby having divergent functions in hematopoietic development . In summary, we identified multiple “signatures” of dysregulated RPs in a variety of cancers. These signatures, which seem to originate from copy number variations, include not only over-expressed but also down-regulated RPs and, unexpectedly, have a significant prognostic value.
The control of RP abundances in eukaryotes extends across multiple regulatory layers including transcription , direct and indirect splicing-dependent mechanisms [59, 60], translation [61, 62], as well as protein turnover . We have inferred the context-specific expression of RP genes mostly from measurements of mRNA abundance that are more readily available. The need to validate the propagation of these expression patterns to protein level remains, and one hopes it will be addressed by future proteomic studies. It is also important to recognize that other post-transcriptional regulatory layers may contribute to context-specific expression of RPs. For example, RP-encoding mRNAs contain a remarkably diverse set of translation-regulatory signals  that are capable of tuning protein abundances. These include specific codon and amino acid usage of individual RPs  which are known to influence translation rates both in prokaryotes  and eukaryotes .
The striking patterns of ribosomal protein expression across cellular contexts reported here highlight the role of individual RPs. As new high-throughput datasets become available and the molecular mechanisms of specific RPs start to be revealed, our understanding of RP-mediated functional specialization should become clearer. So far, at least three distinct mechanisms seem to be involved: (1) global change in protein synthesis rate [68, 69]; (2) modulation of translation rates of specific mRNAs by ribosomal proteins, independently or as part of the ribosomal complex [17, 21, 29, 70, 71]; and (3) other extra-ribosomal functions of specific ribosomal proteins (reviewed in ). Nevertheless, predicting the phenotypes of perturbed RP expression remains very challenging, which suggests that an exciting branch of cellular biology still remains to be discovered.
Our study reveals an unanticipated plasticity of RP expression across normal and malignant human cells. We found that RPs can help in the identification of cell types and that cancer cells exhibit complex, yet reproducible patterns of RP expression, which could serve as prognostic or diagnostic markers. As evidence for cell type specificity of expression of individual RPs is accumulating, our study provides a foundation for characterizing cellular behaviors that are orchestrated by specific RPs. Additionally, because the regulatory signals that tune RP abundances in individual cell types are not well defined, we anticipate that our study will provide entry points into such studies in the coming years. Ultimately, we hope that a deeper understanding of regulatory mechanisms that are dependent on specific RPs will open up new therapeutic opportunities.
Gene expression datasets
Promoter expression data (normalized relative log expression) for human tissues, primary cells, and cell lines were obtained from the FANTOM5 project . Promoter expression levels were aggregated to yield gene-level expression estimates. To further confirm the consistency of tissue-specific RP gene expression, we used an additional gene expression dataset from the Human Protein Atlas .
Gene expression estimates for matched normal and malignant tissue samples were retrieved from The Cancer Genome Atlas (RNASeqV2 RSEM, normalized expression). mRNA expression levels for different cancer cell lines were retrieved from the Cancer Cell Line Encyclopedia (RMA, normalized expression) .
RP gene expression data for five different vertebrates (four mammals and one bird) were obtained from .
All gene expression datasets were log2-transformed after the addition of a pseudo-count.
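The transformation described above can be sketched as follows. This is a minimal illustration rather than the authors' pipeline; the pseudo-count value of 1 is an assumption, since the text does not state which value was used.

```python
import math

def log2_with_pseudocount(values, pseudocount=1.0):
    """Return log2(value + pseudocount) for each expression value.

    The default pseudocount of 1.0 is an assumption made for this sketch;
    it keeps zero counts finite and maps them to 0 on the log scale.
    """
    return [math.log2(v + pseudocount) for v in values]

# Toy expression values (hypothetical), chosen so the result is easy to check.
expression = [0.0, 1.0, 3.0, 7.0]
transformed = log2_with_pseudocount(expression)
```

With a pseudo-count of 1, unexpressed genes map to 0 and the transform stays monotonic, which is why this choice is common, although other values are equally possible.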
Protein expression dataset
Protein abundance data were retrieved from the Human Protein Atlas, which provides qualitative antibody-based measurements of protein expression across 83 different cell types belonging to 44 tissues. Briefly, for each gene, tissue microarray (TMA) immunohistochemical staining was performed using single as well as multiple antibodies for the same protein so as to derive accurate estimates of protein expression. The stained TMA slides were then scored (not detected, low, medium, or high expression) with respect to the intensity of immunoreactivity, the fraction of immunostained cells, and cellular localization of immunoreactivity. The protein expression of RPL39L and RPL3L was probed using two antibodies and one antibody, respectively. When reporting qualitative protein levels in tissues for which multiple cell types were available, the one displaying higher abundance was selected, as done in the Human Protein Atlas. For additional confirmation using an independent dataset, we also retrieved protein expression levels that were measured by mass spectrometry in a handful of tissues.
For each tissue, we summed the expression levels (transcripts per million) of over 300 genes previously reported to define a cellular proliferation signature, and then normalized these sums to the range 0 to 1 using the minimum and maximum sums observed across tissues. Importantly, this gene set shares only two genes with the RP genes.
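A minimal sketch of this index, assuming the input is a mapping from tissue name to the expression values of the signature genes, is given below. The function name, the tissue labels, and the toy values are hypothetical.

```python
def proliferation_index(signature_tpm_by_tissue):
    """Sum signature-gene expression per tissue, then min-max scale the sums to [0, 1]."""
    sums = {tissue: sum(tpms) for tissue, tpms in signature_tpm_by_tissue.items()}
    lo, hi = min(sums.values()), max(sums.values())
    if hi == lo:
        raise ValueError("all tissues have the same summed expression")
    return {tissue: (s - lo) / (hi - lo) for tissue, s in sums.items()}

# Toy data (hypothetical): two signature genes measured in three tissues.
toy = {
    "tissue_a": [10.0, 20.0],
    "tissue_b": [40.0, 50.0],
    "tissue_c": [25.0, 35.0],
}
index = proliferation_index(toy)
```

By construction, the tissue with the lowest summed signature expression receives an index of 0 and the tissue with the highest receives 1.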
Following the observation that RP expression levels are strongly correlated between pairs of tissues, we calculated the normalized deviation from the fitted linear relation between the RP expression levels in each tissue and the averaged RP expression levels across all tissues (i.e., the standardized residual). We called this the “specificity score.” These scores reflect the number of standard deviations (SDs) away from the mean deviation across all RPs. To estimate RP specificity scores in the different hematopoietic cell types, we used the same approach but compared the expression level of RPs in a particular cell type with that in hematopoietic stem cells. We considered that an RP has cell-type-/tissue-specific expression if |specificity score| > 2.5 SD, which, given the normal distribution of the standardized residuals, corresponds on average to the region outside the 97.9 percentiles.
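The score can be illustrated with a short sketch. It assumes an ordinary least-squares fit and residuals scaled by their sample standard deviation; the text does not specify the fitting procedure or the SD estimator, so both are assumptions, and the gene labels and values are hypothetical.

```python
from statistics import mean, stdev

def standardized_residuals(x, y):
    """z-scored residuals of the least-squares line y = intercept + slope * x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    intercept = my - slope * mx
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    mr, sd = mean(residuals), stdev(residuals)
    return [(r - mr) / sd for r in residuals]

def call_specific(names, scores, threshold=2.5):
    """Return the genes whose absolute score exceeds the threshold."""
    return [n for n, s in zip(names, scores) if abs(s) > threshold]

# Toy data (hypothetical): 12 genes lying on the line y = x, except one gene
# whose expression in the tissue is 4 log2 units above the expected value.
average_expr = [float(i) for i in range(1, 13)]
tissue_expr = list(average_expr)
tissue_expr[5] += 4.0
names = [f"gene_{i:02d}" for i in range(1, 13)]

scores = standardized_residuals(average_expr, tissue_expr)
specific = call_specific(names, scores)
```

For the hematopoietic analysis, the same function would take the expression levels in hematopoietic stem cells as `x` and those of the cell type of interest as `y`.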
Relative promoter usage across samples
For each RP gene, we first normalized the expression level of each promoter by the total expression level of all promoters in a given sample (tissue or primary cell), and then computed the mean usage of each promoter across all samples.
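A minimal sketch of this calculation for a single gene follows. The input layout and names are hypothetical, and skipping samples in which the gene has zero total expression is an assumption, since the text does not say how such samples were handled.

```python
def mean_promoter_usage(expr_by_sample):
    """Mean relative usage of each promoter of one gene across samples.

    expr_by_sample maps sample -> {promoter: expression level}.
    """
    promoters = sorted({p for expr in expr_by_sample.values() for p in expr})
    totals = {p: 0.0 for p in promoters}
    n_used = 0
    for expr in expr_by_sample.values():
        total = sum(expr.values())
        if total == 0:
            continue  # assumption: usage is undefined when the gene is silent
        n_used += 1
        for p in promoters:
            totals[p] += expr.get(p, 0.0) / total
    if n_used == 0:
        raise ValueError("gene is not expressed in any sample")
    return {p: totals[p] / n_used for p in promoters}


# Toy values invented for illustration.
usage = mean_promoter_usage({
    "sample_A": {"p1": 90.0, "p2": 10.0},
    "sample_B": {"p1": 50.0, "p2": 50.0},
})
```

Because usage is normalized within each sample before averaging, highly expressing samples do not dominate the mean.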
Analysis of transcription factor activity in hematopoietic cell types
We used ISMARA to estimate the activity of 495 different transcription factors in the different hematopoietic cell types. In brief, ISMARA models the expression levels of all mRNAs in terms of the TF binding sites present in the respective gene promoters and the activity of all TFs. The explanatory power of any given TF in a sample is then summarized by its activity z-score. Positive and negative z-scores indicate that genes targeted by that TF are up-regulated and down-regulated in the associated cell type, respectively.
Analysis of transcription regulatory elements in human RP gene promoters
We obtained the regulatory maps for all human promoters from . Briefly, these maps were constructed by evaluating posterior probabilities of motifs representing the sequence specificity of transcription factors, taking into account the pattern of evolutionary conservation in promoter regions. The data were summarized in a matrix whose elements were the scores of the binding sites for each TF motif (columns) in each promoter region (rows). For any given RP gene, we only considered the regulatory motifs predicted in the promoter showing the highest evidence of expression.
Transcription factor ChIP analysis
Genomic TF binding sites and respective scores were retrieved from the UCSC Genome Browser, track “Txn Factor ChIP.” TF binding sites were assigned to RP gene promoters if they were localized within 5 kb of the respective transcript start site.
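The assignment of binding sites to promoters could look like the sketch below. The tuple layouts, names, and coordinates are hypothetical, and measuring distance from the transcript start site to the nearest edge of the binding site is an assumption; the text only states that sites lie within 5 kb.

```python
def sites_near_tss(tss_by_gene, binding_sites, window=5000):
    """Assign TF binding sites to genes whose TSS lies within `window` bp.

    tss_by_gene: {gene: (chromosome, tss_position)}
    binding_sites: iterable of (tf, chromosome, start, end, score)
    """
    hits = {gene: [] for gene in tss_by_gene}
    for tf, chrom, start, end, score in binding_sites:
        for gene, (gene_chrom, tss) in tss_by_gene.items():
            if chrom != gene_chrom:
                continue
            # Distance to the nearest edge of the site; 0 if the site spans the TSS.
            distance = max(start - tss, tss - end, 0)
            if distance <= window:
                hits[gene].append((tf, score))
    return hits


# Toy coordinates invented for illustration.
tss = {"GENE_A": ("chr1", 100_000)}
sites = [
    ("TF1", "chr1", 103_000, 103_200, 800),  # 3 kb downstream: assigned
    ("TF2", "chr1", 106_000, 106_200, 500),  # 6 kb downstream: not assigned
    ("TF3", "chr2", 100_000, 100_100, 900),  # other chromosome: not assigned
]
assigned = sites_near_tss(tss, sites)
```

The nested loop is adequate for the roughly 80 RP genes considered here; a genome-wide analysis would call for an interval index instead.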
We included in our analysis all cancer types for which matched normal and tumor solid tissue samples were available for at least five patients. For each patient, we computed RP dysregulation scores as the standardized residuals estimated from the linear fit between the RP expression levels in the tumor and normal tissue samples (similar to the specificity score defined above). For each cancer type, we then calculated the median dysregulation score for each RP across all patients. These scores were finally standard score-normalized over all RP genes for each cancer so as to enable their comparison across cancer types. We considered an RP to be dysregulated in a particular cancer if |dysregulation score| > 2.5 SD.
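The aggregation step can be sketched as follows, assuming that per-patient standardized residuals have already been computed from the tumor versus normal fit. The function name, data layout, toy values, and the use of the sample standard deviation are assumptions.

```python
import statistics

THRESHOLD_SD = 2.5  # cutoff stated in the text


def cancer_dysregulation(per_patient_scores):
    """Median score per RP across patients, then z-scored over all RPs.

    per_patient_scores: list of {gene: standardized residual}, one per patient.
    """
    genes = sorted(per_patient_scores[0])
    medians = {
        g: statistics.median(patient[g] for patient in per_patient_scores)
        for g in genes
    }
    values = list(medians.values())
    mu = statistics.fmean(values)
    sd = statistics.stdev(values)
    return {g: (medians[g] - mu) / sd for g in genes}


# Toy residuals for three hypothetical patients and three hypothetical RPs.
patients = [
    {"RP_A": -1.0, "RP_B": 0.0, "RP_C": 2.0},
    {"RP_A": -1.0, "RP_B": 0.5, "RP_C": 1.0},
    {"RP_A": -2.0, "RP_B": 0.0, "RP_C": 1.0},
]
dysregulation = cancer_dysregulation(patients)
dysregulated = [g for g, s in dysregulation.items() if abs(s) > THRESHOLD_SD]
```

Using the median across patients limits the influence of individual outlier tumors, and the final z-scoring places every cancer type on the same scale.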
Copy number variation analysis
Putative copy number alterations estimated using GISTIC 2.0 for all RP genes across different cancers were retrieved from the cBioPortal [79, 80]. Deep and shallow deletions were both classified as deletions, whereas gains and amplifications were grouped as gain alterations. We then calculated the relative frequency of deletion, gain, or no alterations for each RP gene in any given cancer type, and we averaged these numbers across all cancers to yield global estimates of copy number alterations.
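The grouping and averaging can be illustrated with the sketch below. It assumes the discrete encoding commonly used for GISTIC-derived calls (-2 deep deletion, -1 shallow deletion, 0 no alteration, 1 gain, 2 amplification); function names and toy calls are hypothetical.

```python
from collections import Counter

CATEGORIES = ("deletion", "gain", "none")


def cna_frequencies(calls):
    """Relative frequency of deletion, gain, and no alteration for one gene
    in one cancer type, given one discrete call per sample."""
    counts = Counter(
        "deletion" if c < 0 else "gain" if c > 0 else "none" for c in calls
    )
    return {k: counts.get(k, 0) / len(calls) for k in CATEGORIES}


def global_frequencies(calls_by_cancer):
    """Unweighted average of the per-cancer frequencies for one gene."""
    per_cancer = [cna_frequencies(calls) for calls in calls_by_cancer.values()]
    return {k: sum(f[k] for f in per_cancer) / len(per_cancer) for k in CATEGORIES}


# Toy calls invented for illustration.
toy_calls = {
    "cancer_X": [-2, -1, 0, 0],
    "cancer_Y": [1, 2, 0, -1],
}
global_estimate = global_frequencies(toy_calls)
```

Averaging the per-cancer frequencies gives each cancer type equal weight regardless of cohort size, which matches the description of averaging "across all cancers."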
Breast cancer survival analysis
We used the Kaplan-Meier Plotter to calculate the relapse-free survival rates of 3557 patients with breast cancer. Tumor samples were split into “low” and “high” levels with respect to the median gene expression of the RP (or group of RPs) across the cohort.
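The analysis itself was run in the Kaplan-Meier Plotter web tool; the sketch below only illustrates the two underlying ideas, a median split of patients and a product-limit (Kaplan-Meier) survival estimate. It is not the tool's implementation. Assigning values equal to the median to the "low" group is an assumption, and all names and values are hypothetical.

```python
import statistics


def split_by_median(expression):
    """Label each patient 'high' or 'low' relative to the cohort median."""
    cutoff = statistics.median(expression.values())
    return {pid: "high" if value > cutoff else "low"
            for pid, value in expression.items()}


def kaplan_meier(times, events):
    """Product-limit estimate; returns [(time, survival)] at each event time.

    events[i] is 1 if a relapse was observed at times[i], 0 if censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        same_time = [e for tt, e in data[i:] if tt == t]
        relapses = sum(same_time)
        if relapses:
            survival *= 1 - relapses / at_risk
            curve.append((t, survival))
        at_risk -= len(same_time)  # relapsed and censored patients both leave
        i += len(same_time)
    return curve


# Toy cohort invented for illustration.
groups = split_by_median({"p1": 1.0, "p2": 2.0, "p3": 3.0, "p4": 4.0})
curve = kaplan_meier(times=[2, 3, 3, 5], events=[1, 1, 0, 1])
```

In practice, one curve would be estimated per expression group and the two curves compared with a log-rank test.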
Protein sequence alignment
Sequences of ribosomal protein paralogs were aligned using the Needleman-Wunsch global alignment algorithm.
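For readers unfamiliar with the algorithm, a compact Needleman-Wunsch sketch is given below. The scoring scheme (match +1, mismatch -1, linear gap penalty -2) is an assumption chosen for brevity; protein alignments normally use a substitution matrix such as BLOSUM62 with affine gap penalties, and the text does not state which parameters were used. The toy sequences are not real RP sequences.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Global alignment of strings a and b; returns (score, aligned_a, aligned_b)."""
    n, m = len(a), len(b)

    def sub(i, j):
        return match if a[i - 1] == b[j - 1] else mismatch

    # Fill the dynamic-programming matrix.
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,            # gap in b
                score[i][j - 1] + gap,            # gap in a
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1]); out_b.append(b[j - 1])
            i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-")
            i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


result = needleman_wunsch("MKVL", "MKL")
```

Sequence identity between paralogs can then be computed as the fraction of aligned columns in which both residues match.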
Reported correlation coefficients are Pearson product-moment correlation coefficients. Statistical tests comparing distributions were performed using the non-parametric Mann-Whitney U test.
Clustering was performed using Euclidean distance and Ward’s minimum variance method.
The following websites are used for reference:
Kaplan-Meier Plotter, http://kmplot.com
The Cancer Cell Line Encyclopedia, https://portals.broadinstitute.org/ccle/home
The Human Protein Atlas, http://www.proteinatlas.org
UCSC Genome Browser, https://genome.ucsc.edu
Klinge S, Voigts-Hoffmann F, Leibundgut M, Ban N. Atomic structures of the eukaryotic ribosome. Trends Biochem Sci. 2012;37:189–98.
Ramakrishnan V. The ribosome emerges from a black box. Cell. 2014;159:979–84.
Ban N, Beckmann R, Cate JH, Dinman JD, Dragon F, Ellis SR, Lafontaine DL, Lindahl L, Liljas A, Lipton JM, et al. A new system for naming ribosomal proteins. Curr Opin Struct Biol. 2014;24:165–9.
Khatter H, Myasnikov AG, Natchiar SK, Klaholz BP. Structure of the human 80S ribosome. Nature. 2015;520:640–5.
Xue S, Barna M. Specialized ribosomes: a new frontier in gene regulation and organismal biology. Nat Rev Mol Cell Biol. 2012;13:355–69.
Wang W, Nag S, Zhang X, Wang MH, Wang H, Zhou J, Zhang R. Ribosomal proteins and human diseases: pathogenesis, molecular mechanisms, and therapeutic implications. Med Res Rev. 2015;35:225–85.
Warner JR, McIntosh KB. How common are extraribosomal functions of ribosomal proteins? Mol Cell. 2009;34:3–11.
Malygin AA, Parakhnevitch NM, Ivanov AV, Eperon IC, Karpova GG. Human ribosomal protein S13 regulates expression of its own gene at the splicing step by a feedback mechanism. Nucleic Acids Res. 2007;35:6414–23.
Eng FJ, Warner JR. Structural basis for the regulation of splicing of a yeast messenger RNA. Cell. 1991;65:797–804.
O’Leary MN, Schreiber KH, Zhang Y, Duc AC, Rao S, Hale JS, Academia EC, Shah SR, Morton JF, Holstein CA, et al. The ribosomal protein Rpl22 controls ribosome composition by directly repressing expression of its own paralog, Rpl22l1. PLoS Genet. 2013;9:e1003708.
Nomura M. Regulation of ribosome biosynthesis in Escherichia coli and Saccharomyces cerevisiae: diversity and common principles. J Bacteriol. 1999;181:6857–64.
Li GW, Burkhardt D, Gross C, Weissman JS. Quantifying absolute protein synthesis rates reveals principles underlying allocation of cellular resources. Cell. 2014;157:624–35.
Marechal V, Elenbaas B, Piette J, Nicolas JC, Levine AJ. The ribosomal L5 protein is associated with mdm-2 and mdm-2-p53 complexes. Mol Cell Biol. 1994;14:7414–20.
Lohrum MA, Ludwig RL, Kubbutat MH, Hanlon M, Vousden KH. Regulation of HDM2 activity by the ribosomal protein L11. Cancer Cell. 2003;3:577–87.
Zhang Y, Wolf GW, Bhat K, Jin A, Allio T, Burkhart WA, Xiong Y. Ribosomal protein L11 negatively regulates oncoprotein MDM2 and mediates a p53-dependent ribosomal-stress checkpoint pathway. Mol Cell Biol. 2003;23:8902–12.
Bortoluzzi S, d’Alessi F, Romualdi C, Danieli GA. Differential expression of genes coding for ribosomal proteins in different human tissues. Bioinformatics. 2001;17:1152–7.
Kondrashov N, Pusic A, Stumpf CR, Shimizu K, Hsieh AC, Xue S, Ishijima J, Shiroishi T, Barna M. Ribosome-mediated specificity in Hox mRNA translation and vertebrate tissue patterning. Cell. 2011;145:383–97.
Pogue-Geile K, Geiser JR, Shu M, Miller C, Wool IG, Meisler AI, Pipas JM. Ribosomal protein genes are overexpressed in colorectal cancer: isolation of a cDNA clone encoding the human S3 ribosomal protein. Mol Cell Biol. 1991;11:3842–9.
Henry JL, Coggin DL, King CR. High-level expression of the ribosomal protein L19 in human breast tumors that overexpress erbB-2. Cancer Res. 1993;53:1403–8.
Kim JH, You KR, Kim IH, Cho BH, Kim CY, Kim DG. Over-expression of the ribosomal protein L36a gene is associated with cellular proliferation in hepatocellular carcinoma. Hepatology. 2004;39:129–38.
Takagi M, Absalon MJ, McLure KG, Kastan MB. Regulation of p53 translation and induction after DNA damage by ribosomal protein L26 and nucleolin. Cell. 2005;123:49–63.
Bee A, Ke Y, Forootan S, Lin K, Beesley C, Forrest SE, Foster CS. Ribosomal protein l19 is a prognostic marker for human prostate cancer. Clin Cancer Res. 2006;12:2061–5.
Guo X, Shi Y, Gou Y, Li J, Han S, Zhang Y, Huo J, Ning X, Sun L, Chen Y, et al. Human ribosomal protein S13 promotes gastric cancer growth through down-regulating p27(Kip1). J Cell Mol Med. 2011;15:296–306.
Wong QW, Li J, Ng SR, Lim SG, Yang H, Vardy LA. RPL39L is an example of a recently evolved ribosomal protein paralog that shows highly specific tissue expression patterns and is upregulated in ESCs and HCC tumors. RNA Biol. 2014;11:33–41.
Xue S, Tian S, Fujii K, Kladwang W, Das R, Barna M. RNA regulons in Hox 5′ UTRs confer ribosome specificity to gene regulation. Nature. 2015;517:33–8.
Draptchinskaia N, Gustavsson P, Andersson B, Pettersson M, Willig TN, Dianzani I, Ball S, Tchernia G, Klar J, Matsson H, et al. The gene encoding ribosomal protein S19 is mutated in Diamond-Blackfan anaemia. Nat Genet. 1999;21:169–75.
De Keersmaecker K, Atak ZK, Li N, Vicente C, Patchett S, Girardi T, Gianfelici V, Geerdens E, Clappier E, Porcu M, et al. Exome sequencing identifies mutation in CNOT3 and ribosomal genes RPL5 and RPL10 in T-cell acute lymphoblastic leukemia. Nat Genet. 2013;45:186–90.
Ebert BL, Pretz J, Bosco J, Chang CY, Tamayo P, Galili N, Raza A, Root DE, Attar E, Ellis SR, Golub TR. Identification of RPS14 as a 5q- syndrome gene by RNA interference screen. Nature. 2008;451:335–9.
Ludwig LS, Gazda HT, Eng JC, Eichhorn SW, Thiru P, Ghazvinian R, George TI, Gotlib JR, Beggs AH, Sieff CA, et al. Altered translation of GATA1 in Diamond-Blackfan anemia. Nat Med. 2014;20:748–53.
Bissell MJ, Hines WC. Why don’t we get more cancer? A proposed role of the microenvironment in restraining cancer progression. Nat Med. 2011;17:320–9.
Thomas G. An encore for ribosome biogenesis in the control of cell proliferation. Nat Cell Biol. 2000;2:E71–2.
The FANTOM Consortium and the RIKEN PMI and CLST (DGT). A promoter-level mammalian expression atlas. Nature. 2014;507:462–70.
Chaillou T, Zhang X, McCarthy JJ. Expression of muscle-specific ribosomal protein L3-like impairs myotube growth. J Cell Physiol. 2016;231:1894–902.
Sugihara Y, Honda H, Iida T, Morinaga T, Hino S, Okajima T, Matsuda T, Nadano D. Proteomic analysis of rodent ribosomes revealed heterogeneity including ribosomal proteins L10-like, L22-like 1, and L39-like. J Proteome Res. 2010;9:1351–66.
Xiong X, Zhao Y, Tang F, Wei D, Thomas D, Wang X, Liu Y, Zheng P, Sun Y. Ribosomal protein S27-like is a physiological regulator of p53 that suppresses genomic instability and tumorigenesis. Elife. 2014;3:e02236.
Gazda HT, Sheen MR, Vlachos A, Choesmel V, O’Donohue MF, Schneider H, Darras N, Hasman C, Sieff CA, Newburger PE, et al. Ribosomal protein L5 and L11 mutations are associated with cleft palate and abnormal thumbs in Diamond-Blackfan anemia patients. Am J Hum Genet. 2008;83:769–80.
Gazda HT, Grabowska A, Merida-Long LB, Latawiec E, Schneider HE, Lipton JM, Vlachos A, Atsidaftos E, Ball SE, Orfali KA, et al. Ribosomal protein S24 gene is mutated in Diamond-Blackfan anemia. Am J Hum Genet. 2006;79:1110–8.
Balwierz PJ, Pachkov M, Arnold P, Gruber AJ, Zavolan M, van Nimwegen E. ISMARA: automated modeling of genomic signals as a democracy of regulatory motifs. Genome Res. 2014;24:869–84.
Bacon CM, Petricoin 3rd EF, Ortaldo JR, Rees RC, Larner AC, Johnston JA, O’Shea JJ. Interleukin 12 induces tyrosine phosphorylation and activation of STAT4 in human lymphocytes. Proc Natl Acad Sci U S A. 1995;92:7307–11.
Pevny L, Simon MC, Robertson E, Klein WH, Tsai SF, D’Agati V, Orkin SH, Costantini F. Erythroid differentiation in chimaeric mice blocked by a targeted mutation in the gene for transcription factor GATA-1. Nature. 1991;349:257–60.
Mayr B, Montminy M. Transcriptional regulation by the phosphorylation-dependent factor CREB. Nat Rev Mol Cell Biol. 2001;2:599–609.
Wen AY, Sakamoto KM, Miller LS. The role of the transcription factor CREB in immune function. J Immunol. 2010;185:6413–9.
Encode Project Consortium. An integrated encyclopedia of DNA elements in the human genome. Nature. 2012;489:57–74.
Wang M, Hu Y, Stearns ME. RPS2: a novel therapeutic target in prostate cancer. J Exp Clin Cancer Res. 2009;28:6.
Shalem O, Sanjana NE, Hartenian E, Shi X, Scott DA, Mikkelsen TS, Heckl D, Ebert BL, Root DE, Doench JG, Zhang F. Genome-scale CRISPR-Cas9 knockout screening in human cells. Science. 2014;343:84–7.
Nadano D, Notsu T, Matsuda T, Sato T. A human gene encoding a protein homologous to ribosomal protein L39 is normally expressed in the testis and derepressed in multiple cancer cells. Biochim Biophys Acta. 2002;1577:430–6.
Gyorffy B, Lanczky A, Eklund AC, Denkert C, Budczies J, Li Q, Szallasi Z. An online survival analysis tool to rapidly assess the effect of 22,277 genes on breast cancer prognosis using microarray data of 1,809 patients. Breast Cancer Res Treat. 2010;123:725–31.
Newman JR, Ghaemmaghami S, Ihmels J, Breslow DK, Noble M, DeRisi JL, Weissman JS. Single-cell proteomic analysis of S. cerevisiae reveals the architecture of biological noise. Nature. 2006;441:840–6.
Komili S, Farny NG, Roth FP, Silver PA. Functional specificity among ribosomal proteins regulates gene expression. Cell. 2007;131:557–71.
Gupta V, Warner JR. Ribosome-omics of the human ribosome. RNA. 2014;20:1004–13.
Mauro VP, Edelman GM. The ribosome filter hypothesis. Proc Natl Acad Sci U S A. 2002;99:12031–6.
Narla A, Ebert BL. Ribosomopathies: human disorders of ribosome dysfunction. Blood. 2010;115:3196–205.
Hafner M, Landthaler M, Burger L, Khorshid M, Hausser J, Berninger P, Rothballer A, Ascano Jr M, Jungkamp AC, Munschauer M, et al. Transcriptome-wide identification of RNA-binding protein and microRNA target sites by PAR-CLIP. Cell. 2010;141:129–41.
Amanatiadou EP, Papadopoulos GL, Strouboulis J, Vizirianakis IS. GATA1 and PU.1 bind to ribosomal protein genes in erythroid cells: implications for ribosomopathies. PLoS One. 2015;10:e0140077.
Ziemiecki A, Muller RG, Fu XC, Hynes NE, Kozma S. Oncogenic activation of the human trk proto-oncogene by recombination with the ribosomal large subunit protein L7a. EMBO J. 1990;9:191–6.
Naora H, Takai I, Adachi M, Naora H. Altered cellular responses by varying expression of a ribosomal protein gene: sequential coordination of enhancement and suppression of ribosomal protein S3a gene expression induces apoptosis. J Cell Biol. 1998;141:741–53.
Zhang Y, Duc AC, Rao S, Sun XL, Bilbee AN, Rhodes M, Li Q, Kappes DJ, Rhodes J, Wiest DL. Control of hematopoietic stem cell emergence by antagonistic functions of ribosomal protein paralogs. Dev Cell. 2013;24:411–25.
Zeevi D, Sharon E, Lotan-Pompan M, Lubling Y, Shipony Z, Raveh-Sadka T, Keren L, Levo M, Weinberger A, Segal E. Compensation for differences in gene copy number among yeast ribosomal proteins is encoded within their promoters. Genome Res. 2011;21:2114–28.
Li B, Vilardell J, Warner JR. An RNA structure involved in feedback regulation of splicing and of translation is critical for biological fitness. Proc Natl Acad Sci U S A. 1996;93:1596–600.
Parenteau J, Durand M, Morin G, Gagnon J, Lucier JF, Wellinger RJ, Chabot B, Elela SA. Introns within ribosomal protein genes regulate the production and function of yeast ribosomes. Cell. 2011;147:320–31.
Agrawal MG, Bowman LH. Transcriptional and translational regulation of ribosomal protein formation during mouse myoblast differentiation. J Biol Chem. 1987;262:4868–75.
Orom UA, Nielsen FC, Lund AH. MicroRNA-10a binds the 5′UTR of ribosomal protein mRNAs and enhances their translation. Mol Cell. 2008;30:460–71.
Tsay YF, Thompson JR, Rotenberg MO, Larkin JC, Woolford Jr JL. Ribosomal protein synthesis is not regulated at the translational level in Saccharomyces cerevisiae: balanced accumulation of ribosomal proteins L16 and rp59 is mediated by turnover of excess protein. Genes Dev. 1988;2:664–76.
Meyuhas O. Synthesis of the translational apparatus is regulated at the translational level. Eur J Biochem. 2000;267:6321–30.
Ishii K, Washio T, Uechi T, Yoshihama M, Kenmochi N, Tomita M. Characteristics and clustering of human ribosomal protein genes. BMC Genomics. 2006;7:37.
Guimaraes JC, Rocha M, Arkin AP. Transcript level and sequence determinants of protein abundance and noise in Escherichia coli. Nucleic Acids Res. 2014;42:4791–9.
Vogel C, Abreu Rde S, Ko D, Le SY, Shapiro BA, Burns SC, Sandhu D, Boutz DR, Marcotte EM, Penalva LO. Sequence signatures and mRNA concentration can explain two-thirds of protein abundance variation in a human cell line. Mol Syst Biol. 2010;6:400.
Oliver ER, Saunders TL, Tarle SA, Glaser T. Ribosomal protein L24 defect in belly spot and tail (Bst), a mouse Minute. Development. 2004;131:3907–20.
Lindstrom MS, Zhang Y. Ribosomal protein S9 is a novel B23/NPM-binding protein required for normal cell proliferation. J Biol Chem. 2008;283:15568–76.
Mazumder B, Sampath P, Seshadri V, Maitra RK, DiCorleto PE, Fox PL. Regulated release of L13a from the 60S ribosomal subunit as a mechanism of transcript-specific translational control. Cell. 2003;115:187–98.
Horos R, Ijspeert H, Pospisilova D, Sendtner R, Andrieu-Soler C, Taskesen E, Nieradka A, Cmejla R, Sendtner M, Touw IP, von Lindern M. Ribosomal deficiencies in Diamond-Blackfan anemia impair translation of transcripts essential for differentiation of murine and human erythroblasts. Blood. 2012;119:262–72.
Lizio M, Harshbarger J, Shimoji H, Severin J, Kasukawa T, Sahin S, Abugessaisa I, Fukuda S, Hori F, Ishikawa-Kato S, et al. Gateways to the FANTOM5 promoter level mammalian expression atlas. Genome Biol. 2015;16:22.
Uhlen M, Fagerberg L, Hallstrom BM, Lindskog C, Oksvold P, Mardinoglu A, Sivertsson A, Kampf C, Sjostedt E, Asplund A, et al. Proteomics tissue-based map of the human proteome. Science. 2015;347:1260419.
Barretina J, Caponigro G, Stransky N, Venkatesan K, Margolin AA, Kim S, Wilson CJ, Lehar J, Kryukov GV, Sonkin D, et al. The Cancer Cell Line Encyclopedia enables predictive modelling of anticancer drug sensitivity. Nature. 2012;483:603–7.
Merkin J, Russell C, Chen P, Burge CB. Evolutionary dynamics of gene and isoform regulation in Mammalian tissues. Science. 2012;338:1593–9.
Kim MS, Pinto SM, Getnet D, Nirujogi RS, Manda SS, Chaerkady R, Madugundu AK, Kelkar DS, Isserlin R, Jain S, et al. A draft map of the human proteome. Nature. 2014;509:575.
Sandberg R, Neilson JR, Sarma A, Sharp PA, Burge CB. Proliferating cells express mRNAs with shortened 3′ untranslated regions and fewer microRNA target sites. Science. 2008;320:1643–7.
Mermel CH, Schumacher SE, Hill B, Meyerson ML, Beroukhim R, Getz G. GISTIC2.0 facilitates sensitive and confident localization of the targets of focal somatic copy-number alteration in human cancers. Genome Biol. 2011;12:R41.
Gao J, Aksoy BA, Dogrusoz U, Dresdner G, Gross B, Sumer SO, Sun Y, Jacobsen A, Sinha R, Larsson E, et al. Integrative analysis of complex cancer genomics and clinical profiles using the cBioPortal. Sci Signal. 2013;6:pl1.
Cerami E, Gao J, Dogrusoz U, Gross BE, Sumer SO, Aksoy BA, Jacobsen A, Byrne CJ, Heuer ML, Larsson E, et al. The cBio cancer genomics portal: an open platform for exploring multidimensional cancer genomics data. Cancer Discov. 2012;2:401–4.
The authors thank Mikhail Pachkov and Daniel Schmocker for kindly providing the transcription factor binding site predictions for human gene promoters and for helping to perform the ISMARA analysis on hematopoietic samples. We would also like to thank Alexander Kanitz, Andrzej Rzepiela, Ana Correia, and Andrea Riba for critical feedback on the elaboration of this manuscript. The results shown here are in part based upon data generated by the ENCODE Consortium and the TCGA Research Network: http://cancergenome.nih.gov/.
This work was supported by a Swiss SystemsX.ch transition post-doctoral fellowship to JCG.
Availability of data and materials
Gene expression data for primary cells were retrieved from FANTOM5, and expression estimates for different tissues were retrieved from FANTOM5 and the Human Protein Atlas. Protein levels for different tissues were collected from the Human Protein Atlas and from . Gene expression estimates for normal and malignant samples were retrieved from The Cancer Genome Atlas (http://cancergenome.nih.gov), whereas copy number alteration estimates were obtained from the cBioPortal [79, 80]. Ribosomal protein expression in different cell lines was retrieved from FANTOM5 and the Cancer Cell Line Encyclopedia. Ribosomal protein expression data for different tissues of five vertebrates (four mammals and one bird) were obtained from . Relapse-free survival rates for patients with breast cancer were retrieved from the Kaplan-Meier Plotter (http://kmplot.com). Genome-wide sets of transcription factor binding sites identified within the ENCODE project were obtained using the UCSC Genome Browser (https://genome.ucsc.edu).
JCG and MZ conceived the study. JCG conducted the analyses. JCG and MZ wrote the manuscript. Both authors read and approved the final manuscript.
The authors declare that they have no competing financial interests.
Ethics approval and consent to participate
No ethics approval was required for this work.