In silico decomposition of the tumor-immune microenvironment
We quantified the relative tumor infiltration levels of 24 immune cell types by interrogating expression levels of genes in published signature gene lists [14]. The signatures we used comprised a diverse set of adaptive and innate immune cell types and contained 509 genes in total (Additional file 2: Table S1). Of these genes, 98.4% (501) were used uniquely in only one signature (Additional file 1: Figure S2). Due to the interconnectedness between immune cell infiltration and the antigen presenting machinery (APM), we also defined a seven-gene APM signature that consisted of MHC class I genes (HLA-A/B/C, B2M) and genes involved in processing and loading antigens (TAP1, TAP2, and TAPBP). Messenger RNA (mRNA)-based scores for these signatures were computed separately for each sample using ssGSEA [30]. ssGSEA measures the per sample overexpression level of a particular gene list by comparing the ranks of the genes in the gene list with those of all other genes.
We employed this approach to computationally assess the infiltration levels of immune cell types and APM gene expression levels in 7567 tumor and 633 normal samples from 19 different cancer types profiled by TCGA (Additional file 2: Table S2). To achieve a more focused view of the immune infiltration landscape in human cancers, we defined two aggregate scores: (1) the overall immune infiltration score (IIS) from both adaptive and innate immune cell scores; and (2) the T cell infiltration score (TIS) from nine T cell scores (CD8+ T, Th1, Th2, Th17, Treg, T effector memory, T central memory, T helper, and T cells) (see “Methods”). We computed the TIS and IIS of each sample in the study as the sum of the relevant individual scores.
Validation of the immune cell scoring methodology
Immune cell gene signatures were established by Bindea et al. [14] using three gene expression datasets [31–33] generated from sorted immune cell populations. Before validating these signatures on independent datasets, we first sought to confirm their discriminatory power on the datasets used to establish them and asked whether the expression of these genes separated immune cell populations into groups that were consistent with hematopoietic lineages. To this end, we obtained the microarray expression values for these genes, normalized with GCRMA [34] and corrected for batch effects using ComBat [35] (Additional file 1: Figure S3, see “Methods”). We then computed the principal components (PCs) of the batch-effect corrected dataset as a linear combination of the sorted immune cell types. This PC analysis successfully separated the cells into groups consistent with their hematopoietic lineage, suggesting adequate discrimination power for the signature genes (Additional file 1: Figure S4). More specifically, PC1 and PC2 achieved the separation of the following four groups: (1) macrophages and dendritic cells (DC); (2) B cells, NK cells (CD56dim and CD56 bright), CD8+, and CD4+ T cells; (3) Th1, Th2, T gamma delta, and T follicular helper cells; (4) mast cells, neutrophils, and eosinophils. The separation between CD8+ and CD4+ T cells was greatly enhanced if batch effect correction and PC analysis were performed with only the signatures genes of sorted T cell subpopulations (Additional file 1: Figure S5, see “Methods”).
Next, we validated the gene signatures and the ssGSEA methodology in a series of in vitro and in silico tests. The first test involved sorting immune cell populations with fluorescence activated cell sorting (FACS) and generating RNA-Seq gene expression profiles of the sorted populations. To this end, we obtained ccRCC patient specimens and sorted prevalent tumor-infiltrating immune populations such as CD8+ T cells (n = 5), NK CD16+ cells (n = 2), CD4+ T cells (n = 3), and macrophages (n = 4) as well as non-immune CD45– cells (n = 1). We then generated ssGSEA scores for all sorted samples using Bindea et al. signatures (Additional file 2: Table S3) and observed that each signature (CD8+ T cell, NK CD56dim cell, T helper cell, and macrophage signature, respectively) was able to identify the corresponding sorted population as being significantly higher than the other sorted populations (Fig. 1a) (Note that NK CD16+ cells are equivalent to NK CD56dim cells). Expectedly, the magnitude of the difference between the first and second highest immune population varied as a function of the phenotypic difference between the two populations. For instance, CD8+ T cells were most similar to NK CD16+ cells, another immune population with cytotoxic properties. Nevertheless, the first three PCs of ssGSEA scores were able to distinguish all tumor-associated immune populations as distinct clusters (Fig. 1b, Additional file 2: Table S3).
The second in vitro validation test involved comparing mRNA-based ssGSEA scores with levels of immunofluorescence(IF)-stained immune cells from 10 MSKCC primary ccRCC tumors (see “Methods” for sample preparation). IF staining was performed for three immune cell types that are extensively studied with immunohistochemistry: CD8+ T cells (anti-CD8 antibody), natural killer (NK) cells (anti-CD56 antibody), and regulatory T cells (Tregs) (anti-FOXP3 antibody). Notwithstanding that IF is a semi-quantitative technique, we observed significant correlations between IF immune cell infiltration estimates and ssGSEA scores (Fig. 1c). The Spearman correlation for the NK, Treg and CD8+ T cell populations were 0.631 (p = 0.025), 0.639 (p = 0.023), and 0.4998 (p = 0.071), respectively. Higher correlation levels may be precluded by the spatial heterogeneity of immune cell infiltrates and random sampling effects between the tissue sections used for IF staining and RNA-Seq.
We next performed an in silico validation test to ask whether our methodology could successfully infer simulated, i.e. known, mixing proportions of immune cell types at varying noise levels. To this end, we first utilized the RNA-Seq data from sorted tumor-infiltrating cells and generated a reference expression profile for each one of the sorted immune cell populations (CD8+ T cells, NK CD16+ cells, CD4+ T cells, and macrophages) as well as for non-immune CD45– cells (see “Methods”). Next, we simulated the tumor microenvironment by linearly mixing these five reference RNA-Seq profiles: The mixing proportions used in the linear combinations summed to 1 and were simulated from a uniform (0,1) distribution. Two hundred in silico mixture samples obtained in this manner formed the “clean” (i.e. no noise) dataset. To obtain the “noisy” datasets, Gaussian noise was added at signal-to-noise ratios (SNR) ranging from a slightly noisy 10:1 to an extremely noisy 1:2 SNR. Two hundred samples were generated at each noise level. ssGSEA was then run on all mixture samples with the CD8+ T, T helper, macrophage, and NK CD56dim signatures from the Bindea et al. set. We observed that the Spearman correlations between the simulated and inferred mixing levels remained stable and above 0.6 for all four cell types (bootstrap p values < 0.05, see “Methods”) in a long SNR range from 9:1 to 4:1 (Fig. 2a). Given the low noise levels of RNA-Seq relative to microarrays, the actual SNR in an RNA-Seq experiment would likely not be lower than 4:1. Thus, the SNR analysis indicated that ssGSEA-based immune decomposition is robust to the potential technical and/or experimental sources of noise in the system.
The second in silico test involved the validation of the two aggregate scores: IIS and TIS. IIS was validated with leukocyte fractions computationally inferred from available TCGA DNA methylation data in 13 cancer types (see “Methods”). The fractions obtained using this orthogonal data type were highly concordant with the RNA-Seq based IIS. Out of 13 tumor types, 10 exhibited Spearman correlations greater than 0.6 and all 13 had highly significant p values (Fig. 2b, Additional file 1: Figure S6 left column). As expected, IIS levels were often strongly negatively correlated with tumor purity as inferred by ABSOLUTE [36] (Additional file 1: Figure S6 right column). The other aggregate score utilized in this study, TIS, was validated with T cell receptor (TCR) beta chain abundance data computationally inferred from RNA-Seq data in [37]. Out of the 19 tested cancer types, 17 had highly significant correlation values (brain cancers GBM and LGG did not), the majority of which were greater than 0.6 (Fig. 2c, Additional file 1: Figure S7).
We attempted to compare the immune cell scores from CIBERSORT [16] with our ssGSEA scores (see “Methods”) even though CIBERSORT has not yet been validated for RNA-Seq data. We observed that CIBERSORT returned zero for the majority of samples in multiple cell types, whereas ssGSEA by design returns approximately Gaussian values for any signature. This difference coupled with the differences in cell sorting strategies led to poor or moderate correlations for the majority of immune cell populations (Additional file 2: Table S9). In cases where CIBERSORT did not return zeroes and Bindea et al. were attempting to describe the same cells, we observed relatively stronger levels of concordance (CD8 T cells, T follicular helper cells, and Tregs; Pearson r = 0.725, 0.395, 0.353; p value = 6.9e-33, 1.2e-8, 4.6e-7 respectively) (Additional file 2: Table S9).
These independent validation results show that our in silico decomposition is a reliable method to infer immune infiltration levels in tumor samples.
The T cell infiltration spectrum across 19 human cancer types
The TIS and IIS of each sample in the 19 studied cancer types were computed as the sum of the individual scores from the relevant immune subpopulations. We observed that ccRCC and lung adenocarcinoma (LUAD) represented the highest end of the TIS and IIS spectrum (Fig. 3a for TIS and Additional file 1: Figure S8 for IIS). A pan-cancer view of the levels of individual T cell subpopulations that make up the TIS variable is presented in Additional file 1: Figure S9.
Missense mutations within tumor cells are a known source of neo-antigens that can initiate a T cell dependent immune response [38]. Previous studies have reported a significant correlation between “total number of mutations” and cytolytic activity index (CYT) in a pan-cancer context [13]. However, synonymous mutations do not give rise to neo-antigens, therefore the correlations between “number of missense mutations” and CYT are more relevant to study when investigating the immunogenicity of tumor types. We observed that, across 18 cancer types, only glioma and stomach adenocarcinoma had significant correlations between CYT and number of missense mutations after correction for multiple hypothesis testing (Additional file 1: Figure S10c). When only the 5th to the 95th percentile of the missense mutation counts was used as implemented in [13], the number of cancer types with significant CYT versus mutation count correlations increased to a modest four (Additional file 1: Figure S10d).
Consistent with CYT findings, we also observed a lack of consistent positive pan-cancer correlations between TIS levels in tumors and the corresponding numbers of somatic missense mutations (Fig. 3b, top left panel). On the contrary, there was a greater number of tumor types with significant negative correlations between these two variables; an observation which held true for CD8, central memory and effector memory T cells as well (Fig. 3b). One notable exception was colorectal adenocarcinoma (COADREAD) where the hypermutated subpopulation had elevated levels of TIS (r = 0.303, p value = 3.6 × 10–7, n = 271) [39] (Fig. 3a). It is not obvious whether the negative correlations arise due to a direct relationship between mutated neoepitopes and T cells or due to an unknown confounding variable. We speculate that immunoedited [40] tumors which have gone through equilibrium and escape can lead to divergence in the association between mutation burden and T cell infiltration. For instance, tumors which acquire the ability to suppress T cell activation may continue to accumulate mutations as immune infiltration decreases.
In contrast to CD8 and memory T cells, Th2 and Treg cell levels generally showed a positive correlation with mutation load (Fig. 3b). These correlations could be indicative of an immunosuppressive environment enriched in Treg and/or Th2 cells where tumors have escaped elimination by the immune system despite bearing a large number of potentially immunogenic mutations.
Immune infiltration is expected to increase the expression of APM genes in the tumor through paracrine signaling and mRNA generated by the infiltrating cells. Therefore, we investigated the correlation between the TIS and APM scores across the tested tumor types. As expected, the median TIS and the median APM score in the 19 cohorts showed a strong correlation (Spearman r = 0.611, p = 5.5 × 10–3), where ccRCC and LUAD were again among the highest with respect to the within-cohort TIS-APM correlation (Fig. 4a). Cancer types with low within-cohort correlations included GBM, LGG, ACC, and KICH. APM levels in these cancer types are indeed most strongly correlated with macrophages or subpopulations of dendritic cells (activated, immature, or total DCs) (Additional file 1: Figure S24).
Interestingly, a comparison of the APM expression between the tumor and normal tissue for kidney (clear cell, chromophobe, and papillary sub-histologies) and non-small cell lung tumors (adenocarcinoma and squamous cell) revealed that the tumor-normal difference was highly significant for ccRCC (q = 3.1 × 10–38, Mann–Whitney test) and papillary RCC (q = 2.7 × 10–13, Mann–Whitney test) but not significant for other tumor types (Fig. 4b). Notably, the tumor-normal difference for the APM score was the most pronounced in ccRCC compared with 14 other cancer types (Additional file 1: Figure S11) (no normal samples were available in the TCGA dataset for the other cancers). APM expression of ccRCC tumors did not show a positive association with either grade (Spearman r = –0.11, p = 0.02, n = 421) or stage (Spearman r = –0.14, p = 0.004, n = 422). Moreover, the grade-specific and stage-specific differences of APM expression levels were weak (p = 0.0704 and 0.0037, respectively, ANOVA) (Additional file 1: Figure S12). These results indicate that APM upregulation in ccRCC is likely an intrinsic ccRCC phenomenon and not dependent of tumor necrosis or other features associated with aggressive disease.
In a survey of the other immune cell types, we found that the unique features of ccRCC immune infiltration extends to high levels of CD8+ T cells, plasmacytoid DCs (pDC), T cells, cytotoxic cells, and neutrophils; and low levels of Th2 and Treg cells compared with the other 18 cancer types (Additional file 1: Figure S13).
Immune-infiltrate decomposition in ccRCC reveals three distinct patient clusters
In our effort to characterize the microenvironment of ccRCC tumors, we expanded our repertoire of 24 immune cell types to also include an angiogenesis signature [41] (Additional file 2: Table S1) and immunotherapeutic targets PD-1 (PDCD1), PD-L1 (CD274), and CTLA-4 (CTLA4). Angiogenesis is well established to be a characteristic component of immune inflammation [42] and ccRCC is known to have high angiogenic capacity due to constitutive activation of the hypoxia-inducible factor pathway [43]. We confirmed the high angiogenesis levels in ccRCC via a comparison against 18 other tumor types explored in this study (Additional file 1: Figure S13).
Using the ssGSEA scores from the expanded panel of 28 immune-related and inflammation-related gene signatures, we performed unsupervised clustering on the TCGA cohort of 415 patients (see “Methods”). Strikingly, this analysis revealed three distinct clusters that predominantly separated according to levels of T cell infiltration and APM gene expression, here termed the (1) T cell enriched (n = 65, 15.7%), (2) heterogeneously infiltrated (n = 257, 61.9%), and (3) non-infiltrated clusters (n = 93, 22.4%) (Fig. 5a). We observed that the T cell enriched tumors had markedly high expression of granzyme B (GZMB) and interferon-gamma (IFNG), effector molecules prominently associated with T cell response. Despite high levels of T cell infiltration and effector molecules, patients in the T cell enriched class had the poorest cancer-specific survival whereas the non-infiltrated group fared the best (p = 0.05; log-rank test) (Fig. 6a). Coupled with the observation that inhibitory checkpoint molecules PD-1 and CTLA-4 are also expressed at high levels in the T cell enriched class, this finding suggests that effector T cells in the tumor microenvironment may not be able to exert their pro-survival effects due to being offset by inhibitory cells/molecules and factors such as exhaustion and/or anergy.
An orthogonal measurement of tumor purity by the DNA-based ABSOLUTE algorithm [36] confirmed that the non-infiltrated group was the purest cluster (mean 0.640) and the T cell enriched group was the least pure cluster (mean 0.436) (p < 2 × 10–16, ANOVA). We then assessed the stromal content of samples using the RNA-based ESTIMATE algorithm [15] and investigated its association with the clusters. We found that the non-infiltrated cluster demonstrated the lowest stromal scores whereas the heterogeneous and T cell enriched clusters displayed mixed degrees of stromal content (p = 4 × 10–7, ANOVA).
In order to validate that the three immune infiltration clusters are not unique to the TCGA ccRCC cohort, we utilized a separate publicly available dataset of 101 ccRCC tumors for which comparable multi-platform data were available [29] and refer to it as the SATO dataset from here on. A random forest classifier was trained on the TCGA cohort using the ssGSEA scores of 28 immune-related variables. This classifier was used to predict the immune infiltration class for each SATO patient (see “Methods”). The heatmap of the same 28 immune features in the SATO dataset confirmed the existence of the three classes as well as the elevated expression levels of APM, granzyme B, and interferon-gamma in the T cell enriched cluster (Additional file 1: Figure S14a).
To further characterize the clusters’ unique molecular features, we next performed an unbiased analysis of differential gene and protein expression between the clusters. We excluded the signature genes and performed pathway analysis [44] for the genes significantly overexpressed in one of the clusters (q < 5 × 10–5, Mann–Whitney test). We observed that the T cell enriched group had significant overexpression of both adaptive and innate immunity genes (Fig. 5b and Additional file 2: Table S4A). On the other hand, the non-infiltrated group had significant overexpression of metabolism-related and mitochondria-related genes (Additional file 2: Table S4B), while the heterogeneously infiltrated group had overexpression of angiogenesis-related genes (Additional file 2: Table S4C) (q < 5 × 10–5, Mann–Whitney test). These findings were again validated in the SATO dataset (Additional file 1: Figure S14b, Additional file 2: Table S5A–C). We next utilized the TCGA reverse phase protein array (RPPA) dataset for the differential protein expression analysis. We consistently observed overexpression of immune-related proteins, such as Lck and Syk, for the T cell enriched group; and an overexpression of angiogenesis-related proteins, such as Smad1 [45, 46] and c-Kit [47–49], for the heterogeneously infiltrated group (q < 0.01, Mann–Whitney test) (Fig. 5c). A proteomic dataset for the SATO cohort was not available.
PCA on the ccRCC immune infiltration scores showed that the three clusters defined above cannot be explained by a one-dimensional infiltration gradient and most likely reflect distinct biology (Fig. 5d). Even though non-infiltrated and heterogeneously infiltrated tumors are not as well distinguished from each other as they are from the T cell enriched group, the evidence from differential gene and protein expression analyses indicate that these clusters are likely distinct as they have unique biology with respect to pathways such as those in angiogenesis and mitochondria/metabolism.
The T cell enriched cluster in the TCGA dataset exhibited two subclusters, here termed TCa (n = 39, 60%) and TCb (n = 26, 40%) (Additional file 1: Figure S15a), with different immune cell infiltration and gene expression profiles. Gene set enrichment analysis with DAVID [44] and ClueGO [50] revealed that the genes overexpressed in TCa (q < 5 × 10–5, Mann–Whitney test) were associated with metabolic and mitochondrial processes (Additional file 1: Figure S15b, Additional file 2: Table S5A). The genes overexpressed in TCb (q < 5 × 10–5, Mann–Whitney test) were enriched for processes related to cell cycle, extracellular matrix (ECM), and cellular proliferation (Additional file 1: Figure S15b, Additional file 2: Table S5B). We also found that these two subclusters had prognostic differences (Additional file 1: Figure S15c), with the TCb patients having worse cancer-specific survival than the TCa patients (p = 0.0162, log-rank test). Moreover, the TCb subcluster had significantly higher macrophage infiltration (p = 5.7 × 10–4) and stromal score (p = 4.6 × 10–4, Mann–Whitney test) with a moderate correlation between these two variables (Spearman r = 0.418, p = 5.8 × 10–4, n = 65). This correlation generalized to the entire TCGA ccRCC cohort (Spearman r = 0.561, p < 2 × 10–16, n = 415), suggesting the possibility of macrophage recruitment by stromal cells [51] (Additional file 1: Figure S16). These results confirm the biologically distinct characteristics of the TCa and TCb subclusters within the T cell enriched group.
T cell infiltration levels are associated with clinical outcomes
We found that tumor immune-infiltration in ccRCC was associated with distinct clinicopathologic features. Male patients (p = 0.018), higher stage (p = 0.006), and higher grade (p = 0.003) tumors were over-represented in the T cell enriched class compared to the non-infiltrated and heterogeneously-infiltrated groups (Fisher’s exact test). We next investigated the univariate significance of each T cell subset and angiogenesis as a predictor of cancer-specific survival. Cox proportional-hazards regression showed that, in both the TCGA (n = 415) and SATO (n = 101) datasets, the levels of Th17 cells and angiogenesis were strongly associated with favorable outcomes, whereas Th2 and Treg cells were associated with adverse outcomes (Fig. 6b) consistent with previous reports [18, 41, 52–55]. To optimize prognostic discrimination, we explored Th17 ratios with other immune subtypes and identified the Th17/Th2 ratio as the most predictive in both the TCGA and SATO cohorts (Fig. 6b, c). Moreover, we observed that CD8+ T cell levels alone were not significantly associated with improved survival in the TCGA cohort, but the frequently used CD8+ T/Treg ratio was (Fig. 6b, c).
Additional analyses demonstrated that previously identified prognostic features such as tumor stage and molecular ccRCC subtype (ccA/ccB) [56] were associated with similarly prognostic immune infiltration scores. In particular, Treg and Th17 infiltration levels had negative and positive association, respectively, with tumor stage (q = 6.1 × 10–8 for both, ANOVA) (Additional file 1: Figure S17). Treg and Th2 infiltration levels were higher in ccB (n = 175) subtype tumors (q = 3.9 × 10–9 and 1.2 × 10–8, Mann–Whitney test) compared with ccA (n = 205), which was previously shown to have better prognosis relative to ccB [56] (Additional file 1: Figure S18). In contrast, Th17 and CD8+ T cell infiltration levels were higher in ccA tumors (q = 2.8 × 10–12 and 5.8 × 10–6, Mann–Whitney test).
Association of immune infiltration patterns with intratumor heterogeneity and subclonality
We next investigated whether the immune infiltration classes predicted by our mRNA-based decomposition algorithm were robust to intratumoral heterogeneity. We obtained a microarray gene expression dataset from the Gerlinger et al. [57] ccRCC multiregion tumor study (referred to as GERLINGER from here on). This dataset includes 56 tumor and six normal samples from nine ccRCC patients. The authors sampled several tumor regions from each patient to investigate intratumor heterogeneity. We computed the ssGSEA-based immune cell infiltration scores and also the aggregate TIS for these samples, and applied the TCGA-based random forest classifier to predict the immune infiltration class for each sample (Fig. 7a). Interestingly, tumors with high T cell infiltration levels (RK26, RMH002) had highly similar immune infiltration profiles in most sampled regions; and all regions were predicted to be in the T cell enriched category. In contrast, tumors with relatively lower levels of T cells showed immune intratumor heterogeneity and had regions predicted to be in multiple different immune infiltration categories. For instance, regions in tumors RMH008 and EV007 were found to contain members in all three immune infiltration classes (T cell enriched, heterogeneously infiltrated, or non-infiltrated).
T cell receptor β-chain (TCRb) read counts from ultra-deep TCR sequencing and total T cell counts from immunohistochemistry (IHC) were also available for a subset of the GERLINGER microarray samples (n = 6) [58] (Additional file 2: Table S7). These two types of T cell abundance estimates have previously been shown to have a statistically significant correlation across 14 samples [58], despite RMH002-R6 being a strong outlier in terms of IHC-based T cell counts. We observed that the significance of the correlation was lost when the analysis was restricted to the six samples that also had microarray data (Fig. 7b, left panel) regardless of whether RMH002-R6 was included in the correlation computation (p = 0.15 and 0.089 with and without RMH002-R6, respectively). However, the ssGSEA-based TIS had at least borderline significant correlation with both of these variables despite the small number of samples (p = 0.047 for the correlation with IHC-based T cell counts and 0.057 for the correlation with total TCRb read counts) (Fig. 7b, middle and right panels). Moreover, the scatter plots with TIS interestingly showed that the IHC-based T cell count for RMH002-R6 was not an outlier (Fig. 7b, middle panel), but the total TCRb read count for the same sample was (Fig. 7b, right panel) (p value increases from 0.057 to 0.021 for the correlation between TIS and total TCRb read count when RMH002-R6 is removed). This finding suggested that the discordance of the T cell abundance estimates for this sample may not be due to an over-representation of T cells in the FFPE section as speculated in [58], but may be due to an underperformance of the steps involved in ultra-deep sequencing of TCRb reads from bulk tumor DNA. Yet, this is not certain as spatial heterogeneity of T cell infiltrates and random sampling effects confound all such comparisons.
A recent study on NSCLC reported an inverse relationship between T cell infiltration and subclonal architecture [59]. We performed clonality assessment on the TCGA ccRCC cohort using SciClone [60] (see “Clonality assessment” in “Methods”); and consistent with the NSCLC study, found that more clonal tumors (i.e. tumors with fewer subclones) had higher levels of CD8+ T cells, cytotoxic cells, APM, and TIS (Fig. 7c). Clonality for the SATO ccRCC cohort was also assessed using SciClone (see “Methods”) and the trends for the inverse association between immune infiltration and subclonal architecture were recapitulated in this dataset although p values did not reach significance and the trends were rather modest (Additional file 1: Figure S19). Both the TCGA and SATO results held true even when the immune scores were adjusted for purity (Additional file 1: Figure S20).
Baseline elevation in TIS and APM in ccRCC patients responding to nivolumab
Given the relationships we have identified between distinct immune cell subsets, APM, and clinical status, we next used RNA-Seq to ask whether there is a relationship between the baseline immune landscape and response to immunotherapy. Nivolumab (anti-PD-1) is FDA-approved for the treatment of advanced RCC, so we investigated the pretreatment immune profile of patients treated with this agent using a hypothesis-generating set of six patients. We found that both TIS and APM were elevated in responding patients (those with a partial or complete response to nivolumab) whereas they were in the lowest quartile for patients with progressive disease on nivolumab (Fig. 8). A similar pattern was observed when examining the relative expression of T cell effector genes IFNG and GZMB. This correlation should be substantiated in a larger cohort to determine if it has predictive power in determining response to PD-1 blockade.
Association of immune infiltration with genomic alterations and neo-antigens
In light of our evidence suggesting the presence of immunologically distinct subsets of ccRCC tumors, we investigated mutation load and recurrent genomic alterations as potential drivers of the observed T cell infiltration. The tumors from the non-infiltrated class harbored slightly more somatic missense mutations than the T cell enriched class (the median number of somatic missense mutations in the non-infiltrated group was 36.5 versus 33 in the T cell enriched group; q = 0.07, ANOVA). Out of the 11 driver genes commonly mutated in ccRCC, only PBRM1 was mutated at significantly different rates between the three populations (Additional file 1: Figure S21a; higher in non-enriched versus T cell enriched q = 0.04; higher in heterogeneous versus T cell enriched q = 0.04; Fisher’s exact). However, this observation was not validated in the SATO dataset. None of the common arm-level CNVs observed in ccRCC tumors were found at different rates between the three groups (Additional file 1: Figure S21b).
Cancer neo-antigens have been demonstrated to drive T cell infiltration of tumors in murine models of cancer [38, 61]. We hypothesized that the abundance or quality of cancer neo-antigens might differ between our tumor classes. To address this theory, we determined the HLA-A, HLA-B, and HLA-C alleles of each ccRCC TCGA patient using OptiType [62]. We then predicted the protein alterations expected to result from missense mutations in each tumor and identified those predicted to bind to MHC-I molecules (see “Methods”). We found no significant difference in the median MHC-I binding count (Additional file 1: Figure S22a) or median binding affinity (Additional file 1: Figure S22b) of neo-antigens between the three classes of TCGA tumors. We also found no significant difference in the fraction of tumors with non-silent somatic mutations in an expanded set of APM genes (Additional file 2: Table S8A-C). These results suggest that factors other than genomic alterations may be contributing to the immune infiltration of ccRCC tumors.
ImmunExplorer web application
We have created a publicly available web application (http://kidneyimmune.chenghsiehlab.org/) that allows users to interactively visualize and perform integrated analysis of immune cell type levels, RNA-Seq, and clinical outcomes from the TCGA and Sato ccRCC datasets.