Fig. 3 | Genome Biology

Fig. 3

From: BamQuery: a proteogenomic tool to explore the immunopeptidome and prioritize actionable tumor antigens

New insights into the immunopeptidome biology. a–h Published MAPs reported as canonical (n = 1702) and non-canonical (ncRNA (n = 378), intronic (n = 114), and EREs (n = 232)) were searched with BamQuery in GTEx tissues and mTEC BAM files in unstranded mode (GTEx data being unstranded) with genome version GRCh38.p13, gene set annotations release v38_104, and dbSNP release 151. Panels a, e, f, g were generated by comparing 9-mers only (n = 1211 canonical, n = 207 ncRNA, n = 68 intronic, n = 157 EREs) to prevent possible biases introduced by variable frequencies of 8/10/11-mers among the compared groups. Panels b, c, h were generated with the complete MAP dataset (n = 1702 canonical, n = 378 ncRNA, n = 114 intronic, n = 232 EREs). The Mann–Whitney U test was used for the indicated comparisons (*p < 0.05, **p < 0.01, ***p < 0.001, ****p < 0.0001). a Number of possible MCS after reverse translation of the indicated MAP groups. b Average frequency (%) of amino acids encoded by the indicated number of synonymous codons in the indicated MAP groups. c Heat map of amino acid frequency in the indicated MAP groups. d Mean of the MCS average usage frequency of codons (per 1000 codons located in human reference protein-coding sequences) encoding each of the 20 amino acids of the indicated MAP groups. Codon frequencies were obtained from the codon usage database (http://www.kazusa.or.jp/codon/). e Number of MCS genomic locations able to code for the indicated MAP groups. f Pearson’s correlation between the number of possible MCS after reverse translation and the number of MCS genomic locations able to code for the assessed ERE MAPs. The red line is a linear regression. g Percentage of MAPs attributed to the indicated biotypes by BamQuery based on the best guess (left) or EM-established (right) biotype ranks, and the genomic regions expressed in GTEx tissues and mTECs. The X-axis indicates the biotype reported in the original study (groups). For clarity, BamQuery biotypes were summarized into five general categories: protein-coding regions, non-coding RNAs, EREs, intronic, and intergenic. h Percentage of the most likely biotype attributed by BamQuery to ERE MAPs.
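
The quantity plotted in panel a, the number of possible MCS after reverse translation, can be illustrated with a short calculation. The sketch below is not BamQuery's code. It assumes the standard genetic code and simply multiplies the number of synonymous codons of each residue; the function name `count_reverse_translations`, the degeneracy table, and the example peptide are illustrative choices that do not come from the figure.

```python
from math import prod

# Number of synonymous codons per amino acid in the standard genetic code
# (assumption: 61 sense codons, stop codons excluded).
CODON_DEGENERACY = {
    "L": 6, "S": 6, "R": 6,
    "A": 4, "G": 4, "P": 4, "T": 4, "V": 4,
    "I": 3,
    "C": 2, "D": 2, "E": 2, "F": 2, "H": 2,
    "K": 2, "N": 2, "Q": 2, "Y": 2,
    "M": 1, "W": 1,
}


def count_reverse_translations(peptide: str) -> int:
    """Return how many distinct nucleotide sequences encode `peptide`.

    Hypothetical helper: the count is the product of the per-residue
    codon degeneracies. Unknown residue letters raise KeyError.
    """
    return prod(CODON_DEGENERACY[aa] for aa in peptide.upper())


if __name__ == "__main__":
    # Illustrative peptide only; it is not taken from the figure's dataset.
    example = "SIINFEKL"
    print(example, count_reverse_translations(example))
```

Under this model, peptides rich in leucine, serine, or arginine (six codons each) have far more candidate coding sequences than peptides rich in methionine or tryptophan (one codon each), which is why the amino acid composition shown in panels b and c bears on the counts in panel a.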
