Fig. 4 | Genome Biology

Fig. 4

From: An analysis of proteogenomics and how and when transcriptome-informed reduction of protein databases can enhance eukaryotic proteomics

Transcriptome-informed reduced databases yield less ambiguous protein identifications. A Number of valid identifications obtained from the full (red) or reduced (blue) target-only database searches, followed by the Benjamini-Hochberg (BH) procedure for 1% FDR control. The numbers of valid spectrum, peptide, and protein identifications are reported. Protein groups, as defined by the Proline software, represent protein identifications and include (i) proteins unambiguously identified by specific peptides only (single-protein protein groups) and (ii) groups of proteins identified by the same set of shared peptides (multi-protein protein groups). B Percentage of single-protein protein groups. C Bipartite graph representation of the peptide-to-protein mapping and use of the graph's connected components (CCs) to visualize and quantify the ambiguity of protein identifications. Unambiguous protein identifications are represented by CCs with a single protein vertex (single-protein CCs), while proteins sharing peptides are grouped in the same CC (multi-protein CCs). D Upper panel: total number of CCs. Lower panel: percentage of specific peptides and of single-protein CCs. E Genes encoding proteins from the full and reduced database searches. Upper panel: total number of genes associated with protein matches in the two searches. Lower panel: ratio between the number of protein members in each multi-protein CC and the number of genes encoding them.
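
The connected-component view in panels C and D can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `protein_components`, the toy peptide and protein identifiers, and the choice of a union-find structure are all assumptions made for illustration. It builds the peptide-to-protein bipartite graph, groups proteins into CCs, and computes the two percentages named in panel D.

```python
from collections import defaultdict


def protein_components(peptide_to_proteins):
    """Return the protein members of each connected component (CC) of the
    peptide-protein bipartite graph, using a simple union-find.
    Hypothetical helper for illustration only."""
    parent = {}

    def find(node):
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Each peptide-protein match is an edge; tag vertices by type so a
    # peptide and a protein with the same label never collide.
    for peptide, proteins in peptide_to_proteins.items():
        for protein in proteins:
            union(("pep", peptide), ("prot", protein))

    members = defaultdict(set)
    for kind, name in list(parent):
        if kind == "prot":
            members[find((kind, name))].add(name)
    return sorted(sorted(group) for group in members.values())


# Toy mapping (made-up identifiers): PEPB is shared between P1 and P2.
toy_mapping = {
    "PEPA": ["P1"],
    "PEPB": ["P1", "P2"],
    "PEPC": ["P3"],
    "PEPD": ["P2"],
}

components = protein_components(toy_mapping)

# Single-protein CCs correspond to unambiguous protein identifications.
single_protein_ccs = [cc for cc in components if len(cc) == 1]
pct_single_protein_ccs = 100.0 * len(single_protein_ccs) / len(components)

# A peptide is "specific" when it maps to exactly one protein.
specific_peptides = [p for p, prots in toy_mapping.items() if len(set(prots)) == 1]
pct_specific_peptides = 100.0 * len(specific_peptides) / len(toy_mapping)
```

In this toy example, P1 and P2 fall into one multi-protein CC because they share PEPB, while P3 forms a single-protein CC. A reduced database that drops unexpressed proteins removes some protein vertices, which can split multi-protein CCs and raise both percentages.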
