ExportPred: Training sets and validation. (a) Boxplots of scores of two positive sequence sets and five negative sequence sets. The chosen score threshold of 4.3 is marked. Both positive sets are well separated from all negative sets. Poorly scoring outliers in the positive sets can largely be ascribed to incorrect gene models and to Rif and Stevor pseudogenes. (b) Two-dimensional plot of P. falciparum proteins decomposed by the scores of the ExportPred states for the PEXEL motif and for the signal sequence. Small black dots indicate proteins with full model scores < 4.3, and blue dots indicate proteins with scores ≥ 4.3. The three positive and four negative GFP fusions described are marked with green and red dots, respectively, and the nine yellow dots are, from left to right, RESA, HRPIII, KAHRP, PFA0475w (Rifin), R45, MESA, PfEMP3, PFC0025c (Stevor), and GBP130. (c) Experimental verification of a number of ExportPred predictions above (green) and below (red) the chosen threshold. GFP fusions to three positive predictions (PFI1780w, PFE0055c, PFI1755c) are exported successfully into the red blood cell cytosol. GFP fusions to three negative predictions (PFE0360c, PF14_0607, PFE0355w) accumulate in the parasitophorous vacuole, indicating a functional signal sequence but no functional export motif. One GFP fusion (PF10_0321) appears to be targeted to the mitochondrion. ExportPred scores are indicated in parentheses.