To deepen our insight into nuclear biology in P. falciparum, we performed a comprehensive proteomic analysis of the parasite nucleus and detected a total of 1,273 proteins in sequential extracts of crude nuclei. The proportion of predicted nuclear proteins in this set was estimated at 46%, which compares well with two similar studies in S. cerevisiae. Mosley et al. detected 2,674 yeast proteins in crude nuclei, 46% of which are annotated as nuclear proteins in the Saccharomyces Genome Database (SGD), and a second study based on sucrose gradient purification of nuclei detected 1,889 proteins, 55% of which were annotated as nuclear in SGD.
The major sources of contamination in nuclear preparations are membrane fractions of non-nuclear organelles, particularly the ER. To eliminate such likely contaminants, we applied an informed bioinformatic filtering approach that resulted in a markedly improved positive predictive value (76%) for true nuclear proteins in the core nuclear proteome. Importantly, we confirmed the specificity of this set by in vivo experimental validation: of 22 candidates, 18 localized to the nucleus in various patterns. Some proteins displayed a rather ubiquitous distribution within the nucleus, whereas others localized to more restricted, undefined subnuclear regions, to the nucleolus, or to the nuclear periphery. These alternative localizations are a likely consequence of the different functions these factors carry out in the nucleus. Four candidates localized primarily to the cytoplasm; however, some of them may also be nuclear. First, we determined localizations using episomally expressed epitope-tagged proteins, which may cause over-expression and cytosolic accumulation of some candidates. Second, proteins primarily located in the cytosol may still be present in the nucleus at lower concentrations. For instance, PfNAPL (NuProC21), which localized to the cytoplasm here and in a previous study, is a putative orthologue of ScNAP1, which shuttles between the nucleus and cytosol. Another example is PFL0450c (NuProC22), a protein of unknown function that localized to the cytoplasm in ring and trophozoite stages but was found at the nuclear periphery in late schizonts.
Functional classification of the core nuclear proteome
We detected 299 proteins (37.3%) implicated in nuclear processes such as transcription, chromatin remodeling, DNA replication/repair, RNA binding and processing, and ribosome biogenesis, but also in more general processes such as protein folding and modification, protein degradation, translation, cytoskeleton organization, and metabolism. This classification is based on direct experimental evidence, literature review, and/or sequence similarity to known S. cerevisiae nuclear proteins (Additional file 10). A further 118 proteins (14.7%), for which such evidence is missing, carry annotations that are likewise consistent with roles in DNA/chromatin interactions, cell cycle control, mitosis, RNA binding and processing, protein folding and modification, protein degradation, cytoskeleton organization, and metabolic processes. Ninety-five proteins (11.8%) represent ribosomal subunits or translation-associated factors, and 35 proteins (4.4%) are predicted to localize to other compartments such as the mitochondrion, ER/Golgi, protein sorting vesicles, or the vacuole. Finally, the remaining 255 proteins (31.8%) of the core nuclear proteome are of unknown function.
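As a quick consistency check on the functional classification, the reported class sizes can be tallied and converted back into percentages. The sketch below uses only the counts quoted in the text; the short class labels and the helper name `class_percentages` are illustrative choices, not terms used by the authors.

```python
# Counts per functional class as reported in the text.
# The dictionary keys are illustrative labels, not the authors' terminology.
CLASS_COUNTS = {
    "nuclear_with_evidence": 299,
    "nuclear_by_annotation": 118,
    "ribosomal_or_translation": 95,
    "other_compartment": 35,
    "unknown_function": 255,
}


def class_percentages(counts):
    """Return (total, {class: percentage rounded to one decimal place})."""
    total = sum(counts.values())
    percentages = {name: round(100 * n / total, 1) for name, n in counts.items()}
    return total, percentages


total, percentages = class_percentages(CLASS_COUNTS)
print("core nuclear proteome size:", total)
for name, pct in percentages.items():
    print(f"{name}: {pct}%")
```

The five classes add up to the 802 proteins of the core nuclear proteome, and the recomputed percentages agree with the values given in the text.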
To date, only three proteins have been localized to the parasite nucleolus (RNA pol I, PFE0465c; fibrillarin/PfNOP1, Pf14_0068; PfNOP5, PF10_0085) [83, 84]. Further, only six P. falciparum proteins are annotated with the GO term 'nucleolus', and 36 are annotated by GeneDB as putative nucleolar proteins. Here, we detected 12 of these proteins, including fibrillarin, PfNOP5, RNA methyltransferase (PF11_0305), putative ribonucleoprotein (RNP) components such as LSM homologs (PFL0460w, PF08_0049), and several predicted pre-ribosomal assembly proteins, as well as another 19 putative homologs of S. cerevisiae nucleolar proteins (Additional file 10). Notably, two of these candidates, NuProC2 (PF10_0278) and NuProC3 (PF11_0250), co-localized with fibrillarin to the parasite nucleolus. These novel identifications expand our current knowledge of the P. falciparum nucleolus and provide a basis for detailed analyses of this essential nuclear compartment.
Plasmodium protein kinases (PK) play central roles in growth, development, and differentiation throughout the life cycle, and are intensely studied as a class of promising antimalarial targets [86, 87]. We identified kinases or their accessory factors from seven PK systems, six of which have been implicated in nuclear roles in Plasmodium or other eukaryotes: PfMAT1 (PFE0610c); the α- and β-subunits of casein kinase 2 (PF11_0096, PF11_0048) [89–91]; casein kinase 1 (PF11_0377) [92, 93]; the catalytic and regulatory subunits of cAMP-dependent protein kinase (PFI1685w, PFL1110c); NIMA-related protein kinase PfNEK-1 (PFL1370w) [95, 96]; and mitogen-activated protein kinase 2 PfMAP2 (PF11_0147) [97, 98]. PfMAP2 is a substrate of PfNEK-1, and both kinases are essential for completion of the IDC in P. falciparum [95, 99], whereas in P. berghei PbMAP2 is implicated specifically in the development of male gametes. Interestingly, while the targets of these and other nuclear kinases remain largely unknown, we find that 70% of proteins in the core nuclear proteome (562/802) were recently shown to be phosphorylated (Additional file 10), suggesting an important role for kinase signalling in the regulation of nuclear processes in P. falciparum.
We detected 32 of the 33 annotated P. falciparum proteasome components, and all of these, except two 19S regulatory particle (RP) subunits, were represented in the cytoplasmic fraction. Intriguingly, all subunits of the RP were also associated with the insoluble nuclear fraction in ring stage parasites (Additional file 10). In other species, the RP interacts with chromatin and is involved in transcriptional regulation. For example, the S. cerevisiae RP participates in SAGA histone acetyltransferase complex recruitment, interacts with the FACT (facilitates chromatin transcription) complex, and influences H3 methylation and gene silencing. Hence, we speculate that the 19S RP may have similar non-proteolytic roles in P. falciparum transcriptional regulation.
The nuclear proteome also contained several orthologues of the mRNP complex implicated in translational repression during P. berghei gametocytogenesis [105, 106], such as the RNA helicase PfDOZI (PFC0915w), PfCITH (PF14_0717), the RNA-binding proteins PfHOBO (PF14_0096) and PfHOMU (PFI0820c), and a homolog of a yeast poly(A)-binding protein (PFL1170w). Although DOZI and CITH were described as cytoplasmic proteins in P. berghei, their localizations do not appear to exclude the nucleus. Furthermore, homologs of PfHOMU and PFL1170w shuttle between the cytosol and nucleus in mammalian cells and S. cerevisiae, respectively [107, 108]. The detection of these proteins in the asexual nuclear proteome suggests the presence of translational repression machinery during the IDC, although P. berghei DOZI and CITH loss-of-function mutants have no apparent phenotype in asexual parasites [105, 106].
We also discovered novel domains that are likely linked to previously unrecognized nuclear functions of parasite proteins, and we expect their roles to be characterized more thoroughly in future studies. Of particular interest is the ACDC domain, which we identified exclusively in members of the ApiAP2 family. Moreover, we experimentally validated PF14_0442 as a novel subunit of P. falciparum nuclear pores. Testing this in more detail will be an important future task given that Plasmodium parasites lack identifiable orthologues of most nuclear pore components.
Stage- and fraction-specific aspects of the core nuclear proteome
We made several interesting observations regarding stage- and fraction-specific protein profiles that allow us to speculate about potential temporal and spatial expression patterns of nuclear proteins. In light of the stochastic undersampling of complex proteomes in shotgun proteomics approaches and possible sample-to-sample variation, however, these data have to be interpreted with caution.
A total of 58, 90, and 105 proteins were found exclusively in ring, trophozoite, and schizont stages, respectively. In trophozoites and schizonts, expression of these proteins occurred roughly in line with transcription of the encoding genes (Additional files 1 and 23). mRNA expression of ring stage-specific nuclear proteins was somewhat surprising, with a collective profile similar to that observed in schizonts and an additional smaller peak at 15 to 20 hpi. This suggests that some proteins were newly synthesized in the ring stage while others remained from the preceding schizont stage. A total of 159 of the 253 stage-specifically detected proteins carry no functional annotation, and several others carry annotated domains that indicate little about their function. Hence, a large number of novel proteins have been assigned a potential stage-specific role in nuclear biology where previously no informative annotation was available. Further, 13 predicted TFs were each detected in a single IDC stage only, indicating a role for these factors as cell cycle-specific regulators of transcription and genome regulation (Additional file 24).
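The notion of a protein "found exclusively" in one stage can be expressed as a set difference: the proteins detected in that stage minus everything detected in any other stage. The following is a minimal sketch under that assumption; the function name `stage_exclusive` and the toy identifiers are hypothetical and do not correspond to data from the study. The last lines recompute the totals quoted in the text.

```python
def stage_exclusive(detected):
    """Map each stage to the proteins detected in that stage and in no other."""
    exclusive = {}
    for stage, proteins in detected.items():
        # Union of everything seen in the remaining stages.
        others = set().union(*(p for s, p in detected.items() if s != stage))
        exclusive[stage] = set(proteins) - others
    return exclusive


# Toy input with made-up identifiers (not real gene IDs or study data).
toy = {
    "ring": {"protA", "protB", "protC"},
    "trophozoite": {"protB", "protD"},
    "schizont": {"protC", "protE", "protF"},
}
result = stage_exclusive(toy)
print(result)

# Counts reported in the text for ring, trophozoite, and schizont stages.
reported = {"ring": 58, "trophozoite": 90, "schizont": 105}
reported_total = sum(reported.values())
# Share of stage-specifically detected proteins without functional annotation.
unannotated_share = round(100 * 159 / reported_total, 1)
print(reported_total, unannotated_share)
```

The reported stage-exclusive counts sum to the 253 proteins mentioned in the text, of which the 159 unannotated proteins make up a little under two thirds. Because shotgun proteomics undersamples complex mixtures, absence from a stage in such a comparison is weaker evidence than presence.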
Of 145 proteins predicted to interact with DNA and/or chromatin, 137 (94.5%) were identified in the chromatin-containing fractions (DNAseI-, high salt-, and/or SDS-soluble fractions) (Additional file 10). Some of these factors, such as histones, SNF2 helicase (PFF1185w), chromodomain-helicase-DNA-binding protein 1 (CHD1) (PF10_0232), putative chromosome assembly factor 1 (PFE0090w), and the nuclear peroxiredoxin PfnPRX (PF10_0268), were detected in all three chromatin-associated fractions. In contrast, nine of the 10 RNA pol II subunits identified were detected exclusively in the high salt nuclear extract. Both FACT components (PFE0870w, PF14_0393) showed nearly identical profiles and were recovered only in the high salt and SDS extracts. The two recently described high mobility group box proteins (PFL0145c, MAL8P1.72) were identified in DNAseI- and high salt-soluble extracts but not in the SDS fraction. These examples suggest that at least some regulatory complexes were extracted as interacting entities. Interestingly, all ApiAP2 factors showed a noticeable association with the insoluble nuclear fraction. While the reason for this remains unknown, our observation hints at possible functions of these DNA-binding proteins. Most ApiAP2 proteins are large and, apart from the short and well-defined DNA-binding AP2 domains, consist of extensive uncharacterized regions. These non-AP2 regions may mediate the formation of regulatory complexes involved in diverse processes such as DNA replication, transcriptional regulation, or functional organization of the genome. Such complexes are often resistant to extraction with DNAseI and high salt buffers, as observed for many RNA- or DNA-binding proteins associated with the nuclear matrix. In the case of PfSIP2, the only P. falciparum ApiAP2 characterized in vivo, the detection of PfSIP2-derived peptides in the insoluble nuclear fraction is consistent with the association of this factor with condensed heterochromatin.
A large number of proteins in the core nuclear proteome (379) were also detected in the cytoplasmic fraction. This pool of proteins contained members of all functional classes but was clearly enriched in distinct pathways associated with the various functions of the nucleolus in other eukaryotes [113–115]. These include the majority of ribosomal subunits (90.5%), RNA-binding proteins (78.9%), factors involved in protein degradation (69.0%), and factors involved in protein folding and modification (65.1%). Furthermore, 84.4% of translation-related factors were identified in both the cytoplasmic and nuclear fractions, a finding consistent with the existence of nuclear translation [116, 117]. We also noticed that 42% (62 proteins) of confirmed or likely nuclear proteins in the core nuclear proteome were detected in the cytoplasmic fraction. Moreover, six proteins experimentally localized to the nucleus in this study had peptides detected in the cytoplasmic fraction. Hence, while some of the dually detected proteins may represent cytoplasmic contaminants such as abundant cytosolic proteins or macromolecular complexes, our results suggest that many nuclear proteins in P. falciparum shuttle between the nucleus and cytosol and/or perform their tasks at multiple destinations.
Nuclear import in P. falciparum
Transport of proteins into the nucleus remains poorly understood in apicomplexan parasites. In yeast and mammalian model systems, short arginine- and lysine-rich cNLSs are thought to be the major mediators of nuclear import, though alternative and redundant mechanisms have been described. The poor enrichment of cNLSs in both the core nuclear proteome and the set of 317 curated nuclear proteins shows that bioinformatic discrimination of nuclear vs. non-nuclear P. falciparum proteins via prediction of cNLSs remains impractical. Notably, however, cNLS predictors are also unreliable at identifying nuclear proteins in the model systems they were designed for. Nevertheless, P. falciparum nuclear proteins appear to differ widely from current computational models of cNLSs, although the nature of these differences remains unclear. It has not been determined whether the majority of P. falciparum nuclear proteins are imported independently of importin α, the protein that binds cNLSs, or alternatively by similar but unrecognized cNLSs that do mediate importin α-dependent translocation. The fact that only 22% to 51% of nuclear proteins are predicted to contain cNLSs (depending on the predictor used) suggests that the cNLS, as currently defined, is not the major mode of nuclear import in P. falciparum. The insight that most verifiable Plasmodium nuclear proteins lack a recognizable nuclear localization sequence, and thus need to be identified through empirical strategies, reinforces the value of an experimentally robust nuclear proteome for understanding Plasmodium nuclear biology.
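To illustrate what sequence-based cNLS detection involves, the sketch below scans protein sequences for the textbook monopartite consensus K-(K/R)-X-(K/R) and reports the fraction of sequences containing at least one match. This is a deliberately simple stand-in and an assumption of this example; it is not the algorithm of any predictor used in the study, which apply more elaborate scoring. The function names and the toy sequences are hypothetical, apart from the well-known SV40 large T antigen signal PKKKRKV embedded in the first one.

```python
import re

# Textbook monopartite cNLS consensus K-(K/R)-X-(K/R). This pattern is an
# assumption of the sketch, not the scoring scheme of a published predictor.
MONOPARTITE_CNLS = re.compile(r"K[KR].[KR]")


def has_candidate_cnls(sequence):
    """Return True if the sequence contains at least one consensus match."""
    return MONOPARTITE_CNLS.search(sequence.upper()) is not None


def fraction_with_cnls(sequences):
    """Fraction of sequences with a candidate cNLS (0.0 for an empty list)."""
    if not sequences:
        return 0.0
    hits = sum(1 for seq in sequences if has_candidate_cnls(seq))
    return hits / len(sequences)


# Toy sequences: the first embeds the SV40 large T antigen NLS (PKKKRKV),
# the second is an arbitrary made-up sequence without basic clusters.
toy_sequences = ["MAPKKKRKVEDG", "MSGLAEVTGSAAD"]
toy_fraction = fraction_with_cnls(toy_sequences)
print(toy_fraction)
```

A short pattern of this kind matches any sufficiently basic stretch, nuclear or not, and misses signals that deviate from the consensus, which is one way to see why motif presence alone discriminates poorly between nuclear and non-nuclear proteins.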
Lineage-specific nuclear proteins
A combination of OrthoMCL and synteny analyses (Additional files 1 and 25) indicated that only around 10 proteins are genuinely P. falciparum-specific. Unsurprisingly, nearly all of them have no functional annotation, though one is the gametocyte-specific protein Pfg27 (PF13_0011). Pfg27 binds RNA and has previously been localized to the cytoplasm and nucleus of gametocytes. Interestingly, we detected another nuclear P. falciparum-specific protein of unknown function (PFB0115w) that is expressed during the IDC and in gametocytes and contains C-terminal homology to the Pfg27-specific fold, suggesting a functional relation between these two proteins.
The pool of around 100 genus-specific proteins represents a promising group from which to characterize features of nuclear biology peculiar to Plasmodium. Several merit prioritized study; these include some of the previously characterized ApiAP2s, as well as a large number of uncharacterized proteins with nucleic acid-binding and chromatin-interacting domains. Several others in this group have kinase or phosphatase domains and may be involved in cascades that regulate gene expression. One of the Plasmodium-specific nuclear proteins is a putative metacaspase (PF14_0363). Interestingly, unrelated trypanosomatid parasites also possess nuclear-localized metacaspases that are required for proliferation, possibly through modulation of cell cycle progression.