Changes in protein expression during honey bee larval development

Background The honey bee (Apis mellifera), besides its role in pollination and honey production, serves as a model for studying the biochemistry of development, metabolism, and immunity in a social organism. Here we use mass spectrometry-based quantitative proteomics to quantify nearly 800 proteins during the 5- to 6-day larval developmental stage, tracking their expression profiles. Results We report that honey bee larval growth is marked by an age-correlated increase of protein transporters and receptors, as well as protein nutrient stores, while opposite trends in protein translation activity and turnover were observed. Levels of the immunity factors prophenoloxidase and apismin are positively correlated with development, while others surprisingly were not significantly age-regulated, suggesting a molecular explanation for why bees are susceptible to major age-associated bee bacterial infections such as American Foulbrood or fungal diseases such as chalkbrood. Previously unreported findings include the reduction of antioxidant and G proteins in aging larvae. Conclusion These data have allowed us to integrate disparate findings in previous studies to build a model of metabolism and maturity of the immune system during larval development. This publicly accessible resource for protein expression trends will help generate new hypotheses in the increasingly important field of honey bee research.


Background
Honey bees (Apis mellifera) have been a subject of scientific research for more than 2,300 years [1], yet it is only in the past two decades that bee research has expanded beyond behavioral or social traits to a molecular level. With the publication of the honey bee genome in 2006 [2], the basic information to enable proteome-level analyses of this organism is now available. Since then, various groups have published proteomic analyses of whole bees or individual organs/tissues [3][4][5][6] but these studies have focused on adult animals. Larval development in honey bees is largely unexplored, despite its significance in caste determination [7] and in the pathogenesis of certain economically significant honey bee diseases, such as American and European Foulbrood.
The larval development of the honey bee, which follows a 3day period as an egg, is 5-6 days in duration and precedes the pupal (metamorphosis) and adult stages. Apart from an astounding increase in size, larval growth is relatively unremarkable at the macroscopic level [8]. However, female bees differentiate into workers or queens (caste differentiation) in response to diet very early in larval development and the acquisition of immunity to certain diseases during this 5-to 6-day period suggests complex molecular biological changes are taking place.
Insect development has been studied mainly using the fruit fly as the model system. Drosophila embryogenesis has historically attracted far more attention than any other growth stage, due to its value for studying the mechanism of spatial regulation of transcription and translation. With the exception of the economically important silkworm Bombyx mori, research on larval development has been slow. For honey bees, the lack of published works is evident: the article entitled 'Morphology of the Honeybee Larva' published by Nelson in 1924 [8] still remains today as one of the most cited resources on this subject. Here we have used mass spectrometry-based proteomics to profile the changing abundance of individual proteins over the first 5 days of the worker larval stage and used these data, with the help of sequence-based function prediction, to build a framework for the developmental processes going on in the maturing larva.

Results
In order to obtain suitably aged larval samples for proteomic profiling of the first 5 days of development, for each experiment we isolated an open-mated, laying queen on an empty frame of brood comb for a short period of time to allow her to lay several hundred eggs (see Materials and methods). The frame and queen were then separated by a queen excluder and workers were allowed to tend the brood. Starting on the day the eggs hatched (day 1, roughly corresponding to first instar) larvae were collected every day for 5 days. Hemolymph was separated from the remaining tissues (termed 'solid tissues' henceforth) prior to protein extraction (see Materials and methods) and equal amounts of protein from each age were resolved on a reducing SDS polyacrylamide gel ( Figure 1). The protein composition of solid tissues was grossly consistent across all ages, but varied drastically in the hemolymph. Hemolymph from 1-to 3-day old larvae show a staining pattern distinct from that of 4-to 5-day old larvae. These differences may be partially attributed to slight variations in collection methods for young and old larvae but it is more likely that these represent real biological changes occurring as the late larvae prepare for pupation. Most notably, a 70 kDa hexamerin band emerges from day 3 and beyond and accounts for the majority of the protein in the hemolymph, an observation that has been made numerous times by other researchers [9][10][11]. A second observation that argues against these dramatic changes around day 3 being simply an artifact of sample collection is the absence of the major protein bands in the hemolymph gel in the solid tissue gel, and vice versa.
As a means for identifying and quantifying the expression profiles of proteins in developing larvae, we used a quantitative proteomics approach employing stable isotope labeling and liquid chromatography-tandem mass spectrometry. The labeling method we used employs deuterated and hydrogenated forms of formaldehyde to reductively dimethylate primary amines in peptides, but since there are only two labeling conditions possible in this schema, we compared the expression of protein from days 1, 2, 4 and 5 larvae versus that from day 3 in order to generate an expression profile spanning the whole development period. Three biological replicates of each tissue type were analyzed, which resulted in the detection of 12,421 non-redundant peptides (supplementary Table 10 in Additional data file 1). After applying the cutoff criteria (see Materials and methods), 1,333 proteins were identified (supplementary Table 1 in Additional data file 1) with an estimated false discovery rate of 0.97% (see Materials and methods), thus providing experimental evidence for 12.7% of the 10,517 genes in the predicted honey bee gene set. In general, the peptide ratios showed no labeling bias and were approximately normally distributed ( Figure 1). Among these, 790 were quantified in 2 or more days by averaging the intensity ratio from at least 2 of the 3 replicates (if more than 5 peptides were quantified, the top 5 most intense peptides were selected): 378 (48%) of them matched this criterion in both the tissue and hemolymph, 309 (39%) were specific to solid tissue and 103 (13%) were specific to hemolymph. An example of using peptide ratios to derive relative protein expression profiles is shown in Table 1 for the odorant binding protein 14 [GenBank:94158822].
A major strength of this method is the ability to track the changing abundances of hundreds of proteins during development. Those whose levels can be traced for at least 4 out of 5 days in either the tissue or hemolymph were considered to have an informative profile, a total of 522 proteins. Approximately equal numbers of tissue proteins showed an expression trend either positively or negatively correlated with age, The peptide ratios within an experiment are roughly normally distributed and show no labeling bias Figure 1 The peptide ratios within an experiment are roughly normally distributed and show no labeling bias. Using replicate number 1 of day 1 versus day 3 solid tissue quantification data as an example, the peptide ratios are displayed as a histogram, sorted into natural-log unit bins (bin size = 1). 1400 L e s s t h a n -3 G r e a t e r t h a n 3 . 5 0 0 but the latter was more common for hemolymph proteins, as might be expected from the high dynamic range of hemolymph as shown in Figure 2. It is crucial to note that the decreasing trend likely does not reflect an absolute reduction in expression levels of most proteins, but is rather a phenomenon of analyzing equal amounts of protein between two samples with a very large difference in absolute protein amounts caused primarily by drastic increases in secreted hexamerins. Consequently, lower abundance proteins become harder to detect in this background. Although the protein concentration in hemolymph changes only slightly beyond 1 day after hatching, the total volume, and thus absolute protein content, increases exponentially with age ( Figure 3).
There is no direct functional information available for more than 99% of honey bee proteins, so to derive some functional insight from the data acquired here we used BLAST2GO [12] to systematically predict function based on sequence similarity (supplementary Table 2 in Additional data file 1). After grouping specific molecular function ontologies into broader categories until they converged under one term (supplementary Table 3 in Additional data file 1), the third-level terms Both the honey bee larval hemolymph (H) and tissue (T) samples were collected daily for 5 days post-hatching, and peptides from days 1, 2, 4, and 5 were isotopically labeled and mixed at 1:1 (by protein amount) with day 3 peptides, which were labeled differentially from the other days. Relative peptide intensities were recorded (limited at 50-fold or 3.91 in natural log (Ln)) and proteins with a minimum of 2 quantified peptides were natural log-transformed and averaged; for proteins with greater than 5 peptides, the top 5 most intense ones were selected for averaging. In samples where there were less than 5 peptides, their absence is indicated by NA (not available).
were analyzed in detail. To find whether a given function term was developmentally regulated, an average expression profile was generated using data from proteins belonging under each term and tested for significance at the p < 0.05 level (see Materials and methods). The slope between day 1 and day 5 was calculated to approximate the directionality and strength of temporal correlation. In the 34 terms considered, 11 of them had activity profiles that satisfied the significance criteria in at least one of either the solid tissue or hemolymph expression profiles ( With the current lack of a thoroughly curated protein function database for the honey bee, we manually assigned functional categories by employing a variety of available bioinformatic tools (see Materials and methods, and supplementary Table 5 in Additional data file 1). This is necessary because certain major classes of honey bee proteins, such as hexamerins and odorant binding proteins, do not have high enough homology to proteins in other better annotated organisms and would thus be ignored. Furthermore, most proteins were assigned to multiple terms, or two very similar proteins were assigned to different but similar terms ('nucleic acid binding' [GO:0003676], and 'translation factor activity, nucleic acid binding' [GO:0008135]), which greatly complicates downstream hierarchical clustering and enrichment analysis. Groups that showed a significant temporal regulation (criteria nearly identical to the analysis of level 3 molecular function GO terms) are shown in Table 3 (details in supplementary Table 6 in Additional data file 1). A common protein expression pattern within a group was frequently observed. Ribosomal protein levels in the tissue were consistently lowest at day 2 and day 5, but overall decreased in relative concentration with age (p < 1e-16). Proteasome subunits and protein-folding chaperones exhibited the same overall trend (p < 1e-9 and p < 0.005, respectively). Energy storage proteins, including apolipoproteins and hexamerins, increased with age throughout the body but the trend was more dramatic in the hemolymph (p < 0.005). There were no signs of temporal regulation of enzymes for fatty acid synthesis, beta oxidation, and carbohydrate metabolism. However, several groups of energy producing proteins showed varying degrees of positive correlation with time: tricarboxylic acid cycle proteins (p < 0.05), ATP synthase subunits (p < 0.0005), and electron transport chain enzymes (p < 0.00005). Surprisingly, we observed a decreased expression of antioxidant proteins, members of the Ras GTPase superfamily, and ubiquitylation enzymes in the solid tissues as

Larval age Larval age
Developmental changes of larval hemolymph Figure 3 Developmental changes of larval hemolymph. The left axis denotes the volume of hemolymph per larva (diamonds; μl) or hemolymph protein concentration (squares; μg/μl), while the right axis describes the mass of total protein per larva (triangles; μg). Measurements were made by pooling 5-120 larva (n = 3 separate pools) depending on age (x-axis, in days) and size. (Error bars represent 2 standard deviations.) development progressed (p < 0.05, p < 0.01, and p < 0.05, respectively). Many typically intracellular proteins, such as ribosomal proteins and proteasome subunits, were found in hemolymph as we have described previously [3] and as others have reported in other insects [13,14].
We used hierarchical clustering to further analyze the 522 proteins that were profiled in either or both the solid tissue ( Figure 4a) and hemolymph (Figure 4b), followed by enrichment analysis according to manually assigned groupings. Clusters that satisfied the criteria (see Materials and methods) for significant enrichment are shown in Table 4 (complete dataset shown in supplementary Tables 7 and 8 in Additional data file 1 for tissue and hemolymph, respectively). Only a few functional classes of proteins were enriched in the same node, since expression profiles for some proteins exhibit biological variability that causes apparent inconsistency with the timecourse. Transcription and chromatin-associated proteins as well tRNA synthetases -clearly related by their tasks -shared node 376 (correlation 0.94) with pentose phosphate pathway and ubiquitylation enzymes. Energy storage and beta-oxidation proteins were both concentrated in node 434 (correlation 0.86, solid tissue). Protein turnover machinery, including ribosomes, protein folding, and proteasome, were all enriched in node 200 (correlation 0.84, hemolymph). Many of these clusters are also protein families already noted to show significant temporal regulation, such as energy storage proteins, ATP synthases, antioxidant proteins, and ubiquitylation enzymes. This indirectly suggests that suitable assignments were made during manual annota-tion and categorization, since their regulation patterns were grouped using independent methods.
Automated and semi-automated functional annotation and categorization effectively highlighted expression trends in large classes of proteins. With this approach, however, classes with only a few members or those where particular proteins have highly specialized function tended to fall below the significance threshold unless they were considered individually. In solid tissues, the levels of 86 proteins changed significantly (p < 0.05) over the tested period, accounting for 13% of all the quantifiable proteins in solid tissues. For example, levels of neuropeptide Y receptor increased 46-fold from day 3 to day 5. In the hemolymph, 66 of 481 (14%) quantified proteins changed significantly during the larval stage (p < 0.05). Most of these are intracellular proteins, yet the regulation of truly secreted proteins is frequently far more dramatic. An imaginal disc growth factor [GenBank:66514614] increased more than 13-fold from day 1 to day 5 ( To our knowledge this is the first proteome-level description of honey bee larval development, so to gain additional insight, we compared our data with a previously reported developmental study of the fruit fly. While Drosophila and Apis are separated by 300 million years of evolution [2], Drosophila is nonetheless the closest highly studied phylogenetic neighbor of the bee. A whole body transcriptome study of the Drosophila melanogaster life cycle was published in 2002 [15], which included a list of genes that were significantly regulated during the larval period. After finding the protein homologs common to our study and the fruit fly larval dataset (34 in total), we calculated the slope of linear regression of expression trends for both organisms (the slope of the honey bee tissue and hemolymph profiles were averaged when needed; see Materials and methods). Slope values that have opposite signs or an absolute difference in slope of greater than 0.75 were considered dissimilar, amounting to 38% (13 of 34) of the proteins considered (Table 5; complete dataset with BLAST homolog search results in supplementary Table 9 in Additional data file 1). The most extreme slope reported for both organisms is for the hexamerin 70b protein (1.5 for bees and 1.6 for flies).

Discussion
The data presented here, at the level of the whole proteome, documents the dramatic changes occurring in developing honey bee larvae. The most striking, by far, is the 1,500-fold increase in weight over just 6 days [16]. In our proteomic analysis of the solid tissue, the most abundant organs are best represented, namely the fat body (accounts for 65% of the mass [17]), followed by the midgut and larval tubules. The hemolymph fraction reflects the secretory activities of all these tissues and also the molecules associated with intercellular communication and regulation. The presence of intracellular proteins suggest that hemolymph plays a major role in clearing apoptotic cells, in line with observations of the equivalent connective tissue in mammals (that is, blood) [18]. No dissection of specific larval organs was performed because many do not develop until the late stages, making direct comparisons of organ development by quantitative proteomics impossible.
We have found both automated (BLAST2GO) and semi-automated annotation (manual selection of descriptions provided by automated tools and manual categorization) to be very valuable for maximizing available information on an organism with otherwise very little functional annotation. While automated ontological methods were reliable and bias-free, outputs might be too generic (for example, 'ion binding' [GO:0043167]), or failing to accurately represent several very important protein families of the honey bee (for example, hexamerins and odorant binding proteins), highlighting the need for manual intervention. Cluster analysis is an indispensable tool for spotting expression trends, but given that the software for rigorous statistical enrichment analysis is designed specifically for popular model organisms such as mouse, worm, and yeast, the descriptive statistical approach used here was nevertheless able to provide credible insights about larval developmental biology or led to conclusions confirmed by other information.
The major behaviour during the larval stage is feeding as it prepares itself for the subsequent pupal stage when no feeding occurs. Based on various data acquired over the past century, it has been proposed that the larval fat body undergoes two phases, beginning with a high rate of protein synthesis and poor uptake of hemolymph substances, followed by a phase of low cellular synthesis and improved uptake and storage of hemolymph proteins [19]. Our data now allows us to clarify this model and provide molecular-level detail of these changes. One of the most remarkable events in a growing larva is the substantial synthesis of hexamerins and lipoproteins in the fat body, followed by their appearance in the Proteins that were manually categorized by function were subjected to average-linkage clustering (Figure 3). Separate analyses were done for solid tissue and hemolymph. Only proteins that were quantified for at least 4 out of 5 days and protein classes that had at least 5 members were considered for enrichment analysis. Classes with at least 50% enrichment in nodes with a correlation of >0.8 are considered significant and shown. Average-linkage clustering of proteins quantified in the honey bee larvae Figure 4 Average-linkage clustering of proteins quantified in the honey bee larvae. Proteins that were quantified in either or both the (a) tissue or (b) hemolymph for at least four out of five tested days were arranged by hierarchical clustering using software described in [46]. All expression values, shown relative to day 3 (= 0, black), have been natural log-transformed (>0, red; <0, green; no data, grey). These proteins, which have been manually annotated with a function and category, are calculated for enrichment within a node (results in Table 3) if the node correlation value is >0.8 (see thick bar on scale). hemolymph near the end of this developmental stage (reviewed extensively in [20]). While the age-dependent production of these abundant storage proteins is well known, here we provide evidence of a concomitant up-regulation of low copy transmembrane transporters [GO:0022857] that may facilitate the export, including a porin [Gen-Bank:66521459]. Paradoxically, this astounding rate of protein production and export is paired with an opposite trend in protein synthesis machinery and accessories, which had been suspected in two reports in the 1960s [21,22]. Now we have evidence for these previous suggestions, including the clear age-associated decrease of more than 50 detected ribosomal subunits, coupled with an increase of two transcription repressors (although at p < 0.1 these did not satisfy significance criteria) to support this former notion.
Fat accumulation is an important purpose of the rapid larval growth, clearly indicated by the size of the fat body tissue relative to the whole organism, as well as the buildup of lipophorins. Lipids in larval food is only 4% by weight [23], meaning that de novo synthesis must account for the bulk of stored fat. Fatty acid synthase [GenBank:66515350] was one of the most abundant proteins throughout the entire tested period based on absolute protein expression estimates [24], yet to our surprise we did not observe significant temporal regulation in the expression of this enzyme with age. It is worth noting that 'fat body' is somewhat of a misnomer, given that it is involved in protein and glycogen storage, as well as fat [19,25]. To drive these endergonic biosynthetic processes, the demands for ATP must therefore be great. Not only do we observe significant age-associated increases in ATP synthase subunits, but also enzymes in energy-producing pathways such as the tricarboxylic acid cycle and the electron transport chain components. This may be attributed to an increase in mitochondria size or numbers; however, there are at least two reports that claim the number of mitochondria decreases as the larva approaches pupation in other insects [26,27].
Proteins with high copy number, including the many discussed above, are always the first to be investigated in any organism. The difficulties in studying proteins in honey bee larva have multiple sources: the abundant storage proteins broaden the dynamic concentration range, obscuring the rare proteins; the clean dissection of larval organs presents a technical challenge since the fat body is large and is difficult to remove; and finally, the lack of available antibodies against even the most common proteins makes many conventional biochemistry experiments, such as immunoprecipitation and western blotting, impossible. These reasons have especially hindered the study of fine larval organs such as the nervous system and low abundance proteins related to immunity or pathway regulation.
The ability of larvae to respond to external stimuli and internal regulatory cues increases with time, a trend that is clearly reflected in our data. For example, odorant-based communi-cation has been observed in old larvae [28,29]. Odorant binding protein 14 [GenBank:94158822] was detected even on the first day after hatching, showing upregulation with age (Figure 5b). This suggests that younger larvae may have the capa-Expression profiles of four selected proteins during larval development   Genes in a life-cycle transcriptomic analysis of D. melanogaster [15] were compared to honey bee larval proteomics data in this report by finding homologs common to these studies. Significant matches (see Materials and methods for criteria) were assessed by comparing the slope values calculated between days 1 and 4: a protein is marked 'same' if the sign of the slope was the same and had a difference no greater than 0.75; otherwise, they are marked as 'different'.
bility to bind certain odorant molecules, but whether that translates into pheromonal communication is entirely speculative. The positive temporal regulation of antimicrobial peptide apismin [GenBank:58585112] (Figure 5c) and the melanization enzyme prophenoloxidase [Gen-Bank:58585196] in the hemolymph, which have clear roles in defense [30][31][32][33], matches the observed susceptibility to diseases such as foulbroods of the young larvae, suggesting that one or both of these may be the factor responsible for successful defense against foulbroods in older larvae. However, a Clectin [GenBank:110750008] and a complement factor [Gen-Bank:66508940] actually have no observable expression trends, indicating that they may have alternative roles different from homology-based function predictions. The 46-fold increase of a neuropeptide Y receptor [GenBank:110764421], which controls appetite and fat storage, is reasonable given the feeding activity of the larvae. An imaginal disc growth factor [GenBank:66514614] increased by 40-fold over the course of the experiment (Figure 5a) presumably gears the larva for pupal development where specific limbs and organs will grow from imaginal discs containing highly differentiated cells.
Proteomics is generally a discovery method and is thus an excellent mechanism for hypothesis generation. We were able to find several peculiar proteins supported by a number of high quality mass spectra but no plausible explanation for its presence or degree of age-dependent regulation. A protein annotated as 'PREDICTED: similar to CG15040-PA' [Gen-Bank:110749732] was consistently found only in the hemolymph of older larvae (up to 24-fold higher in 5-day old compared to 3-day old larvae), yet it has no likely homologs or discernable functional domains, bearing only a vague resemblance to a protein [GenBank:124512744] from Plasmodium falciparum 3D7, found by PSI-BLAST [34,35].

Conclusion
To study honey bees, individual, environmental, and social factors must be considered. The larval developmental stage has been shown to be a highly complex period of biochemical regulation. The proteomics data presented here are able to support a model for energy metabolism and storage, as well as reveal unexpected expression trends for proteins that respond to external and internal stimulus, such as pheromones, pathogens, and oxidants.

Obtaining larvae of known ages
Honey bee (A. mellifera ligustica) larvae were obtained from colonies kept at the University of British Columbia, Vancouver, BC, Canada. Samples were collected in the summer and early autumn. To acquire larvae of known ages, a queen was isolated on an empty frame of dark comb bracketed by two frames approximately 50% filled with honey and pollen for 16 h inside a nucleus colony with several hundred worker bees. The brood frame with newly laid eggs was then replaced into the original hive, along with the queen, workers and two supporting frames. The queen was separated from the newly laid eggs using a queen excluder to prevent additional eggs from being deposited. Three days after reintroducing the eggs into the colonies, larvae were collected for five consecutive days. In this system the maximum error in larval age would be 16 h. Empirical testing with shorter times did not yield enough eggs/larvae to sample the same population over all five days of development. Before proceeding with protein collection, all larvae were washed three times in phosphate buffered saline to reduce royal jelly contamination.

Protein collection
For 1-to 3-day old larvae, hemolymph was collected by piercing the larval skin, taking care not to cause organ damage by avoiding deep cuts. For 4-and 5-day old larvae, hemolymph was collected by inserting a disposable 5 μl glass Microcap pipette two-thirds of the way down one side of the larva, drawing liquid by capillary action. All hemolymph samples were centrifuged for 10 minutes at 16,100 relative centrifugal force (r.c.f.) at 4°C to pellet cells and debris, which were added to the tissue samples. The tissues were homogenized by a bead mill using a tungsten bead in each 2 ml self-locking tube (Eppendorf, Hamburg, Germany) at 30 Hz for 5 minutes in 50 μl of phosphate buffered saline containing a protease inhibitor cocktail tablet solution (Roche) at 8 times the suggested concentration. Lysis buffer (100 μl of NP-40, and so on) was added before the sample was homogenized by 10 strokes through a syringe tipped with a 25 G needle. The sample was clarified for 10 minutes at 16,100 r.c.f. at 4°C and the pelleted debris was discarded. The Coomassie Plus Protein Assay (Pierce, Rockford, IL, USA) was used to determine protein concentrations of the tissue lysates and the clarified hemolymph according to the manufacturer's instructions before they were stored at -20°C until used.

Denaturing protein gel electrophoresis
Tissue and hemolymph proteins were resolved on precast (Invitrogen) 4-12% NuPAGE gels (30 μg/lane) in reducing conditions with MES buffer according to the manufacturer's instructions. Blue-silver stain [36] was used to visualize protein bands.

Sample preparation for mass spectrometry analysis
Larval tissue or hemolymph protein were aliquoted to provide 20 μg per sample before they were precipitated using the ethanol/acetate method as described [37]. The insoluble proteins were pelleted and temporarily stored at 4°C after a 10minute centrifugation at 16,100 r.c.f. The ethanol supernatant was vacuum-dried, solubilized in sample buffer (3% acetonitrile, 1% trifluoroacetic acid, 0.5% acetic acid), and purified using the C 8 flavor of STop And Go Extraction (STAGE) tips [38] to remove contaminants such as lipids, nucleic acids, and protease inhibitors. Bound proteins were eluted using 100% acetonitrile and vacuum-dried before adding 0.5 μl of 1.5 M Tris-HCl, pH 8.8. The bulk protein pellet and C 8 purified proteins were digested using LysC and trypsin as described [37]. Peptides were desalted using C 18 STAGE tips and the eluted solution was dried by vacuum centrifugation. For proteome profiling by relative quantification, binary analysis between timepoints was performed by chemical dimethylation of peptides from different timepoints using either light (CH 2 O) or heavy (CD 2 O) isotopologues of formaldehyde. For both the hemolymph and tissue samples, 3-day old larvae were used as a reference for all other timepoints, such that their peptides were always labeled with the opposing form of formaldehyde from days 1, 2, 4, and 5 before mixing the differentially labeled samples. Samples were fractionated on C 18 -SCX-C 18 STAGE tips using a 10-step ammonium acetate elution gradient [39] and dried peptide samples were resuspended in 1% trifluoroacetic acid, 3% acetonitrile, 0.5% acetic acid prior to analysis on a linear trapping quadrupole-Orbitrap hybrid mass spectrometry (ThermoFisher Scientific, Waltham, MA, USA) as described in [3].

Raw data processing
Following liquid chromatography-mass spectrometry analysis, peak lists were extracted from the raw data using Extract_MSN.exe (ThermoFisher Scientific) and DTA Supercharge [40] as described [41]. Results were searched using Mascot (v2.2) against a database containing the protein sequences of: Honey Bee Official Gene Set 1 [42], common exogenous contaminants (human and sheep keratins) and additives (porcine trypsin, lysyl endopeptidase C), the polyprotein of the deformed wing virus (common and often asymptomatic [43]), and the reversed sequences of all of the above as a decoy for reporting false discovery rates. The following Mascot parameters were used: trypsin (allowing up to one missed cleavage) or no enzyme specificity (in separate searches); carbamidomethyl as a fixed modification, variable modifications of dimethylation by both hydrogen isotopes at the peptides' amino termini and lysine ε-amino groups, 10 ppm peptide tolerance; 0.8 Da tandem mass spectrometry tolerance, and electrospray ionization-Trap fragmentation characteristics. Results were saved in Peptide Summary format with the 'Require Bold Red' option checked, applying a score cutoff corresponding to p < 0.05, which is 27 where results were limited to tryptic peptides, and 47 where no enzymes were specified. Since each sample was fractioned, generated files were combined using the in-house script Pickletrimmer.pl. MSQuant [40] was used to semi-automatically extract chromatographic peak volumes in both the light and heavy isotopologues of each detected peptide. Only peptides with an absolute calibrated mass error of <5 ppm were considered further. For protein quantification, the file was parsed (in-house script: QC_msqfa.pl) to obtain natural logarithm (Ln)-transformed heavy/light peptide volume ratios, which were median-normalized before they were averaged to calculate a relative protein ratio of day 3 larvae/day × larvae (where x = 1, 2, 4, or 5). From the three biological replicates of each binary comparison, proteins quantified with at least two quantified peptides from two or more replicates were averaged. Proteins whose relative quantities could be tracked for at least 4 of 5 days in either the tissue or hemolymph were considered to be profiled. For protein identification, the above peptides and unquantified sequences were extracted from MSQuant outputs. After removing redundant entries (in-house script: QC_remduplicate.pl), each was matched to their respective protein (in-house script: finalist.pl), excluding hits that were verified by equal to or less than two peptides of at least six or more residues. The false discovery rate was estimated by dividing the sequence-reversed proteins that failed to be eliminated after applying the above criteria.

Automated protein annotation to Gene Ontology terms
All identified proteins were matched to GO [44] terms using BLAST2GO [12], following their standard procedure of performing BLAST searches for each protein (BLASTp, nr database, high scoring segment pair (HSP) cutoff length 33, report 20 hits, maximum e-value 1e-10), followed by mapping and annotation (e-value hit filter 1e-10, annotation cutoff 55, GO weight 5, HSP-hit coverage cutoff 20). After generating a directed acyclic graph (sequence filter 2, score alpha 0.6, node score filter 0) of molecular function terms (not shown), which groups specific terms into broader categories, ontologies on the third level of this graph were further analyzed by statistical testing (see below). The term 'protein binding' [GO:0005515] was omitted because this included the most number of proteins, most of which belonged under another more informative term.

Semi-automated protein annotation and manual categorization
Protein descriptions were taken from several sources or tools, all of which are sequence homology-based derivations. Offi-cial protein names given in the Official Gene Set 1 [42] were used if the name was informative. BLAST2GO-derived descriptions were used where protein function was not clear from the official name (for example 'hypothetical protein'). If an appropriate name was still not derived, searches against the Conserved Domain Database (NCBI) were performed and considered matched for e-values <1e-10. As a final measure for matching a protein with a functional name, proteins descriptions were copied from Blast2seq [45] results (accessed via BLink in NCBI) if matches had >25% sequence identity and an e-value of <1e-10 over the aligned region. If none of these steps provided useful information, the protein was labeled and categorized with 'unknown function.' Proteins with descriptions but that did not fit under a specific category were classified as 'uncategorized' (supplementary Table  5 in Additional data file 1). Proteins that were not manually annotated are marked with 'NA' in the 'Description' column of supplementary Table 1 in Additional data file 1.

Statistical analysis
To each class of manually assigned proteins, a pairwise, twotailed t-test was performed using each protein in that class by taking the relative ratio in day 1 and comparing to day 5. Groups with p < 0.05 were considered to be temporally regulated, and their directionality of regulation was calculated by averaging the slopes of individual proteins within a group using day 1 and day 5 timepoints. Third-level GO molecular function terms were analyzed in the same manner, except all the proteins considered were quantified over all five days tested in at least one of the tissue or hemolymph datasets. To individual proteins, the same criteria for significance was used, taking values from each biological replicate as a data point in a pairwise comparison between the earliest and latest day the protein was quantified. We also performed average linkage clustering of the protein expression levels for proteins that were quantified over at least 4 days in either the tissue or hemolymph using Cluster and visualized by Treeview [46]. The grouping sizes ranged from 2 to 55 proteins. To normalize this variation, the number of proteins in a given class is reported as a percentage of the total class size (percent enrichment, using in-house script QC_nodeenrichment.pl). Only nodes that included at least 50% of all the proteins in that class and had a Pearson's correlation coefficient of greater than 0.8 were considered to be within the same cluster. Protein families with three or fewer members were included as part of the tree diagram, but were not considered for whether they formed a significant cluster.

Comparison to Drosophila
Proteomic profiles resulting from this work were compared to the transcriptomic profiles of previously published Drosophila homologs [15] for the timepoints matching most closely to days 1 to 4 of the honey bee larval stage (h = 24, 49, 72, 96) for genes that were significantly regulated over this period (the fruit fly larval stage is shorter than that of bee by 1 to 2 days). BLASTp was used to find homologs in the Honey Bee Official Gene Set 1, which were defined as matches having e-values <1e-10 with at least 25% identity within the aligned region. Timepoints of the Drosophila dataset were normalized to the h = 72 timepoint and Ln transformed. To compare the expression trend between the two organisms, the slope of the line-of-best-fit for proteins (bees) or genes (flies) was calculated: expression trends with slopes that differed in signage or had an absolute difference of greater than 0.75 were considered to be dissimilar. Slopes whose absolute Pearson's correlation coefficient value was <0.5 were considered insignificant and, therefore, not considered. In instances where a significant slope could be calculated for a protein in both the tissue and hemolymph samples, the slopes were averaged.