Transcriptomic signatures shaped by cell proportions shed light on comparative developmental biology

Background: Comparative transcriptomics can answer many questions in developmental and evolutionary developmental biology. Most transcriptomic studies start by showing global patterns of variation in transcriptomes that differ between species or organs through developmental time. However, little is known about the kinds of expression differences that shape these patterns.

Results: We compared transcriptomes during the development of two morphologically distinct serial organs, the upper and lower first molars of the mouse. We found that these two types of teeth largely share the same gene expression dynamics but that three major transcriptomic signatures distinguish them, all of which are shaped by differences in the relative abundance of different cell types. First, lower/upper molar differences are maintained throughout morphogenesis and stem from differences in the relative abundance of mesenchyme and from constant differences in gene expression within tissues. Second, there are clear time-shift differences in the transcriptomes of the two molars related to cusp tissue abundance. Third, the transcriptomes differ most during early-mid crown morphogenesis, corresponding to exaggerated morphogenetic processes in the upper molar involving fewer mitotic cells but more migrating cells. From these findings, we formulate hypotheses about the mechanisms enabling the two molars to reach different phenotypes. We also successfully applied our approach to forelimb and hindlimb development.

Conclusions: Gene expression in a complex tissue reflects not only transcriptional regulation but also abundance of different cell types. This knowledge provides valuable insights into the cellular processes underpinning differences in organ development. Our approach should be applicable to most comparative developmental contexts.
Electronic supplementary material The online version of this article (doi:10.1186/s13059-017-1157-7) contains supplementary material, which is available to authorized users.

Panel titles: lower cluster 1 (n=1544), lower cluster 2 (n=1597), lower cluster 3 (n=1570), lower cluster 4 (n=3038), lower cluster 5 (n=2645), lower cluster 6 (n=1557), lower cluster 7 (n=1309), lower cluster 8 (n=1117), lower cluster 9 (n=1432), lower cluster 10 (n=1397).

A model of cusp expansion in the lower and upper molars was built using in situ data on the timing of cusp patterning. Once patterned, the territory of each cusp expands at the same speed. In the schematic representation, the average shade of grey of each tooth, at each developmental stage, gives a visual impression of the average degree of expansion of the cusp territory, from unpatterned cusp (white) to just-patterned cusp (light grey) and fully expanded cusp (black). We computed cusp expansion in each tooth based on this model, with 0 corresponding to no cusp patterned (all white) and 1 to complete expansion (all black); we later refer to this value as the "cusp tissue" proportion from the model.

upper/lower ratio < 1 (that is, upper having a slower rate than lower) fit less well to the PCA.
We also assessed each model through the correlation between the heterochrony estimated by the model and the heterochrony estimated from PCA1 (right). The heterochrony is best recapitulated by models with equal rates between upper and lower molars, or by models with slower maturation in the lower molar. Taking the middle and right panels together, the best model(s) have roughly equal maturation rates in upper and lower molars. Two examples are given.
Left: In the first example (tissue proportion), two samples are compared for which cellular expression levels are the same (schematized here by the same pink and blue colors in both samples) but tissue proportions differ (here, 50% epithelium in sample 1 and 25% epithelium in sample 2). Because of these differences in tissue proportions, the expression levels of tissue-specific genes (like genes B, A1 and A2 here) will differ between sample 1 and sample 2. The expression of gene A1, for instance, appears different when normalization is performed on the whole dataset, but is equal when normalization is done on tissue-specific genes.
Right: In the second example, both tissue proportions and cellular expression levels differ between stages (schematized here by a different blue color in the samples). The expression of gene A1, for instance, remains different when normalization is done on tissue-specific genes.
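The two examples can be reproduced numerically with a toy mixture model. The sketch below is illustrative only: the per-cell expression values, the gene names reused from the schematic, and the simple sum-scaling normalization are all assumptions made for the example, and the normalization actually used in the study may differ.

```python
# Hypothetical per-cell expression levels (arbitrary units) in each tissue.
EPITHELIUM = {"A1": 10.0, "A2": 30.0, "B": 0.0}
MESENCHYME = {"A1": 0.0, "A2": 0.0, "B": 20.0}
EPITHELIUM_MARKERS = ["A1", "A2"]


def bulk(prop_epi, epi=EPITHELIUM, mes=MESENCHYME):
    """Bulk expression of a sample made of a fraction prop_epi of epithelium."""
    return {g: prop_epi * epi[g] + (1.0 - prop_epi) * mes[g] for g in epi}


def normalize(sample, genes=None):
    """Divide each gene by the summed expression of `genes` (default: all genes)."""
    genes = list(sample) if genes is None else genes
    total = sum(sample[g] for g in genes)
    return {g: sample[g] / total for g in genes}


# Left example: same cellular expression, different tissue proportions.
sample1 = bulk(0.50)
sample2 = bulk(0.25)
whole_1 = normalize(sample1)["A1"]
whole_2 = normalize(sample2)["A1"]
marker_1 = normalize(sample1, EPITHELIUM_MARKERS)["A1"]
marker_2 = normalize(sample2, EPITHELIUM_MARKERS)["A1"]

# Right example: A1 is additionally expressed twice as strongly per
# epithelial cell in sample 2 (a hypothetical regulatory difference).
sample2_reg = bulk(0.25, epi={"A1": 20.0, "A2": 30.0, "B": 0.0})
marker_2_reg = normalize(sample2_reg, EPITHELIUM_MARKERS)["A1"]

print("whole-dataset normalization:", whole_1, whole_2)
print("marker normalization, proportions only:", marker_1, marker_2)
print("marker normalization, proportions and regulation:", marker_1, marker_2_reg)
```

In this toy setting, normalizing on the epithelium markers removes the difference due to tissue proportion alone, while a difference in cellular expression level persists.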
(b) Heterochrony measured on PCA drawn using the whole dataset (total), and with 3 subsets of genes, each specific to one of 3 tissue compartments (336 epithelium markers, 421 EK markers and 566 mesenchyme markers). Normalization was done on the whole transcriptome (left) or on each set of markers (right, normalization on tissue-specific genes).
For each dataset, the first axis orders the samples by developmental advancement.
The height of each bar represents the heterochrony measured on the first PCA axis in each case (that is, the coordinate of the lower sample on PCA1 minus the coordinate of the upper sample on PCA1, for each time point). Timepoints are taken at 9.5 (forelimb), 10.5, 11.5, 12.5 and 13.5 (fore and hindlimb) days of gestation (see supplementary text below for details).
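The heterochrony measure described above (coordinate of the lower sample on PCA1 minus coordinate of the upper sample on PCA1, per time point) can be sketched as follows. This is a minimal illustration, not the analysis pipeline of the study: the toy expression profiles, the time points and the half-day advance of the lower molar are hypothetical, and the first principal axis is obtained here by plain power iteration rather than by a dedicated statistics package.

```python
import math


def first_pc_scores(rows, n_iter=200):
    """Coordinates of samples (rows) on the first principal axis (power iteration)."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(n_iter):
        scores = [sum(x * w for x, w in zip(r, v)) for r in centered]
        w = [sum(centered[i][j] * scores[i] for i in range(n)) for j in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return [sum(x * w for x, w in zip(r, v)) for r in centered]


def heterochrony(samples):
    """samples: list of (tooth, timepoint, expression_vector) tuples."""
    scores = first_pc_scores([expr for _, _, expr in samples])
    times = [t for _, t, _ in samples]
    mean_t = sum(times) / len(times)
    # The sign of a principal axis is arbitrary: orient it to increase with time.
    if sum((t - mean_t) * s for t, s in zip(times, scores)) < 0:
        scores = [-s for s in scores]
    coord = {(tooth, t): s for (tooth, t, _), s in zip(samples, scores)}
    timepoints = sorted({t for _, t, _ in samples})
    return {t: coord[("lower", t)] - coord[("upper", t)] for t in timepoints}


# Hypothetical toy data: three genes changing linearly with developmental age,
# the lower molar being half a day ahead of the upper molar.
BASE = [5.0, 9.0, 3.0]
SLOPES = [1.0, -2.0, 2.0]


def profile(age):
    return [b + s * age for b, s in zip(BASE, SLOPES)]


samples = []
for t in [14.5, 15.0, 15.5, 16.0]:
    samples.append(("upper", t, profile(t)))
    samples.append(("lower", t, profile(t + 0.5)))

result = heterochrony(samples)
print(result)
```

With these toy profiles, a positive value at every time point indicates that the lower sample is more advanced than the upper sample along the first axis.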

Supplementary Text
To see whether our findings can be generalized to other serial organs, we re-analysed the most comprehensive published transcriptome dataset on fore/hind limb development, with expression profiled by microarrays in many replicates for 20737 mouse genes at 4 or 5 stages of limb development (Taher et al. 2011). The PCA of this dataset clearly reveals a temporal signal on the first axis (PCA1 = 31.5% of the global variation; Fig 7C). We tested the significance of this temporal signal with a linear model (p-value < 10^-16). Adding the type of organ further improves the fit of the model (p-value = 2 × 10^-5). This demonstrates that heterochrony between fore and hind limbs is visible from the transcriptomes, forelimbs being advanced compared to hind limbs. This is a well-known characteristic of mouse limb development.
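The two tests described above can be read as a comparison of nested linear models: PCA1 explained by stage, then by stage plus organ type. The sketch below illustrates that reading under stated assumptions: the exact model specification used in the study is not given here, the PCA1 coordinates are invented toy values, and only the F statistic is computed (a p-value would additionally require an F distribution, for example `scipy.stats.f.sf`).

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def rss(design, y):
    """Residual sum of squares of the least-squares fit of y on the design matrix."""
    p = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(p)]
    beta = solve(xtx, xty)
    return sum((yi - sum(b * x for b, x in zip(beta, r))) ** 2
               for r, yi in zip(design, y))


def nested_f(reduced, full, y):
    """F statistic comparing a reduced model with a full model that contains it."""
    df1 = len(full[0]) - len(reduced[0])
    df2 = len(y) - len(full[0])
    rss_reduced, rss_full = rss(reduced, y), rss(full, y)
    return ((rss_reduced - rss_full) / df1) / (rss_full / df2)


# Hypothetical PCA1 coordinates: forelimb (organ = 1) slightly ahead of hindlimb.
stages = [9.5, 10.5, 11.5, 12.5, 13.5, 10.5, 11.5, 12.5, 13.5]
organ = [1, 1, 1, 1, 1, 0, 0, 0, 0]
noise = [0.05, -0.03, 0.02, -0.04, 0.01, 0.03, -0.02, 0.04, -0.05]
pca1 = [s + 0.4 * o + e for s, o, e in zip(stages, organ, noise)]

null_model = [[1.0] for _ in stages]
time_model = [[1.0, s] for s in stages]
time_organ_model = [[1.0, s, float(o)] for s, o in zip(stages, organ)]

f_time = nested_f(null_model, time_model, pca1)
f_organ = nested_f(time_model, time_organ_model, pca1)
print("F for stage:", f_time)
print("F for organ given stage:", f_organ)
```

A large F for the second comparison corresponds to the statement that adding the organ type improves the fit beyond the temporal signal alone.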