Sequence and expression characterization of the transcripts with and without detected homologs. (a) Transcript length and longest open reading frame (ORF) length statistics. (b) Percentage of transcripts with known protein domains. (c) Distribution of GC content. (d) Protein-coding potential. (e) Distribution of median expression across all castes. (f) Codon-usage frequencies. RPKM, reads per kilobase per million mapped reads.
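
For reference, two of the quantities named in this caption, GC content (panel c) and RPKM (panel e), can be computed as in the minimal sketch below. It assumes the standard definition of RPKM (read count scaled by transcript length in kilobases and by millions of mapped reads). The function names `gc_content` and `rpkm` and the example numbers are illustrative assumptions, not taken from the original analysis.

```python
def gc_content(seq: str) -> float:
    """Return the fraction of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    gc = sum(1 for base in seq if base in "GC")
    return gc / len(seq)


def rpkm(read_count: int, length_bp: int, total_mapped_reads: int) -> float:
    """Reads per kilobase of transcript per million mapped reads.

    Assumes the standard definition:
        RPKM = read_count * 1e9 / (length_bp * total_mapped_reads)
    """
    if length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and total mapped reads must be positive")
    return read_count * 1e9 / (length_bp * total_mapped_reads)


if __name__ == "__main__":
    # Hypothetical transcript: 500 reads, 2,000 bp long, 10 million mapped reads.
    print(rpkm(500, 2000, 10_000_000))
    print(gc_content("ATGCGC"))
```

Dividing by transcript length makes expression values comparable between long and short transcripts, and dividing by the total number of mapped reads makes them comparable between libraries of different sequencing depth.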