Table 1 Comparison of the P. euphratica unigene collection with other sequence collections from whole genomes or EST projects

From: Gene expression and metabolite profiling of Populus euphratica growing in the Negev desert

Sequence collection           Matches  Unique
All                                             7,841
Populus genome                  7,671     763
Arabidopsis genome              5,434       2
Rice genome                     1,562       0
Populus EST sequence            5,780       5
Rosid EST sequence              4,597       1
Asterid EST sequence            3,490       4
Caryophyllid EST sequence       2,081       0
Monocot sequence                2,135       3
GenBank sequence                5,495       0
Short sequences                   275      20
Low protein coding potential      728      28
Remainder                                          54
  1. All P. euphratica unigenes were compared against reference sequence collections to investigate sequence overlap and to identify the number of sequences unique to this sequence collection. The reference sequence collections include the draft Populus genome, the Arabidopsis thaliana genome, the rice genomes, and pooled openSputnik EST collections representing large collections from species taxonomically assigned to the rosid, asterid, caryophyllid and monocot plant groups. Also included in the reference sets are sequences with a match to an annotated protein in the UniProt database, and P. euphratica sequences that are either short (fewer than 100 nucleotides) or have low protein coding potential (less than 25% protein coding). For each reference sequence collection, the table gives the number of P. euphratica sequences that can be matched to it and the number of sequences that are unique to that collection. All BLAST analyses were performed using an arbitrary expectation value cutoff of 1e-10. The remainder (54) is the number of sequences that have no match within any of the challenge datasets and may thus represent P. euphratica-specific genes.
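
The counting described in the note can be sketched in a few lines of Python. This is a minimal illustration, not the authors' pipeline. It assumes BLAST hits are available in the standard 12-column tabular format (query ID in the first column, E-value in the eleventh), and it assumes that "Unique" means a query sequence matched exactly one reference collection. The function names `hits_from_tabular` and `summarize_overlap` are hypothetical.

```python
from collections import Counter


def hits_from_tabular(lines, max_evalue=1e-10):
    """Return the set of query IDs with at least one hit at or below max_evalue.

    Assumes tab-separated BLAST tabular rows: column 1 is the query ID,
    column 11 is the E-value.
    """
    matched = set()
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment headers
        fields = line.split("\t")
        if float(fields[10]) <= max_evalue:
            matched.add(fields[0])
    return matched


def summarize_overlap(all_ids, hits_by_collection):
    """Compute (matches, unique) per collection and the unmatched remainder.

    Assumption: a query counts as "unique" to a collection when that
    collection is the only one it matches.
    """
    all_ids = set(all_ids)
    # Number of collections each query sequence matched.
    n_collections = Counter()
    for ids in hits_by_collection.values():
        n_collections.update(ids & all_ids)

    summary = {}
    for name, ids in hits_by_collection.items():
        ids = ids & all_ids
        unique = sum(1 for q in ids if n_collections[q] == 1)
        summary[name] = (len(ids), unique)

    # Queries that matched no collection at all.
    remainder = len(all_ids) - len(n_collections)
    return summary, remainder
```

Under this reading, the short and low-coding-potential sets would be passed in as ordinary entries of `hits_by_collection`, so that a sequence flagged only as short is counted as unique to "Short sequences" rather than falling into the remainder.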